2013-09-25 12:51:09

by Jiri Olsa

Subject: [RFC 00/21] perf tools: Add toggling events support

hi,
sending *RFC* for toggling events support.

This adds a perf interface for creating toggle events, which can
enable or disable another event. Whenever the toggle event is triggered
(overflows), it toggles the state of the other event and either starts
or stops it.

The goal is to be able to create toggling tracepoint events to enable and
disable HW counters, but the interface is generic enough to be used for
any kind of event.

It's based on Frederic's patchset:
https://lkml.org/lkml/2011/3/14/346

Most of the changelog info is on the wiki:
https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event

In a nutshell:
The interface is added to the sys_perf_event_open syscall,
and a new ioctl was added for completeness; see:
perf: Add event toggle sys_perf_event_open interface
perf: Add event toggle ioctl interface

The perf tool interface is pretty rough at the moment. We use the
'on' and 'off' terms to specify the toggling event, like:
-e 'cycles,irq_entry/on=cycles/,irq_exit/off=cycles/'

Meaning:
- irq_entry toggles on (starts) cycles, and irq_exit toggles off (stops) cycles.
- cycles starts out paused
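
As a rough illustration of the term syntax above, here is a minimal
standalone C sketch that splits a string such as 'irq_entry/on=cycles/'
into the toggling event, the action and the toggled event. It is not the
parser from this series (that one lives in parse-events.y); the names
toggle_spec, copy_name and parse_toggle_spec are made up for the example.

```c
#include <string.h>

enum toggle_action { TOGGLE_NONE, TOGGLE_ON, TOGGLE_OFF };

/* Hypothetical holder for one parsed event string. */
struct toggle_spec {
	char event[64];		/* event that fires, e.g. "irq_entry" */
	char target[64];	/* event being started/stopped, e.g. "cycles" */
	enum toggle_action action;
};

/* Copy len bytes of src into a NUL-terminated 64-byte buffer. */
static int copy_name(char dst[64], const char *src, size_t len)
{
	if (len == 0 || len >= 64)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Accepts "event", "event/on=target/" or "event/off=target/".
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_toggle_spec(const char *s, struct toggle_spec *out)
{
	const char *slash = strchr(s, '/');
	const char *target, *end;

	memset(out, 0, sizeof(*out));	/* action defaults to TOGGLE_NONE */
	if (!slash)			/* plain event such as "cycles" */
		return copy_name(out->event, s, strlen(s));
	if (copy_name(out->event, s, (size_t)(slash - s)))
		return -1;

	if (!strncmp(slash + 1, "on=", 3)) {
		out->action = TOGGLE_ON;
		target = slash + 4;	/* skip "/on=" */
	} else if (!strncmp(slash + 1, "off=", 4)) {
		out->action = TOGGLE_OFF;
		target = slash + 5;	/* skip "/off=" */
	} else {
		return -1;
	}

	/* The term list must be closed by a trailing '/'. */
	end = strchr(target, '/');
	if (!end || end[1] != '\0')
		return -1;
	return copy_name(out->target, target, (size_t)(end - target));
}
```

Feeding it 'irq_exit/off=cycles/' yields event "irq_exit", action
TOGGLE_OFF and target "cycles"; a plain 'cycles' is reported as an event
with no toggle action.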

Looking forward to some ideas for a better interface here ;-)

The patchset is available at:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
perf/core_toggle

thanks for comments,
jirka


Example:
Define toggle (on/off) events:
# perf probe -a fork_entry=do_fork
# perf probe -a fork_exit=do_fork%return

The following record session samples only within the do_fork function:
# perf record -g -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
perf bench sched messaging

The following stat session measures cycles within the do_fork function:
# perf stat -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
perf bench sched messaging

# Running sched/messaging benchmark...
# 20 sender and receiver processes per group
# 1 groups == 40 processes run

Total time: 0.073 [sec]

Performance counter stats for './perf bench sched messaging -g 1':

20,935,464 cycles # 0.000 GHz
18,897 cache-misses
40 probe:fork_entry
40 probe:fork_exit

0.086319682 seconds time elapsed

Example:
Measure interrupt cycles:
# ./perf stat -e 'cycles,cycles/name=cycles_irq/,irq:irq_handler_entry/on=cycles_irq/,irq:irq_handler_exit/off=cycles_irq/' -a sleep 10

Performance counter stats for 'sleep 10':

50,680,084,994 cycles # 0.000 GHz [100.00%]
652,690 cycles_irq # 0.000 GHz
33 irq:irq_handler_entry [100.00%]
33 irq:irq_handler_exit

10.002084400 seconds time elapsed

Check the uprobes example at:
https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event#Example_-_using_u.28ret.29probes


Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Stephane Eranian <[email protected]>
---
Frederic Weisbecker (2):
perf: Be more specific on pmu related event init naming
perf: Split allocation and initialization code

Jiri Olsa (19):
perf tools: Introduce perf_evlist__wait_workload function
perf tools: Separate sys_perf_event_open call into evsel_open
perf x86: Update event count properly for read syscall
perf: Move event state initialization before/behind the pmu add/del calls
perf: Add event toggle sys_perf_event_open interface
perf: Add event toggle ioctl interface
perf: Toggle whole group in toggle event overflow
perf: Add new 'paused' attribute
perf: Account toggle masters for toggled event
perf: Support event inheritance for toggle feature
perf tests: Adding event simple toggling test
perf tests: Adding event group toggling test
perf tests: Adding event inherit toggling test
perf tools: Allow numeric event to change name via name term
perf tools: Add event_config_optional parsing rule
perf tools: Rename term related parsing function/variable properly
perf tools: Carry term string value for symbols events
perf tools: Add support to parse event on/off toggle terms
perf tools: Add record/stat support for toggling events

arch/x86/kernel/cpu/perf_event.c | 6 +-
include/linux/perf_event.h | 12 +++
include/uapi/linux/perf_event.h | 7 +-
kernel/events/core.c | 396 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
tools/perf/Makefile | 6 ++
tools/perf/arch/x86/tests/toggle-event-raw-64.S | 28 ++++++
tools/perf/builtin-record.c | 7 ++
tools/perf/builtin-stat.c | 12 +++
tools/perf/tests/builtin-test.c | 12 +++
tools/perf/tests/perf-record.c | 1 +
tools/perf/tests/task-exit.c | 5 ++
tools/perf/tests/tests.h | 3 +
tools/perf/tests/toggle-event-group.c | 195 +++++++++++++++++++++++++++++++++++++++++
tools/perf/tests/toggle-event-inherit.c | 132 ++++++++++++++++++++++++++++
tools/perf/tests/toggle-event-raw.c | 106 ++++++++++++++++++++++
tools/perf/util/evlist.c | 97 +++++++++++++++++++++
tools/perf/util/evlist.h | 3 +
tools/perf/util/evsel.c | 53 ++++++-----
tools/perf/util/evsel.h | 4 +
tools/perf/util/parse-events.c | 131 +++++++++++++++++++---------
tools/perf/util/parse-events.h | 9 +-
tools/perf/util/parse-events.l | 6 +-
tools/perf/util/parse-events.y | 68 +++++++++------
tools/perf/util/record.c | 2 +
24 files changed, 1167 insertions(+), 134 deletions(-)
create mode 100644 tools/perf/arch/x86/tests/toggle-event-raw-64.S
create mode 100644 tools/perf/tests/toggle-event-group.c
create mode 100644 tools/perf/tests/toggle-event-inherit.c
create mode 100644 tools/perf/tests/toggle-event-raw.c


2013-09-25 12:51:12

by Jiri Olsa

Subject: [PATCH 01/21] perf tools: Introduce perf_evlist__wait_workload function

Introduce the perf_evlist__wait_workload function and add it
to the tests that fork a workload, so that they properly wait for
the child process. Otherwise other tests could wait on the wrong
child process and end up wrongly synced.

Also restore signal handlers properly in the task exit test.
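
To illustrate why waiting on the specific workload pid matters, here is
a small standalone sketch, separate from the patch itself. The helpers
spawn_workload and wait_workload are hypothetical names for the example;
only fork, waitpid and the W* macros are real POSIX interfaces.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a trivial "workload" that exits with the given code. */
static pid_t spawn_workload(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return pid;	/* child pid in the parent, -1 on fork failure */
}

/*
 * Reap exactly this child. Returns its exit code, or -1 if there is
 * nothing to wait for or the child did not exit normally.
 */
static int wait_workload(pid_t pid)
{
	int status;

	if (pid <= 0)
		return -1;
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)	/* e.g. ECHILD: already reaped */
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With two children outstanding, waiting by pid returns each child's own
exit code regardless of the order in which they finish. A bare
waitpid(-1, ...) in a later test could instead pick up a leftover child
from an earlier one, which is the mis-sync described above.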

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/perf-record.c | 1 +
tools/perf/tests/task-exit.c | 5 +++++
tools/perf/util/evlist.c | 10 ++++++++++
tools/perf/util/evlist.h | 1 +
4 files changed, 17 insertions(+)

diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 82ac715..b1fc1c0 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -308,6 +308,7 @@ out_close_evlist:
out_delete_maps:
perf_evlist__delete_maps(evlist);
out_delete_evlist:
+ perf_evlist__wait_workload(evlist);
perf_evlist__delete(evlist);
out:
return (err < 0 || errs > 0) ? -1 : 0;
diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index b07f8a1..d299565 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -103,11 +103,16 @@ retry:
err = -1;
}

+ /* Restore defaults and wait the child process. */
+ signal(SIGCHLD, SIG_DFL);
+ signal(SIGUSR1, SIG_DFL);
+
perf_evlist__munmap(evlist);
out_close_evlist:
perf_evlist__close(evlist);
out_delete_maps:
perf_evlist__delete_maps(evlist);
+ perf_evlist__wait_workload(evlist);
perf_evlist__delete(evlist);
return err;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0d71a9..c4d382d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1005,6 +1005,16 @@ out_err:
return err;
}

+int perf_evlist__wait_workload(struct perf_evlist *evlist)
+{
+ int status, err = -1, pid = evlist->workload.pid;
+
+ if (pid > 0)
+ err = waitpid(pid, &status, 0);
+
+ return err < 0 ? err : status;
+}
+
int perf_evlist__prepare_workload(struct perf_evlist *evlist,
struct perf_target *target,
const char *argv[], bool pipe_output,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 871b55a..0dbd8f8 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -98,6 +98,7 @@ bool perf_can_sample_identifier(void);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);

+int perf_evlist__wait_workload(struct perf_evlist *evlist);
int perf_evlist__prepare_workload(struct perf_evlist *evlist,
struct perf_target *target,
const char *argv[], bool pipe_output,
--
1.7.11.7

2013-09-25 12:51:18

by Jiri Olsa

Subject: [PATCH 03/21] perf x86: Update event count properly for read syscall

Within the read syscall we update the event count via the
pmu->read callback if the event is in ACTIVE state.

For the x86 pmu this triggers an event update
(x86_perf_event_update), which leads to a wrong event count
if the event hasn't been started yet: the event base is not
yet adjusted to the new CPU, so the new count is computed
from the previous CPU's prev_count.

Fix this by updating the counter only if it has been
started, so the event base is properly adjusted.
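
The effect can be shown with a deliberately simplified userspace model.
This is only a sketch under assumptions: model_event and its helpers are
invented for illustration, and the 'started' flag merely stands in for
the hardware state check done in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a counter; not the kernel's struct hw_perf_event. */
struct model_event {
	uint64_t prev_count;	/* last raw value folded into count */
	uint64_t count;		/* accumulated event count */
	bool started;		/* has the event been started here? */
};

/* Fold the delta since the last update into the count. */
static void model_update(struct model_event *ev, uint64_t raw)
{
	ev->count += raw - ev->prev_count;
	ev->prev_count = raw;
}

/* Starting the event rebases prev_count on the current raw value. */
static void model_start(struct model_event *ev, uint64_t raw)
{
	ev->prev_count = raw;
	ev->started = true;
}

/* Guarded read: only update if the event has been started. */
static uint64_t model_read(struct model_event *ev, uint64_t raw)
{
	if (ev->started)
		model_update(ev, raw);
	return ev->count;
}
```

If prev_count still holds a stale value of 1000 from the previous CPU
and the current raw counter reads 50, an unguarded update would add the
wrapped difference 50 - 1000 to the count. The guarded read returns the
untouched count until model_start() has rebased prev_count.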

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 8355c84..e89c773 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1570,7 +1570,11 @@ early_initcall(init_hw_perf_events);

static inline void x86_pmu_read(struct perf_event *event)
{
- x86_perf_event_update(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ /* Update only if the event has already been started. */
+ if (!(hwc->state & PERF_HES_ARCH))
+ x86_perf_event_update(event);
}

/*
--
1.7.11.7

2013-09-25 12:51:26

by Jiri Olsa

Subject: [PATCH 09/21] perf: Account toggle masters for toggled event

Keep track of toggle events within the toggled event.
It'll be handy for toggle inherit support.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
include/linux/perf_event.h | 1 +
kernel/events/core.c | 9 ++++++++-
2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6ede25c..801ff22 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -423,6 +423,7 @@ struct perf_event {
struct perf_event *toggled_event;
enum perf_event_toggle_flag toggle_flag;
int paused;
+ atomic_t toggled_cnt;
#endif /* CONFIG_PERF_EVENTS */
};

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f5f00a6..7df7a21 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1372,6 +1372,7 @@ static void __perf_event_toggle_detach(struct perf_event *event)
event->overflow_handler = NULL;
event->toggled_event = NULL;

+ atomic_dec(&toggled_event->toggled_cnt);
put_event(toggled_event);
}

@@ -7064,6 +7065,8 @@ static int perf_event_set_toggle_fd(struct perf_event *event, u64 __user *arg)
err = perf_event_set_toggle(event, toggled_event, event->ctx, flag);
if (err)
put_event(toggled_event);
+ else
+ atomic_inc(&toggled_event->toggled_cnt);

fdput(toggled_fd);
return err;
@@ -7251,6 +7254,8 @@ SYSCALL_DEFINE5(perf_event_open,
if (!atomic_long_inc_not_zero(&toggled_event->refcount))
goto err_context;

+ atomic_inc(&toggled_event->toggled_cnt);
+
err = perf_event_set_toggle(event, toggled_event, ctx, flags);
if (err)
goto err_toggle;
@@ -7328,8 +7333,10 @@ SYSCALL_DEFINE5(perf_event_open,
return event_fd;

err_toggle:
- if (toggled_event)
+ if (toggled_event) {
+ atomic_dec(&toggled_event->toggled_cnt);
put_event(toggled_event);
+ }
err_context:
perf_unpin_context(ctx);
put_ctx(ctx);
--
1.7.11.7

2013-09-25 12:51:31

by Jiri Olsa

Subject: [PATCH 10/21] perf: Be more specific on pmu related event init naming

From: Frederic Weisbecker <[email protected]>

This disambiguates the function names as we prepare to split
the event allocation and initialization code.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/events/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7df7a21..2a19b64 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6598,7 +6598,7 @@ void perf_pmu_unregister(struct pmu *pmu)
free_pmu_context(pmu);
}

-struct pmu *perf_init_event(struct perf_event *event)
+struct pmu *perf_pmu_init_event(struct perf_event *event)
{
struct pmu *pmu = NULL;
int idx;
@@ -6775,7 +6775,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
goto err_ns;

- pmu = perf_init_event(event);
+ pmu = perf_pmu_init_event(event);
if (!pmu)
goto err_ns;
else if (IS_ERR(pmu)) {
--
1.7.11.7

2013-09-25 12:51:34

by Jiri Olsa

Subject: [PATCH 12/21] perf: Support event inheritance for toggle feature

Toggling sets up a relationship between events: the toggle event
carries a pointer and a reference to the toggled event. This needs
to be configured for child events as well.

During fork, events are processed/cloned with no regard
to the toggle setup, so we have no idea what we get first:
the toggle event or the toggled event.

To avoid extra after-fork scanning for toggle events, we
pre-allocate the child whenever a toggling setup is
detected. This way we can pre-set the toggle dependencies
during event cloning.

This is described in the following example.

Consider the following setup:
- 'event A' toggles 'event B'
- 'event A' holds pointer/ref to 'event B'

Now we have a fork:
(and have no idea which event gets inherited/cloned first)

1) the clone order is 'event A' 'event B'
- 'event A' is processed:
- 'event_cloned A' is created
- 'event_cloned A' needs pointer to 'event_cloned B'
which does not exist yet
- we pre-allocate 'event_cloned B', set up the toggled_event
pointer/ref of 'event_cloned A', and also save it in 'event B' as
toggled_child

- 'event B' is processed
- we check if toggled_child is allocated
- we use it as 'event_cloned B'

2) the order is 'event B' 'event A'
- 'event B' is processed
- toggled_child is not allocated, we allocate 'event_cloned B'
and store it in 'event B' as toggled_child

- 'event A' is processed
- create 'event_cloned A'
- 'event A' is a toggling event and has a pointer to 'event B'
- we check if there's already an initialized toggled_child in 'event B'
- initialize the 'event_cloned A' toggled_event pointer with
'event_cloned B' taken from toggled_child
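
The two orderings above can be modelled in plain userspace C. This is a
simplified sketch, not the kernel code from this patch: struct ev and
clone_event are hypothetical, refcounting and contexts are left out, and
only the pre-allocation bookkeeping is kept.

```c
#include <stdlib.h>

/* Toy stand-in for the toggle-related fields of struct perf_event. */
struct ev {
	struct ev *toggled_event;	/* toggler: the event it toggles */
	struct ev *toggled_child;	/* toggled: clone shared during fork */
	int toggled_cnt;		/* toggled: number of togglers */
	int toggled_child_cnt;		/* pending users of toggled_child */
};

/* Publish the clone: one user per toggler plus the toggled event itself. */
static void set_child(struct ev *parent, struct ev *child)
{
	parent->toggled_child = child;
	parent->toggled_child_cnt = parent->toggled_cnt + 1;
}

/* Drop one user; forget the clone once everybody has picked it up. */
static void put_child(struct ev *parent)
{
	if (--parent->toggled_child_cnt == 0)
		parent->toggled_child = NULL;
}

/* Clone one event at fork time, in whatever order events show up. */
static struct ev *clone_event(struct ev *parent)
{
	struct ev *target = parent->toggled_event;
	struct ev *child = parent->toggled_child;	/* pre-allocated? */
	int fresh = 0;

	if (!child) {
		child = calloc(1, sizeof(*child));
		if (!child)
			return NULL;
		fresh = 1;
	}

	/* Toggler cloned before its target: pre-allocate the target clone. */
	if (target && !target->toggled_child) {
		struct ev *tchild = calloc(1, sizeof(*tchild));

		if (!tchild) {
			if (fresh)
				free(child);
			return NULL;
		}
		set_child(target, tchild);
	}

	child->toggled_cnt = parent->toggled_cnt;

	/* Toggled event: make the clone visible to togglers cloned later. */
	if (parent->toggled_cnt) {
		if (!parent->toggled_child)
			set_child(parent, child);
		put_child(parent);
	}

	/* Toggler: point the clone at the clone of the toggled event. */
	if (target) {
		child->toggled_event = target->toggled_child;
		put_child(target);
	}
	return child;
}
```

Cloning A then B, or B then A, both end with the clone of A pointing at
the clone of B, and with toggled_child in 'event B' reset to NULL once
the toggler and the toggled event have each taken their turn.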

Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
include/linux/perf_event.h | 2 +
kernel/events/core.c | 103 +++++++++++++++++++++++++++++++++++++++++----
2 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 801ff22..baefb79 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -424,6 +424,8 @@ struct perf_event {
enum perf_event_toggle_flag toggle_flag;
int paused;
atomic_t toggled_cnt;
+ struct perf_event *toggled_child;
+ int toggled_child_cnt;
#endif /* CONFIG_PERF_EVENTS */
};

diff --git a/kernel/events/core.c b/kernel/events/core.c
index fa1d229..edf161b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3156,6 +3156,8 @@ static void free_event_rcu(struct rcu_head *head)
if (event->ns)
put_pid_ns(event->ns);
perf_event_free_filter(event);
+ if (event->toggled_child)
+ kfree(event->toggled_child);
kfree(event);
}

@@ -6749,6 +6751,9 @@ static int perf_init_event(struct perf_event *event,
event->overflow_handler = overflow_handler;
event->overflow_handler_context = context;

+ if (parent_event && atomic_read(&parent_event->toggled_cnt))
+ event->toggled_cnt = parent_event->toggled_cnt;
+
perf_event__state_init(event);

pmu = NULL;
@@ -6795,6 +6800,11 @@ err_ns:
return err;
}

+static struct perf_event *__perf_event_alloc(void)
+{
+ return kzalloc(sizeof(struct perf_event), GFP_KERNEL);
+}
+
/*
* Allocate and initialize a event structure
*/
@@ -6809,7 +6819,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
struct perf_event *event;
int err;

- event = kzalloc(sizeof(*event), GFP_KERNEL);
+ event = __perf_event_alloc();
if (!event)
return ERR_PTR(-ENOMEM);

@@ -7048,6 +7058,10 @@ perf_event_set_toggle(struct perf_event *event,
if (toggled_event->ctx->task != ctx->task)
return -EINVAL;

+ /* Temporary hack for toggled_child_cnt */
+ if (toggled_event->attr.inherit != event->attr.inherit)
+ return -EINVAL;
+
event->overflow_handler = perf_event_toggle_overflow;
event->toggle_flag = get_toggle_flag(flags);
event->toggled_event = toggled_event;
@@ -7686,6 +7700,66 @@ void perf_event_delayed_put(struct task_struct *task)
WARN_ON_ONCE(task->perf_event_ctxp[ctxn]);
}

+static void
+perf_event_toggled_set_child(struct perf_event *event,
+ struct perf_event *child)
+{
+ event->toggled_child = child;
+ event->toggled_child_cnt = atomic_read(&event->toggled_cnt) + 1;
+}
+
+static void perf_event_toggled_child_put(struct perf_event *parent)
+{
+ int cnt = --parent->toggled_child_cnt;
+
+ WARN_ON_ONCE(cnt < 0);
+ WARN_ON_ONCE(!parent->toggled_child);
+
+ if (!cnt)
+ parent->toggled_child = NULL;
+}
+
+static int
+perf_event_inherit_toggle(struct perf_event *event,
+ struct perf_event *parent)
+{
+ struct perf_event *toggled = parent->toggled_event;
+ struct perf_event *toggled_child = parent->toggled_child;
+
+
+ /*
+ * This @event is toggled by the children of its parent's togglers.
+ * If this child is inherited before its togglers, declare it so.
+ */
+ if (atomic_read(&event->toggled_cnt)) {
+ if (!parent->toggled_child)
+ perf_event_toggled_set_child(parent, event);
+ perf_event_toggled_child_put(parent);
+ }
+
+ /*
+ * This @event toggles the child of the event toggled by its @parent.
+ * If it's inherited before its toggled event, pre-allocate the toggled event.
+ * In any case, declare and attach the toggled to the @event.
+ */
+ if (toggled) {
+ toggled_child = toggled->toggled_child;
+ if (!toggled_child) {
+ toggled_child = __perf_event_alloc();
+ if (!toggled_child)
+ return -ENOMEM;
+ perf_event_toggled_set_child(toggled, toggled_child);
+ }
+
+ /* set inherited toggling */
+ event->toggled_event = toggled_child;
+ event->toggle_flag = parent->toggle_flag;
+ perf_event_toggled_child_put(toggled);
+ }
+
+ return 0;
+}
+
/*
* inherit a event from parent task to child task:
*/
@@ -7697,8 +7771,16 @@ inherit_event(struct perf_event *parent_event,
struct perf_event *group_leader,
struct perf_event_context *child_ctx)
{
- struct perf_event *child_event;
+ struct perf_event *child_event, *orig_parent_event = parent_event;
unsigned long flags;
+ int err;
+
+ child_event = parent_event->toggled_child;
+ if (!child_event) {
+ child_event = __perf_event_alloc();
+ if (!child_event)
+ return ERR_PTR(-ENOMEM);
+ }

/*
* Instead of creating recursive hierarchies of events,
@@ -7709,13 +7791,16 @@ inherit_event(struct perf_event *parent_event,
if (parent_event->parent)
parent_event = parent_event->parent;

- child_event = perf_event_alloc(&parent_event->attr,
- parent_event->cpu,
- child,
- group_leader, parent_event,
- NULL, NULL);
- if (IS_ERR(child_event))
- return child_event;
+ err = perf_init_event(child_event, &parent_event->attr,
+ parent_event->cpu, child,
+ group_leader, parent_event, NULL, NULL);
+ if (err)
+ return ERR_PTR(err);
+
+ if (perf_event_inherit_toggle(child_event, orig_parent_event)) {
+ free_event(child_event);
+ return NULL;
+ }

if (!atomic_long_inc_not_zero(&parent_event->refcount)) {
free_event(child_event);
--
1.7.11.7

2013-09-25 12:51:43

by Jiri Olsa

Subject: [PATCH 15/21] perf tests: Adding event inherit toggling test

Does the same as the raw toggling test, except it also creates
a child process that runs the same workload. It tests that the
instructions event's toggling is properly inherited and that the
count is propagated to the parent.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/toggle-event-inherit.c | 132 ++++++++++++++++++++++++++++++++
4 files changed, 138 insertions(+)
create mode 100644 tools/perf/tests/toggle-event-inherit.c

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 90d7127..893c4a7 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -400,6 +400,7 @@ LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
LIB_OBJS += $(OUTPUT)tests/toggle-event-raw.o
LIB_OBJS += $(OUTPUT)tests/toggle-event-group.o
+LIB_OBJS += $(OUTPUT)tests/toggle-event-inherit.o
ifeq ($(RAW_ARCH),x86_64)
LIB_OBJS += $(OUTPUT)arch/x86/tests/toggle-event-raw-64.o
endif
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 7e96550..6768266 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -124,6 +124,10 @@ static struct test {
.func = test__toggle_event_group,
},
{
+ .desc = "Toggle event inherit",
+ .func = test__toggle_event_inherit,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index db692bf..f0ef1050 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -42,5 +42,6 @@ int test__keep_tracking(void);
int test__parse_no_sample_id_all(void);
int test__toggle_event_raw(void);
int test__toggle_event_group(void);
+int test__toggle_event_inherit(void);

#endif /* TESTS_H */
diff --git a/tools/perf/tests/toggle-event-inherit.c b/tools/perf/tests/toggle-event-inherit.c
new file mode 100644
index 0000000..0c02fd8
--- /dev/null
+++ b/tools/perf/tests/toggle-event-inherit.c
@@ -0,0 +1,132 @@
+
+#include <unistd.h>
+#include <traceevent/event-parse.h>
+#include "thread_map.h"
+#include "evsel.h"
+#include "debug.h"
+#include "tests.h"
+
+/*
+ * This test creates following events:
+ *
+ * 1) tracepoint sys_enter_openat
+ * 2) tracepoint sys_enter_close
+ * 3) HW event instruction
+ *
+ * Events 1) and 2) are set to toggle ON and OFF
+ * respectively event 3).
+ * All events are created with inherit flag set.
+ *
+ * Workload executes test__toggle_event_raw_arch
+ * first in parent and then in the child. After
+ * the child is finished, we check we got 10
+ * instructions instead of 5, that means plus extra
+ * 5 from the child.
+ *
+ * Workload in test__toggle_event_raw_arch:
+ * - executes openat syscall
+ * - executes 5 instructions
+ * - executes close syscall
+ *
+ * We read instruction event to validate
+ * we counted 10 instructions (5 per process).
+ *
+ */
+
+extern int test__toggle_event_raw_arch(void);
+
+static int get_tp_id(const char *name)
+{
+ struct event_format *tp_format = event_format__new("syscalls", name);
+ u64 id = 0;
+
+ if (tp_format) {
+ id = tp_format->id;
+ pevent_free_format(tp_format);
+ }
+
+ return id;
+}
+
+#ifndef __x86_64__
+int test__toggle_event_inherit(void)
+{
+ pr_err("The toggle event test not implemented for arch.\n");
+ return 0;
+}
+#else
+int test__toggle_event_inherit(void)
+{
+ struct perf_event_attr attr_on = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_openat"),
+ .sample_period = 1,
+ .inherit = 1,
+ };
+ struct perf_event_attr attr_off = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_close"),
+ .sample_period = 1,
+ .inherit = 1,
+ };
+ struct perf_event_attr attr_instr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_INSTRUCTIONS,
+ .paused = 1,
+ .exclude_kernel = 1,
+ .exclude_hv = 1,
+ .inherit = 1,
+ };
+ int fd_on, fd_off, fd_instr;
+ __u64 value, instr;
+ int err, status;
+
+ fd_instr = sys_perf_event_open(&attr_instr, 0, -1, -1, 0);
+ if (fd_instr < 0) {
+ pr_err("failed to open instruction event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_on = sys_perf_event_open(&attr_on, 0, -1,
+ fd_instr,
+ PERF_FLAG_TOGGLE_ON);
+ if (fd_on < 0) {
+ pr_err("failed to open 'on' event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_off = sys_perf_event_open(&attr_off, 0, -1,
+ fd_instr,
+ PERF_FLAG_TOGGLE_OFF);
+ if (fd_off < 0) {
+ pr_err("failed to open 'off' event, errno %d\n", errno);
+ return -1;
+ }
+
+ instr = test__toggle_event_raw_arch();
+
+ err = fork();
+ if (err == 0) {
+ test__toggle_event_raw_arch();
+ _exit(0);
+ } else if (err < 0) {
+ pr_err("fork failed\n");
+ return -1;
+ }
+
+ waitpid(-1, &status, 0);
+
+ close(fd_on);
+ close(fd_off);
+
+ if (sizeof(value) != read(fd_instr, &value, sizeof(value)))
+ pr_err("failed to read instruction event, errno %d\n", errno);
+
+ instr *= 2;
+
+ pr_debug("got count %llu vs %llu\n", value, instr);
+
+ close(fd_instr);
+ return instr != value;
+}
+#endif /* __x86_64__ */
--
1.7.11.7

2013-09-25 12:51:38

by Jiri Olsa

Subject: [PATCH 13/21] perf tests: Adding event simple toggling test

Adding a toggle interface test into the automated suite.

The test creates a HW userspace instruction counter, which is
enabled by the 'openat' tracepoint and disabled by the 'close'
tracepoint.

The test compares the number of userspace instructions
measured by the counter with the expected count.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 4 +
tools/perf/arch/x86/tests/toggle-event-raw-64.S | 28 +++++++
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/toggle-event-raw.c | 106 ++++++++++++++++++++++++
5 files changed, 143 insertions(+)
create mode 100644 tools/perf/arch/x86/tests/toggle-event-raw-64.S
create mode 100644 tools/perf/tests/toggle-event-raw.c

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 430878a..2072389 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -398,6 +398,10 @@ endif
LIB_OBJS += $(OUTPUT)tests/code-reading.o
LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
+LIB_OBJS += $(OUTPUT)tests/toggle-event-raw.o
+ifeq ($(RAW_ARCH),x86_64)
+LIB_OBJS += $(OUTPUT)arch/x86/tests/toggle-event-raw-64.o
+endif

BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/arch/x86/tests/toggle-event-raw-64.S b/tools/perf/arch/x86/tests/toggle-event-raw-64.S
new file mode 100644
index 0000000..027fccd
--- /dev/null
+++ b/tools/perf/arch/x86/tests/toggle-event-raw-64.S
@@ -0,0 +1,28 @@
+
+/*
+ * XXX I'd normally do '#include <asm/unistd_64.h>', but it's
+ * overloaded in ./util/include/asm/ with empty file. So using
+ * my own syscall defines instead for now.
+ */
+#define __NR_openat 257
+#define __NR_close 3
+
+ .global test__toggle_event_raw_arch
+test__toggle_event_raw_arch:
+ movq $__NR_openat,%rax
+ xorq %rdi, %rdi
+ xorq %rdx, %rdx
+ xorq %rcx, %rcx
+ xorq %r8, %r8
+ xorq %r9, %r9
+ syscall
+
+ nop # 1
+ nop # 2
+
+ movq $__NR_close,%rax # 3
+ movq $-1, %rdi # 4
+ syscall # 5
+
+ mov $5, %rax
+ retq
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 1e67437..db9d924b 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -116,6 +116,10 @@ static struct test {
.func = test__parse_no_sample_id_all,
},
{
+ .desc = "Toggle event raw",
+ .func = test__toggle_event_raw,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index e0ac713..4f2a8a1 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -40,5 +40,6 @@ int test__code_reading(void);
int test__sample_parsing(void);
int test__keep_tracking(void);
int test__parse_no_sample_id_all(void);
+int test__toggle_event_raw(void);

#endif /* TESTS_H */
diff --git a/tools/perf/tests/toggle-event-raw.c b/tools/perf/tests/toggle-event-raw.c
new file mode 100644
index 0000000..5d4406d
--- /dev/null
+++ b/tools/perf/tests/toggle-event-raw.c
@@ -0,0 +1,106 @@
+
+#include <traceevent/event-parse.h>
+#include "thread_map.h"
+#include "evsel.h"
+#include "debug.h"
+#include "tests.h"
+
+/*
+ * This test creates following events:
+ *
+ * 1) tracepoint sys_enter_openat
+ * 2) tracepoint sys_enter_close
+ * 3) HW event instruction
+ *
+ * Events 1) and 2) are set to toggle ON and OFF
+ * respectively event 3).
+ *
+ * Workload in test__toggle_event_raw_arch:
+ * - executes openat syscall
+ * - executes 5 instructions
+ * - executes close syscall
+ *
+ * We read instruction event to validate
+ * we counted 5 instructions.
+ *
+ */
+extern int test__toggle_event_raw_arch(void);
+
+static int get_tp_id(const char *name)
+{
+ struct event_format *tp_format = event_format__new("syscalls", name);
+ u64 id = 0;
+
+ if (tp_format) {
+ id = tp_format->id;
+ pevent_free_format(tp_format);
+ }
+
+ return id;
+}
+
+#ifndef __x86_64__
+int test__toggle_event_raw(void)
+{
+ pr_err("The toggle event test not implemented for arch.\n");
+ return 0;
+}
+#else
+int test__toggle_event_raw(void)
+{
+ struct perf_event_attr attr_on = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_openat"),
+ .sample_period = 1,
+ };
+ struct perf_event_attr attr_off = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_close"),
+ .sample_period = 1,
+ };
+ struct perf_event_attr attr_instr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_INSTRUCTIONS,
+ .paused = 1,
+ .exclude_kernel = 1,
+ .exclude_hv = 1,
+ };
+ int fd_on, fd_off, fd_instr;
+ __u64 value, instr;
+
+ fd_instr = sys_perf_event_open(&attr_instr, 0, -1, -1, 0);
+ if (fd_instr < 0) {
+ pr_err("failed to open instruction event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_on = sys_perf_event_open(&attr_on, 0, -1,
+ fd_instr,
+ PERF_FLAG_TOGGLE_ON);
+ if (fd_on < 0) {
+ pr_err("failed to open 'on' event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_off = sys_perf_event_open(&attr_off, 0, -1,
+ fd_instr,
+ PERF_FLAG_TOGGLE_OFF);
+ if (fd_off < 0) {
+ pr_err("failed to open 'off' event, errno %d\n", errno);
+ return -1;
+ }
+
+ instr = test__toggle_event_raw_arch();
+
+ close(fd_on);
+ close(fd_off);
+
+ if (sizeof(value) != read(fd_instr, &value, sizeof(value)))
+ pr_err("failed to read instruction event, errno %d\n", errno);
+
+ pr_debug("got count %llu vs %llu\n", value, instr);
+
+ close(fd_instr);
+ return instr != value;
+}
+#endif /* __x86_64__ */
--
1.7.11.7

2013-09-25 12:51:54

by Jiri Olsa

Subject: [PATCH 19/21] perf tools: Carry term string value for symbols events

Currently only the numeric value of the event is carried
for 'value_sym' related events.

The toggle on/off term interface also needs the string
representation of the symbol, so that it can match the
proper event name.
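
As a rough illustration of the idea (a standalone sketch, not the perf
code itself): the lexer packs type and config into one number, and the
matched text is carried next to it. The names sym_value, sym_pack,
sym_type and sym_config are hypothetical; only the (type << 16) + config
packing and the >> 16 / & 255 decoding are taken from the diff below.
The sketch copies the text with malloc/memcpy where the patch uses strdup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the new 'sym' member of the parser union. */
struct sym_value {
	uint64_t num;	/* (type << 16) + config, as computed in sym() */
	char *str;	/* private copy of the matched text, e.g. "cycles" */
};

/* Pack type/config and keep a copy of the symbol text. */
static struct sym_value sym_pack(int type, int config, const char *text)
{
	struct sym_value v;
	size_t len;

	assert(text != NULL);
	len = strlen(text) + 1;

	v.num = ((uint64_t)type << 16) + (uint64_t)config;
	v.str = malloc(len);	/* caller frees; NULL if allocation fails */
	if (v.str)
		memcpy(v.str, text, len);
	return v;
}

/* Decode the event type from the packed number. */
static int sym_type(const struct sym_value *v)
{
	return (int)(v->num >> 16);
}

/* Decode the event config from the packed number. */
static int sym_config(const struct sym_value *v)
{
	return (int)(v->num & 255);
}
```

With this shape, a rule that only needs the number reads it via
sym_type()/sym_config(), while the on/off term rule can use the str
member to name the event to be toggled.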

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/parse-events.l | 4 +++-
tools/perf/util/parse-events.y | 28 ++++++++++++++++++++--------
2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 91346b7..560ca86 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -54,8 +54,10 @@ static int str(yyscan_t scanner, int token)
static int sym(yyscan_t scanner, int type, int config)
{
YYSTYPE *yylval = parse_events_get_lval(scanner);
+ char *text = parse_events_get_text(scanner);

- yylval->num = (type << 16) + config;
+ yylval->sym.num = (type << 16) + config;
+ yylval->sym.str = strdup(text);
return type == PERF_TYPE_HARDWARE ? PE_VALUE_SYM_HW : PE_VALUE_SYM_SW;
}

diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 1497a70..ca93b72 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -48,8 +48,8 @@ static inc_group_count(struct list_head *list,
%token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
%token PE_ERROR
%type <num> PE_VALUE
-%type <num> PE_VALUE_SYM_HW
-%type <num> PE_VALUE_SYM_SW
+%type <sym> PE_VALUE_SYM_HW
+%type <sym> PE_VALUE_SYM_SW
%type <num> PE_RAW
%type <num> PE_TERM
%type <str> PE_NAME
@@ -58,7 +58,7 @@ static inc_group_count(struct list_head *list,
%type <str> PE_MODIFIER_EVENT
%type <str> PE_MODIFIER_BP
%type <str> PE_EVENT_NAME
-%type <num> value_sym
+%type <sym> value_sym
%type <head> event_config_optional
%type <head> event_config
%type <term> event_term
@@ -80,8 +80,12 @@ static inc_group_count(struct list_head *list,

%union
{
- char *str;
+ struct {
+ u64 num;
+ char *str;
+ } sym;
u64 num;
+ char *str;
struct list_head *head;
struct parse_events_term *term;
}
@@ -235,8 +239,8 @@ value_sym event_config_optional
struct parse_events_evlist *data = _data;
struct list_head *list;
struct list_head *terms = $2;
- int type = $1 >> 16;
- int config = $1 & 255;
+ int type = $1.num >> 16;
+ int config = $1.num & 255;

ALLOC_LIST(list);
ABORT_ON(parse_events_add_numeric(list, &data->idx,
@@ -384,7 +388,7 @@ PE_NAME '=' PE_VALUE
PE_NAME '=' PE_VALUE_SYM_HW
{
struct parse_events_term *term;
- int config = $3 & 255;
+ int config = $3.num & 255;

ABORT_ON(parse_events_term__sym_hw(&term, $1, config));
$$ = term;
@@ -402,7 +406,7 @@ PE_NAME
PE_VALUE_SYM_HW
{
struct parse_events_term *term;
- int config = $1 & 255;
+ int config = $1.num & 255;

ABORT_ON(parse_events_term__sym_hw(&term, NULL, config));
$$ = term;
@@ -416,6 +420,14 @@ PE_TERM '=' PE_NAME
$$ = term;
}
|
+PE_TERM '=' value_sym
+{
+ struct parse_events_term *term;
+
+ ABORT_ON(parse_events_term__str(&term, (int)$1, NULL, $3.str));
+ $$ = term;
+}
+|
PE_TERM '=' PE_VALUE
{
struct parse_events_term *term;
--
1.7.11.7

2013-09-25 12:52:00

by Jiri Olsa

Subject: [PATCH 21/21] perf tools: Add record/stat support for toggling events

Adding support for toggling events to the record and stat
commands. It's now possible to measure/sample code
bounded by events.

The toggling events are defined via on/off terms,
assigned with the name of the event they should
toggle.
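
The name lookup can be pictured with the following standalone sketch.
It is a simplified model, not the evlist code: struct event, find_event
and mark_toggled are hypothetical names, and the only behaviour borrowed
from the patch is that the target is found by substring match on the
event name and that the first match wins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of an event as seen by the toggle setup. */
struct event {
	const char *name;		/* e.g. "cycles" or "probe:fork_entry" */
	const char *toggle_name;	/* target named by on=/off=, or NULL */
	bool is_toggled;		/* set when another event toggles this one */
};

/* Return the first event whose name contains 'name', or NULL. */
static struct event *find_event(struct event *evs, size_t n, const char *name)
{
	size_t i;

	assert(evs != NULL && name != NULL);

	for (i = 0; i < n; i++)
		if (strstr(evs[i].name, name))
			return &evs[i];

	return NULL;
}

/*
 * Mark every event that is the target of an on=/off= term.
 * Returns the number of terms whose target could not be found.
 */
static int mark_toggled(struct event *evs, size_t n)
{
	size_t i;
	int missing = 0;

	for (i = 0; i < n; i++) {
		struct event *target;

		if (!evs[i].toggle_name)
			continue;

		target = find_event(evs, n, evs[i].toggle_name);
		if (!target) {
			missing++;
			continue;
		}

		target->is_toggled = true;
	}

	return missing;
}
```

Because the match is a substring match, an event listed earlier whose
name merely contains the requested name would be picked first, so the
order of events on the command line matters in this model.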

- Example: using k(ret)probes:
Define toggle(on/off) events:
# perf probe -a fork_entry=do_fork
# perf probe -a fork_exit=do_fork%return

The following record session samples only within the do_fork function:
# perf record -g -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
perf bench sched messaging

The following stat session measures cycles within the do_fork function:
# perf stat -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
perf bench sched messaging

# Running sched/messaging benchmark...
# 20 sender and receiver processes per group
# 1 groups == 40 processes run

Total time: 0.073 [sec]

Performance counter stats for './perf bench sched messaging -g 1':

20,935,464 cycles # 0.000 GHz
18,897 cache-misses
40 probe:fork_entry
40 probe:fork_exit

0.086319682 seconds time elapsed

- Example: using u(ret)probes:
Sample program:
---
void krava(void)
{
asm volatile ("nop; nop");
}

int main(void)
{
krava();
return 0;
}
---

Define toggle(on/off) events:
# perf probe -x ./ex entry=krava
# perf probe -x ./ex exit=krava%return

The following stat session measures instructions within the krava function:
# perf stat -e instructions:u,probe_ex:entry/on=instructions/,probe_ex:exit/off=instructions/ ./ex

Performance counter stats for './ex':

9 instructions:u # 0.00 insns per cycle
1 probe_ex:entry
1 probe_ex:exit

0.000556743 seconds time elapsed

The following stat session measures cycles, instructions and cache-misses
within the krava function:
# perf stat -e '{cycles,instructions,cache-misses}:u,probe_ex:entry/on=cycles/,probe_ex:exit/off=cycles/' ./ex

Performance counter stats for './ex':

2,068 cycles # 0.000 GHz
9 instructions # 0.00 insns per cycle
0 cache-misses
1 probe_ex:entry
1 probe_ex:exit

0.000557504 seconds time elapsed

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-record.c | 7 ++++
tools/perf/builtin-stat.c | 12 +++++++
tools/perf/util/evlist.c | 87 +++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 2 ++
tools/perf/util/evsel.c | 4 +++
tools/perf/util/evsel.h | 1 +
tools/perf/util/record.c | 2 ++
7 files changed, 115 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index da13840..a41d63c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -225,6 +225,13 @@ try_again:
goto out;
}

+ if (perf_evlist__apply_toggle(evlist)) {
+ error("failed to set toggling %d (%s)\n", errno,
+ strerror(errno));
+ rc = -1;
+ goto out;
+ }
+
if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
if (errno == EPERM) {
pr_err("Permission error mapping pages.\n"
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f686d5f..86729ae 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -242,6 +242,7 @@ static void perf_stat__reset_stats(struct perf_evlist *evlist)
static int create_perf_stat_counter(struct perf_evsel *evsel)
{
struct perf_event_attr *attr = &evsel->attr;
+ struct perf_evsel *leader = evsel->leader;

if (scale)
attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
@@ -249,6 +250,9 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)

attr->inherit = !no_inherit;

+ if (leader->is_toggled)
+ attr->paused = 1;
+
if (perf_target__has_cpu(&target))
return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));

@@ -462,6 +466,8 @@ static int __run_perf_stat(int argc, const char **argv)
if (group)
perf_evlist__set_leader(evsel_list);

+ perf_evlist__mark_toggled(evsel_list);
+
list_for_each_entry(counter, &evsel_list->entries, node) {
if (create_perf_stat_counter(counter) < 0) {
/*
@@ -496,6 +502,12 @@ static int __run_perf_stat(int argc, const char **argv)
return -1;
}

+ if (perf_evlist__apply_toggle(evsel_list)) {
+ error("failed to set toggling with %d (%s)\n", errno,
+ strerror(errno));
+ return -1;
+ }
+
/*
* Enable counters and exec the command:
*/
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c4d382d..eda7907 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -16,6 +16,7 @@
#include "evsel.h"
#include "debug.h"
#include <unistd.h>
+#include "asm/bug.h"

#include "parse-events.h"
#include "parse-options.h"
@@ -819,6 +820,91 @@ void perf_evlist__delete_maps(struct perf_evlist *evlist)
evlist->threads = NULL;
}

+static struct perf_evsel *
+perf_evlist__find_evsel_by_name(struct perf_evlist *evlist, char *name)
+{
+ struct perf_evsel *evsel;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ if (strstr(perf_evsel__name(evsel), name))
+ return evsel;
+
+ return NULL;
+}
+
+static int apply_toggle(struct perf_evsel *evsel, struct perf_evsel *toggled,
+ int ncpus, int nthreads)
+{
+ int cpu, thread, err = 0;
+
+ for (cpu = 0; !err && cpu < ncpus; cpu++) {
+ for (thread = 0; thread < nthreads; thread++) {
+ int fd = FD(evsel, cpu, thread);
+ u64 args[2] = {
+ FD(toggled, cpu, thread),
+ evsel->toggle_flag
+ };
+
+ err = ioctl(fd, PERF_EVENT_IOC_SET_TOGGLE, args);
+ if (err)
+ break;
+ }
+ }
+
+ return err;
+}
+
+int perf_evlist__apply_toggle(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ int err = 0;
+ const int ncpus = cpu_map__nr(evlist->cpus),
+ nthreads = thread_map__nr(evlist->threads);
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ struct perf_evsel *toggled;
+
+ if (!evsel->toggle_flag)
+ continue;
+
+ toggled = perf_evlist__find_evsel_by_name(evlist,
+ evsel->toggle_name);
+ if (WARN_ONCE(!toggled, "toggle apply: internal error\n"))
+ return -1;
+
+ pr_debug("toggle: %s toggles %s %s\n",
+ perf_evsel__name(evsel),
+ evsel->toggle_flag == PERF_FLAG_TOGGLE_ON ?
+ "ON" : "OFF",
+ perf_evsel__name(toggled));
+
+ err = apply_toggle(evsel, toggled, ncpus, nthreads);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
+void perf_evlist__mark_toggled(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ struct perf_evsel *toggled;
+
+ if (!evsel->toggle_flag)
+ continue;
+
+ toggled = perf_evlist__find_evsel_by_name(evlist,
+ evsel->toggle_name);
+ if (WARN_ONCE(!toggled, "toggle mark: internal error\n"))
+ continue;
+
+ toggled->is_toggled = true;
+ }
+}
+
int perf_evlist__apply_filters(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
@@ -827,6 +913,7 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist)
nthreads = thread_map__nr(evlist->threads);

list_for_each_entry(evsel, &evlist->entries, node) {
+
if (evsel->filter == NULL)
continue;

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0dbd8f8..eb77c81 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -135,6 +135,8 @@ static inline void perf_evlist__set_maps(struct perf_evlist *evlist,
int perf_evlist__create_maps(struct perf_evlist *evlist,
struct perf_target *target);
void perf_evlist__delete_maps(struct perf_evlist *evlist);
+void perf_evlist__mark_toggled(struct perf_evlist *evlist);
+int perf_evlist__apply_toggle(struct perf_evlist *evlist);
int perf_evlist__apply_filters(struct perf_evlist *evlist);

void __perf_evlist__set_leader(struct list_head *list);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3ed7947..4e6db1c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -696,6 +696,9 @@ void perf_evsel__config(struct perf_evsel *evsel,
*/
if (perf_target__none(&opts->target) && perf_evsel__is_group_leader(evsel))
attr->enable_on_exec = 1;
+
+ if (leader->is_toggled)
+ attr->paused = 1;
}

int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
@@ -984,6 +987,7 @@ static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
ret += PRINT_ATTR2(exclude_host, exclude_guest);
ret += PRINT_ATTR2N("excl.callchain_kern", exclude_callchain_kernel,
"excl.callchain_user", exclude_callchain_user);
+ ret += PRINT_ATTR2(paused, paused);

ret += PRINT_ATTR_U32(wakeup_events);
ret += PRINT_ATTR_U32(wakeup_watermark);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e70415b..69f4183 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -93,6 +93,7 @@ struct perf_evsel {
/* toggle event config */
char toggle_flag;
char *toggle_name;
+ bool is_toggled;
};

#define hists_to_evsel(h) container_of(h, struct perf_evsel, hists)
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 18d73aa..7f7eeb4 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -88,6 +88,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->cpus->map[0] < 0)
opts->no_inherit = true;

+ perf_evlist__mark_toggled(evlist);
+
list_for_each_entry(evsel, &evlist->entries, node)
perf_evsel__config(evsel, opts);

--
1.7.11.7

2013-09-25 12:52:19

by Jiri Olsa

Subject: [PATCH 20/21] perf tools: Add support to parse event on/off toggle terms

Adding parsing support for 'on' and 'off' terms within the
event syntax. We can now specify on/off terms like:

-e 'cycles,irq_entry/on=cycles/,irq_exit/off=cycles/'

Only a string value is accepted for both terms. The name
is used to look up the toggled event.
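
A minimal sketch of what a single term carries, written as a standalone
string parser. This is only an illustration: in the patch the terms are
recognized by the lexer and grammar, not by string splitting. The names
toggle_cfg and parse_toggle_term are hypothetical, and the TOGGLE_*
values are placeholders rather than the real PERF_FLAG_TOGGLE_* flags.

```c
#include <assert.h>
#include <string.h>

/* Placeholder values; the real flags are defined by the kernel patches. */
enum toggle_flag { TOGGLE_NONE = 0, TOGGLE_ON, TOGGLE_OFF };

struct toggle_cfg {
	enum toggle_flag flag;
	char name[64];		/* name of the event to be toggled */
};

/*
 * Parse one "on=<event>" or "off=<event>" term.
 * Returns 0 on success and -1 for an unknown key, an empty or oversized
 * name, or when a toggle term was already set on this event (the cases
 * the patch rejects with -EINVAL).
 */
static int parse_toggle_term(struct toggle_cfg *cfg, const char *term)
{
	const char *eq;
	size_t klen, vlen;

	assert(cfg != NULL && term != NULL);

	if (cfg->flag != TOGGLE_NONE)
		return -1;

	eq = strchr(term, '=');
	if (!eq)
		return -1;

	klen = (size_t)(eq - term);
	vlen = strlen(eq + 1);
	if (vlen == 0 || vlen >= sizeof(cfg->name))
		return -1;

	if (klen == 2 && strncmp(term, "on", 2) == 0)
		cfg->flag = TOGGLE_ON;
	else if (klen == 3 && strncmp(term, "off", 3) == 0)
		cfg->flag = TOGGLE_OFF;
	else
		return -1;

	memcpy(cfg->name, eq + 1, vlen + 1);	/* includes the trailing NUL */
	return 0;
}
```

For 'irq_entry/on=cycles/' this yields flag TOGGLE_ON and name "cycles";
a second on/off term on the same event is refused.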

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evsel.c | 1 +
tools/perf/util/evsel.h | 3 ++
tools/perf/util/parse-events.c | 69 +++++++++++++++++++++++++++++++++++-------
tools/perf/util/parse-events.h | 5 ++-
tools/perf/util/parse-events.l | 2 ++
tools/perf/util/parse-events.y | 6 ++--
6 files changed, 71 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 95590fe..3ed7947 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -820,6 +820,7 @@ void perf_evsel__delete(struct perf_evsel *evsel)
free(evsel->group_name);
if (evsel->tp_format)
pevent_free_format(evsel->tp_format);
+ free(evsel->toggle_name);
free(evsel->name);
free(evsel);
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4a7bdc7..e70415b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -90,6 +90,9 @@ struct perf_evsel {
int sample_read;
struct perf_evsel *leader;
char *group_name;
+ /* toggle event config */
+ char toggle_flag;
+ char *toggle_name;
};

#define hists_to_evsel(h) container_of(h, struct perf_evsel, hists)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 899c59e..4e8243f 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -267,7 +267,42 @@ const char *event_type(int type)
return "unknown";
}

+static int config_evsel_term(struct perf_evsel *evsel,
+ struct parse_events_term *term)
+{
+ if (evsel->toggle_name)
+ return -EINVAL;

+ switch (term->type_term) {
+ case PARSE_EVENTS__TERM_TYPE_TOGGLE_ON:
+ evsel->toggle_flag = PERF_FLAG_TOGGLE_ON;
+ break;
+ case PARSE_EVENTS__TERM_TYPE_TOGGLE_OFF:
+ evsel->toggle_flag = PERF_FLAG_TOGGLE_OFF;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ evsel->toggle_name = strdup(term->val.str);
+ return 0;
+}
+
+static int config_evsel(struct perf_evsel *evsel,
+ struct list_head *head)
+{
+ struct parse_events_term *term;
+
+ list_for_each_entry(term, head, list) {
+ int ret;
+
+ ret = config_evsel_term(evsel, term);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}

static int __add_event(struct list_head *list, int *idx,
struct perf_event_attr *attr,
@@ -374,7 +409,8 @@ int parse_events_add_cache(struct list_head *list, int *idx,
}

static int add_tracepoint(struct list_head *list, int *idx,
- char *sys_name, char *evt_name)
+ char *sys_name, char *evt_name,
+ struct list_head *terms)
{
struct perf_evsel *evsel;

@@ -382,13 +418,20 @@ static int add_tracepoint(struct list_head *list, int *idx,
if (!evsel)
return -ENOMEM;

+ if (terms && config_evsel(evsel, terms)) {
+ perf_evsel__delete(evsel);
+ free(list);
+ return -EINVAL;
+ }
+
list_add_tail(&evsel->node, list);

return 0;
}

static int add_tracepoint_multi_event(struct list_head *list, int *idx,
- char *sys_name, char *evt_name)
+ char *sys_name, char *evt_name,
+ struct list_head *terms)
{
char evt_path[MAXPATHLEN];
struct dirent *evt_ent;
@@ -412,7 +455,8 @@ static int add_tracepoint_multi_event(struct list_head *list, int *idx,
if (!strglobmatch(evt_ent->d_name, evt_name))
continue;

- ret = add_tracepoint(list, idx, sys_name, evt_ent->d_name);
+ ret = add_tracepoint(list, idx, sys_name,
+ evt_ent->d_name, terms);
}

closedir(evt_dir);
@@ -420,15 +464,17 @@ static int add_tracepoint_multi_event(struct list_head *list, int *idx,
}

static int add_tracepoint_event(struct list_head *list, int *idx,
- char *sys_name, char *evt_name)
+ char *sys_name, char *evt_name,
+ struct list_head *terms)
{
return strpbrk(evt_name, "*?") ?
- add_tracepoint_multi_event(list, idx, sys_name, evt_name) :
- add_tracepoint(list, idx, sys_name, evt_name);
+ add_tracepoint_multi_event(list, idx, sys_name, evt_name, terms) :
+ add_tracepoint(list, idx, sys_name, evt_name, terms);
}

static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
- char *sys_name, char *evt_name)
+ char *sys_name, char *evt_name,
+ struct list_head *terms)
{
struct dirent *events_ent;
DIR *events_dir;
@@ -452,7 +498,7 @@ static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
continue;

ret = add_tracepoint_event(list, idx, events_ent->d_name,
- evt_name);
+ evt_name, terms);
}

closedir(events_dir);
@@ -460,7 +506,8 @@ static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
}

int parse_events_add_tracepoint(struct list_head *list, int *idx,
- char *sys, char *event)
+ char *sys, char *event,
+ struct list_head *terms)
{
int ret;

@@ -469,9 +516,9 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
return ret;

if (strpbrk(sys, "*?"))
- return add_tracepoint_multi_sys(list, idx, sys, event);
+ return add_tracepoint_multi_sys(list, idx, sys, event, terms);
else
- return add_tracepoint_event(list, idx, sys, event);
+ return add_tracepoint_event(list, idx, sys, event, terms);
}

static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index a9db24f..8bd5995 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -49,6 +49,8 @@ enum {
PARSE_EVENTS__TERM_TYPE_NAME,
PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD,
PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE,
+ PARSE_EVENTS__TERM_TYPE_TOGGLE_ON,
+ PARSE_EVENTS__TERM_TYPE_TOGGLE_OFF,
};

struct parse_events_term {
@@ -86,7 +88,8 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add);
int parse_events__modifier_group(struct list_head *list, char *event_mod);
int parse_events_name(struct list_head *list, char *name);
int parse_events_add_tracepoint(struct list_head *list, int *idx,
- char *sys, char *event);
+ char *sys, char *event,
+ struct list_head *terms);
int parse_events_add_numeric(struct list_head *list, int *idx,
u32 type, u64 config,
struct list_head *terms);
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 560ca86..afcc0d0 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -171,6 +171,8 @@ config2 { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG2); }
name { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NAME); }
period { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); }
branch_type { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE); }
+on { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_TOGGLE_ON); }
+off { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_TOGGLE_OFF); }
, { return ','; }
"/" { BEGIN(INITIAL); return '/'; }
{name_minus} { return str(yyscanner, PE_NAME); }
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index ca93b72..7692562 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -200,7 +200,7 @@ event_def: event_pmu |
event_legacy_symbol |
event_legacy_cache sep_dc |
event_legacy_mem |
- event_legacy_tracepoint sep_dc |
+ event_legacy_tracepoint |
event_legacy_numeric sep_dc |
event_legacy_raw sep_dc

@@ -305,13 +305,13 @@ PE_PREFIX_MEM PE_VALUE sep_dc
}

event_legacy_tracepoint:
-PE_NAME ':' PE_NAME
+PE_NAME ':' PE_NAME event_config_optional
{
struct parse_events_evlist *data = _data;
struct list_head *list;

ALLOC_LIST(list);
- ABORT_ON(parse_events_add_tracepoint(list, &data->idx, $1, $3));
+ ABORT_ON(parse_events_add_tracepoint(list, &data->idx, $1, $3, $4));
$$ = list;
}

--
1.7.11.7

2013-09-25 12:52:47

by Jiri Olsa

Subject: [PATCH 17/21] perf tools: Add event_config_optional parsing rule

Adding 'event_config_optional' parsing rule to avoid
duplicated code in event_legacy_symbol for the /config/
and no-config cases.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/parse-events.c | 3 +++
tools/perf/util/parse-events.y | 34 ++++++++++++++++++----------------
2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1957849..37b9cb7 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1328,6 +1328,9 @@ void parse_events__free_terms(struct list_head *terms)
{
struct parse_events_term *term, *h;

+ if (!terms)
+ return;
+
list_for_each_entry_safe(term, h, terms, list)
free(term);
}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 4eb67ec..1497a70 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -59,6 +59,7 @@ static inc_group_count(struct list_head *list,
%type <str> PE_MODIFIER_BP
%type <str> PE_EVENT_NAME
%type <num> value_sym
+%type <head> event_config_optional
%type <head> event_config
%type <term> event_term
%type <head> event_pmu
@@ -199,6 +200,17 @@ event_def: event_pmu |
event_legacy_numeric sep_dc |
event_legacy_raw sep_dc

+event_config_optional:
+'/' event_config '/'
+{
+ $$ = $2;
+}
+|
+sep_slash_dc
+{
+ $$ = NULL;
+}
+
event_pmu:
PE_NAME '/' event_config '/'
{
@@ -208,6 +220,7 @@ PE_NAME '/' event_config '/'
ALLOC_LIST(list);
ABORT_ON(parse_events_add_pmu(list, &data->idx, $1, $3));
parse_events__free_terms($3);
+ free($3);
$$ = list;
}

@@ -217,30 +230,19 @@ PE_VALUE_SYM_HW
PE_VALUE_SYM_SW

event_legacy_symbol:
-value_sym '/' event_config '/'
-{
- struct parse_events_evlist *data = _data;
- struct list_head *list;
- int type = $1 >> 16;
- int config = $1 & 255;
-
- ALLOC_LIST(list);
- ABORT_ON(parse_events_add_numeric(list, &data->idx,
- type, config, $3));
- parse_events__free_terms($3);
- $$ = list;
-}
-|
-value_sym sep_slash_dc
+value_sym event_config_optional
{
struct parse_events_evlist *data = _data;
struct list_head *list;
+ struct list_head *terms = $2;
int type = $1 >> 16;
int config = $1 & 255;

ALLOC_LIST(list);
ABORT_ON(parse_events_add_numeric(list, &data->idx,
- type, config, NULL));
+ type, config, terms));
+ parse_events__free_terms(terms);
+ free(terms);
$$ = list;
}

--
1.7.11.7

2013-09-25 12:52:46

by Jiri Olsa

Subject: [PATCH 18/21] perf tools: Rename term related parsing function/variable properly

The config_attr_term name is more suitable for the function
as it configures perf_event_attr data using term.

Also use the more suitable name 'terms' for the list head
of terms, instead of 'head_config'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/parse-events.c | 22 +++++++++++-----------
tools/perf/util/parse-events.h | 4 ++--
2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 37b9cb7..899c59e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -540,8 +540,8 @@ int parse_events_add_breakpoint(struct list_head *list, int *idx,
return add_event(list, idx, &attr, NULL);
}

-static int config_term(struct perf_event_attr *attr,
- struct parse_events_term *term)
+static int config_attr_term(struct perf_event_attr *attr,
+ struct parse_events_term *term)
{
#define CHECK_TYPE_VAL(type) \
do { \
@@ -608,7 +608,7 @@ static int config_attr(struct perf_event_attr *attr,
struct parse_events_term *term;

list_for_each_entry(term, head, list)
- if (config_term(attr, term) && fail)
+ if (config_attr_term(attr, term) && fail)
return -EINVAL;

return 0;
@@ -616,7 +616,7 @@ static int config_attr(struct perf_event_attr *attr,

int parse_events_add_numeric(struct list_head *list, int *idx,
u32 type, u64 config,
- struct list_head *head_config)
+ struct list_head *terms)
{
struct perf_event_attr attr;

@@ -624,15 +624,15 @@ int parse_events_add_numeric(struct list_head *list, int *idx,
attr.type = type;
attr.config = config;

- if (head_config &&
- config_attr(&attr, head_config, 1))
+ if (terms &&
+ config_attr(&attr, terms, 1))
return -EINVAL;

return add_event(list, idx, &attr, pmu_event_name(terms));
}

int parse_events_add_pmu(struct list_head *list, int *idx,
- char *name, struct list_head *head_config)
+ char *name, struct list_head *terms)
{
struct perf_event_attr attr;
struct perf_pmu *pmu;
@@ -643,19 +643,19 @@ int parse_events_add_pmu(struct list_head *list, int *idx,

memset(&attr, 0, sizeof(attr));

- if (perf_pmu__check_alias(pmu, head_config))
+ if (perf_pmu__check_alias(pmu, terms))
return -EINVAL;

/*
* Configure hardcoded terms first, no need to check
* return value when called with fail == 0 ;)
*/
- config_attr(&attr, head_config, 0);
+ config_attr(&attr, terms, 0);

- if (perf_pmu__config(pmu, &attr, head_config))
+ if (perf_pmu__config(pmu, &attr, terms))
return -EINVAL;

- return __add_event(list, idx, &attr, pmu_event_name(head_config),
+ return __add_event(list, idx, &attr, pmu_event_name(terms),
pmu->cpus);
}

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1cb4c4..a9db24f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -89,13 +89,13 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
char *sys, char *event);
int parse_events_add_numeric(struct list_head *list, int *idx,
u32 type, u64 config,
- struct list_head *head_config);
+ struct list_head *terms);
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2);
int parse_events_add_breakpoint(struct list_head *list, int *idx,
void *ptr, char *type);
int parse_events_add_pmu(struct list_head *list, int *idx,
- char *pmu , struct list_head *head_config);
+ char *pmu , struct list_head *terms);
void parse_events__set_leader(char *name, struct list_head *list);
void parse_events_update_lists(struct list_head *list_event,
struct list_head *list_all);
--
1.7.11.7

2013-09-25 12:51:41

by Jiri Olsa

Subject: [PATCH 14/21] perf tests: Adding event group toggling test

Adding ioctl toggle test to measure instructions
enclosed within following syscall groups:

getuid
openat
--> measure instructions
close
getppid

This shows and tests that multiple togglers can be chained.
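
The chaining can be modelled in plain C without any perf calls. The
sketch below is a toy simulation under stated assumptions, not the test
itself: all sim_* names are hypothetical, one "instruction" is one
sim_fire() on the instructions event, and a group member is assumed to
be paused whenever its leader is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sim_action { SIM_NONE, SIM_ON, SIM_OFF };

struct sim_event {
	bool paused;
	enum sim_action action;		/* what firing does to the target */
	struct sim_event *target;	/* event being toggled, or NULL */
	struct sim_event *leader;	/* group leader, or NULL */
	unsigned long count;
};

/* Model assumption: a group member follows its leader's paused state. */
static bool sim_paused(const struct sim_event *ev)
{
	return ev->leader ? ev->leader->paused : ev->paused;
}

/* Fire an event: count it and toggle its target, unless it is paused. */
static void sim_fire(struct sim_event *ev)
{
	assert(ev != NULL);

	if (sim_paused(ev))
		return;

	ev->count++;
	if (!ev->target)
		return;

	if (ev->action == SIM_ON)
		ev->target->paused = false;
	else if (ev->action == SIM_OFF)
		ev->target->paused = true;
}

static void sim_fire_n(struct sim_event *ev, unsigned long n)
{
	while (n--)
		sim_fire(ev);
}

/* Replay the workload and return how many instructions were counted. */
static unsigned long sim_workload(unsigned long inside, unsigned long outside)
{
	struct sim_event instr = { .paused = true };
	struct sim_event tp_openat = { .paused = true, .action = SIM_ON,
				       .target = &instr };
	struct sim_event tp_close = { .paused = true, .action = SIM_OFF,
				      .target = &instr, .leader = &tp_openat };
	struct sim_event tp_getuid = { .action = SIM_ON, .target = &tp_openat };
	struct sim_event tp_getppid = { .action = SIM_OFF, .target = &tp_openat };

	sim_fire_n(&instr, outside);	/* not counted: instr is paused */
	sim_fire(&tp_openat);		/* ignored: group not started yet */
	sim_fire(&tp_getuid);		/* starts the openat/close group */
	sim_fire(&tp_openat);		/* starts instr */
	sim_fire_n(&instr, inside);	/* counted */
	sim_fire(&tp_close);		/* stops instr */
	sim_fire(&tp_getppid);		/* stops the group */
	sim_fire(&tp_openat);		/* ignored again */
	sim_fire_n(&instr, outside);	/* not counted */

	return instr.count;
}
```

In this model sim_workload(5, 100) counts only the 5 instructions
between openat and close, and an openat outside the getuid/getppid
window has no effect.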

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/toggle-event-group.c | 195 ++++++++++++++++++++++++++++++++++
4 files changed, 201 insertions(+)
create mode 100644 tools/perf/tests/toggle-event-group.c

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 2072389..90d7127 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -399,6 +399,7 @@ LIB_OBJS += $(OUTPUT)tests/code-reading.o
LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
LIB_OBJS += $(OUTPUT)tests/toggle-event-raw.o
+LIB_OBJS += $(OUTPUT)tests/toggle-event-group.o
ifeq ($(RAW_ARCH),x86_64)
LIB_OBJS += $(OUTPUT)arch/x86/tests/toggle-event-raw-64.o
endif
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index db9d924b..7e96550 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -120,6 +120,10 @@ static struct test {
.func = test__toggle_event_raw,
},
{
+ .desc = "Toggle event group",
+ .func = test__toggle_event_group,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 4f2a8a1..db692bf 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -41,5 +41,6 @@ int test__sample_parsing(void);
int test__keep_tracking(void);
int test__parse_no_sample_id_all(void);
int test__toggle_event_raw(void);
+int test__toggle_event_group(void);

#endif /* TESTS_H */
diff --git a/tools/perf/tests/toggle-event-group.c b/tools/perf/tests/toggle-event-group.c
new file mode 100644
index 0000000..e781b30
--- /dev/null
+++ b/tools/perf/tests/toggle-event-group.c
@@ -0,0 +1,195 @@
+#include <sys/types.h>
+#include <unistd.h>
+#include <traceevent/event-parse.h>
+#include "thread_map.h"
+#include "evsel.h"
+#include "debug.h"
+#include "tests.h"
+
+/*
+ * We want to toggle instructions on/off only after chained
+ * execution of defined tracepoints, like:
+ *
+ * getuid();
+ * openat();
+ * instructions to count
+ * close();
+ * getppid();
+ *
+ * This test creates following events:
+ *
+ * 1) tracepoint sys_enter_getuid
+ * 2) tracepoint sys_enter_getppid
+ * 3) tracepoint sys_enter_openat
+ * 4) tracepoint sys_enter_close
+ * 5) HW event instruction
+ *
+ *
+ * Events 3) and 4) are created as a group with 3) as the leader.
+ * Events 3) and 4) toggle ON and OFF respectively event 5).
+ * Events 1) and 2) toggle ON and OFF respectively event 3).
+ *
+ * This means:
+ * - when the workload executes getuid(), the group (events 3
+ * and 4) is toggled ON.
+ * - when the workload executes getppid(), the group (events 3
+ * and 4) is toggled OFF.
+ * - once started, events 3) and 4) toggle the instructions
+ * event ON and OFF respectively.
+ *
+ */
+
+extern int test__toggle_event_raw_arch(void);
+
+static int get_tp_id(const char *name)
+{
+ struct event_format *tp_format = event_format__new("syscalls", name);
+ u64 id = 0;
+
+ if (tp_format) {
+ id = tp_format->id;
+ pevent_free_format(tp_format);
+ }
+
+ return id;
+}
+
+#ifndef __x86_64__
+int test__toggle_event_group(void)
+{
+ pr_err("The toggle event test not implemented for arch.\n");
+ return 0;
+}
+#else
+
+static int test(void)
+{
+ int instr;
+
+ getuid();
+ instr = test__toggle_event_raw_arch();
+ getppid();
+ return instr;
+}
+
+#ifndef PERF_EVENT_IOC_SET_TOGGLE
+#define PERF_EVENT_IOC_SET_TOGGLE 1074275336
+#endif
+
+static int toggle_event(int fd_event, int flag, int fd_toggled)
+{
+ u64 args[2] = { fd_toggled, flag };
+ return ioctl(fd_event, PERF_EVENT_IOC_SET_TOGGLE, args);
+}
+
+int test__toggle_event_group(void)
+{
+ struct perf_event_attr attr_group_on = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_getuid"),
+ .sample_period = 1,
+ };
+ struct perf_event_attr attr_group_off = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_getppid"),
+ .sample_period = 1,
+ };
+ struct perf_event_attr attr_on = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_openat"),
+ .sample_period = 1,
+ .paused = 1,
+ };
+ struct perf_event_attr attr_off = {
+ .type = PERF_TYPE_TRACEPOINT,
+ .config = get_tp_id("sys_enter_close"),
+ .sample_period = 1,
+ .paused = 1,
+ };
+ struct perf_event_attr attr_instr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_INSTRUCTIONS,
+ .paused = 1,
+ .exclude_kernel = 1,
+ .exclude_hv = 1,
+ };
+ int fd_group_on, fd_group_off, fd_on, fd_off, fd_instr;
+ __u64 value, instr;
+
+ fd_instr = sys_perf_event_open(&attr_instr, 0, -1, -1, 0);
+ if (fd_instr < 0) {
+ pr_err("failed to open instruction event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_on = sys_perf_event_open(&attr_on, 0, -1, -1, 0);
+ if (fd_on < 0) {
+ pr_err("failed to open 'on' event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_off = sys_perf_event_open(&attr_off, 0, -1, fd_on, 0);
+ if (fd_off < 0) {
+ pr_err("failed to open 'off' event, errno %d\n", errno);
+ return -1;
+ }
+
+ if (toggle_event(fd_on, PERF_FLAG_TOGGLE_ON, fd_instr)) {
+ pr_err("failed to set toggle 'on', errno %d\n", errno);
+ return -1;
+ }
+
+ if (toggle_event(fd_off, PERF_FLAG_TOGGLE_OFF, fd_instr)) {
+ pr_err("failed to set toggle 'off', errno %d\n", errno);
+ return -1;
+ }
+
+ fd_group_on = sys_perf_event_open(&attr_group_on, 0, -1,
+ fd_on, PERF_FLAG_TOGGLE_ON);
+ if (fd_group_on < 0) {
+ pr_err("failed to open 'group_on' event, errno %d\n", errno);
+ return -1;
+ }
+
+ fd_group_off = sys_perf_event_open(&attr_group_off, 0, -1,
+ fd_on, PERF_FLAG_TOGGLE_OFF);
+ if (fd_group_off < 0) {
+ pr_err("failed to open 'group_off' event, errno %d\n", errno);
+ return -1;
+ }
+
+#define READ(i, exp) \
+do { \
+ if (sizeof(value) != read(fd_instr, &value, sizeof(value))) { \
+ pr_err("failed to read instruction event, errno %d\n", errno); \
+ return -1; \
+ } \
+ pr_debug("%d got count %llu vs %lu\n", i, value, (unsigned long) exp); \
+} while (0)
+
+
+ READ(1, 0);
+
+ test__toggle_event_raw_arch();
+
+ READ(2, 0);
+
+ instr = test();
+
+ READ(3, instr);
+
+ test__toggle_event_raw_arch();
+
+ READ(4, instr);
+
+ close(fd_on);
+ close(fd_off);
+ close(fd_group_on);
+ close(fd_group_off);
+
+ READ(5, instr);
+
+ close(fd_instr);
+ return instr != value;
+}
+#endif /* __x86_64__ */
--
1.7.11.7

2013-09-25 12:53:32

by Jiri Olsa

Subject: [PATCH 16/21] perf tools: Allow numeric event to change name via name term

Allow a numeric event to change its name via the 'name' term,
so that an event can be renamed like this:

$ ./perf stat -e 'cycles:k,cycles/name=cycles_irq/k' kill
usage: kill [ -s signal | -p ] [ -a ] pid ...
kill -l [ signal ]

Performance counter stats for 'kill':

958,972 cycles:k # 0.000 GHz
958,972 cycles_irq # 0.000 GHz

0.001047556 seconds time elapsed

It'll be useful for aliasing/distinguishing events for
on/off toggling terms.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/parse-events.c | 37 ++++++++++++++++++++-----------------
1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 9812531..1957849 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -583,6 +583,25 @@ do { \
#undef CHECK_TYPE_VAL
}

+static int parse_events__is_name_term(struct parse_events_term *term)
+{
+ return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
+}
+
+static char *pmu_event_name(struct list_head *head_terms)
+{
+ struct parse_events_term *term;
+
+ if (!head_terms)
+ return NULL;
+
+ list_for_each_entry(term, head_terms, list)
+ if (parse_events__is_name_term(term))
+ return term->val.str;
+
+ return NULL;
+}
+
static int config_attr(struct perf_event_attr *attr,
struct list_head *head, int fail)
{
@@ -609,23 +628,7 @@ int parse_events_add_numeric(struct list_head *list, int *idx,
config_attr(&attr, head_config, 1))
return -EINVAL;

- return add_event(list, idx, &attr, NULL);
-}
-
-static int parse_events__is_name_term(struct parse_events_term *term)
-{
- return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
-}
-
-static char *pmu_event_name(struct list_head *head_terms)
-{
- struct parse_events_term *term;
-
- list_for_each_entry(term, head_terms, list)
- if (parse_events__is_name_term(term))
- return term->val.str;
-
- return NULL;
+ return add_event(list, idx, &attr, pmu_event_name(head_config));
}

int parse_events_add_pmu(struct list_head *list, int *idx,
--
1.7.11.7

2013-09-25 12:54:18

by Jiri Olsa

Subject: [PATCH 11/21] perf: Split allocation and initialization code

From: Frederic Weisbecker <[email protected]>

Do this to prepare for toggle event inheritance support,
which will rely on a pre-allocated perf event.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/events/core.c | 59 ++++++++++++++++++++++++++++++++++------------------
1 file changed, 39 insertions(+), 20 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2a19b64..fa1d229 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6673,31 +6673,23 @@ static void account_event(struct perf_event *event)
account_event_cpu(event, event->cpu);
}

-/*
- * Allocate and initialize a event structure
- */
-static struct perf_event *
-perf_event_alloc(struct perf_event_attr *attr, int cpu,
- struct task_struct *task,
- struct perf_event *group_leader,
- struct perf_event *parent_event,
- perf_overflow_handler_t overflow_handler,
- void *context)
+static int perf_init_event(struct perf_event *event,
+ struct perf_event_attr *attr, int cpu,
+ struct task_struct *task,
+ struct perf_event *group_leader,
+ struct perf_event *parent_event,
+ perf_overflow_handler_t overflow_handler,
+ void *context)
{
struct pmu *pmu;
- struct perf_event *event;
struct hw_perf_event *hwc;
- long err = -EINVAL;
+ int err = -EINVAL;

if ((unsigned)cpu >= nr_cpu_ids) {
if (!task || cpu != -1)
- return ERR_PTR(-EINVAL);
+ return err;
}

- event = kzalloc(sizeof(*event), GFP_KERNEL);
- if (!event)
- return ERR_PTR(-ENOMEM);
-
/*
* Single events are their own group leaders, with an
* empty sibling list:
@@ -6791,7 +6783,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
}
}

- return event;
+ return 0;

err_pmu:
if (event->destroy)
@@ -6799,9 +6791,36 @@ err_pmu:
err_ns:
if (event->ns)
put_pid_ns(event->ns);
- kfree(event);

- return ERR_PTR(err);
+ return err;
+}
+
+/*
+ * Allocate and initialize a event structure
+ */
+static struct perf_event *
+perf_event_alloc(struct perf_event_attr *attr, int cpu,
+ struct task_struct *task,
+ struct perf_event *group_leader,
+ struct perf_event *parent_event,
+ perf_overflow_handler_t overflow_handler,
+ void *context)
+{
+ struct perf_event *event;
+ int err;
+
+ event = kzalloc(sizeof(*event), GFP_KERNEL);
+ if (!event)
+ return ERR_PTR(-ENOMEM);
+
+ err = perf_init_event(event, attr, cpu, task, group_leader,
+ parent_event, overflow_handler, context);
+ if (err) {
+ kfree(event);
+ return ERR_PTR(err);
+ }
+
+ return event;
}

static int perf_copy_attr(struct perf_event_attr __user *uattr,
--
1.7.11.7

2013-09-25 12:51:24

by Jiri Olsa

Subject: [PATCH 06/21] perf: Add event toggle ioctl interface

Add a new ioctl, PERF_EVENT_IOC_SET_TOGGLE, to configure
the toggle settings of an event.

This ioctl has 2 goals:
- allow the toggle event to be part of a group
- allow the toggle setting to be defined after the event
is created

The ioctl interface is:

u64 args[2] = { toggled_fd, flag };
err = ioctl(fd, PERF_EVENT_IOC_SET_TOGGLE, args);

Where:
toggled_fd - file descriptor of the event we want to toggle
flag - exactly one of PERF_FLAG_TOGGLE_ON or PERF_FLAG_TOGGLE_OFF
err - 0 when successful
-1 otherwise with errno:
EBUSY - the event already has a toggled event defined
EFAULT - could not copy user data
EINVAL - wrong data

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
include/uapi/linux/perf_event.h | 1 +
kernel/events/core.c | 40 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ecb0474..b941c21 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -325,6 +325,7 @@ struct perf_event_attr {
#define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
#define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
#define PERF_EVENT_IOC_ID _IOR('$', 7, u64 *)
+#define PERF_EVENT_IOC_SET_TOGGLE _IOW('$', 8, u64 *)

enum perf_event_ioc_flags {
PERF_IOC_FLAG_GROUP = 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 40c792d..b41a0d8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3575,6 +3575,7 @@ static inline int perf_fget_light(int fd, struct fd *p)
static int perf_event_set_output(struct perf_event *event,
struct perf_event *output_event);
static int perf_event_set_filter(struct perf_event *event, void __user *arg);
+static int perf_event_set_toggle_fd(struct perf_event *event, u64 __user *arg);

static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
@@ -3629,6 +3630,9 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case PERF_EVENT_IOC_SET_FILTER:
return perf_event_set_filter(event, (void __user *)arg);

+ case PERF_EVENT_IOC_SET_TOGGLE:
+ return perf_event_set_toggle_fd(event, (u64 __user *)arg);
+
default:
return -ENOTTY;
}
@@ -7017,6 +7021,42 @@ perf_event_set_toggle(struct perf_event *event,
return 0;
}

+static int perf_event_set_toggle_fd(struct perf_event *event, u64 __user *arg)
+{
+ struct perf_event *toggled_event;
+ struct fd toggled_fd = { NULL, 0 };
+ u64 fd, flag;
+ int err;
+
+ if (event->toggled_event)
+ return -EBUSY;
+
+ if (copy_from_user(&fd, arg, sizeof(fd)))
+ return -EFAULT;
+
+ if (copy_from_user(&flag, arg + 1, sizeof(flag)))
+ return -EFAULT;
+
+ err = perf_fget_light((int) fd, &toggled_fd);
+ if (err)
+ return -EINVAL;
+
+ toggled_event = toggled_fd.file->private_data;
+
+ if (!atomic_long_inc_not_zero(&toggled_event->refcount)) {
+ fdput(toggled_fd);
+ return -EINVAL;
+ }
+
+ err = perf_event_set_toggle(event, toggled_event, event->ctx, flag);
+ if (err)
+ put_event(toggled_event);
+
+ fdput(toggled_fd);
+ return err;
+}
+
+
/**
* sys_perf_event_open - open a performance event, associate it to a task/cpu
*
--
1.7.11.7

2013-09-25 12:51:22

by Jiri Olsa

Subject: [PATCH 07/21] perf: Toggle whole group in toggle event overflow

Toggle the whole group on toggle event overflow,
so that toggling can be used even for groups.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/events/core.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index b41a0d8..2c8ff93 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5267,8 +5267,8 @@ static void perf_log_throttle(struct perf_event *event, int enable)
* - fix race against other toggler
* - fix race against other callers of ->stop/start (adjust period/freq)
*/
-static void perf_event_toggle(struct perf_event *event,
- enum perf_event_toggle_flag flag)
+static void __perf_event_toggle(struct perf_event *event,
+ enum perf_event_toggle_flag flag)
{
unsigned long flags;
bool active;
@@ -5304,6 +5304,16 @@ static void perf_event_toggle(struct perf_event *event,
local_irq_restore(flags);
}

+static void perf_event_toggle(struct perf_event *leader,
+ enum perf_event_toggle_flag flag)
+{
+ struct perf_event *event;
+
+ __perf_event_toggle(leader, flag);
+ list_for_each_entry(event, &leader->sibling_list, group_entry)
+ __perf_event_toggle(event, flag);
+}
+
static void
perf_event_toggle_overflow(struct perf_event *event,
struct perf_sample_data *data,
--
1.7.11.7

2013-09-25 12:54:59

by Jiri Olsa

Subject: [PATCH 08/21] perf: Add new 'paused' attribute

Add a new 'paused' perf_event_attr attribute bit. It sets
the initial event state to paused, so the event won't be started
until it is triggered by the toggling mechanism.

Original-patch-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
include/uapi/linux/perf_event.h | 3 ++-
kernel/events/core.c | 3 +++
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b941c21..1539c47 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -278,8 +278,9 @@ struct perf_event_attr {
exclude_callchain_kernel : 1, /* exclude kernel callchains */
exclude_callchain_user : 1, /* exclude user callchains */
mmap2 : 1, /* include mmap with inode data */
+ paused : 1, /* create as paused */

- __reserved_1 : 40;
+ __reserved_1 : 39;

union {
__u32 wakeup_events; /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2c8ff93..f5f00a6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6731,6 +6731,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,

event->state = PERF_EVENT_STATE_INACTIVE;

+ if (attr->paused)
+ event->paused = true;
+
if (task) {
event->attach_state = PERF_ATTACH_TASK;

--
1.7.11.7

2013-09-25 12:55:53

by Jiri Olsa

Subject: [PATCH 05/21] perf: Add event toggle sys_perf_event_open interface

Adding perf interface that allows to create 'toggle' events,
which can enable or disable another event. Whenever the toggle
event is triggered (has overflow), it toggles another event
state and either starts or stops it.

The goal is to be able to create toggling tracepoint events
to enable and disable HW counters, but the interface is generic
enough to be used for any kind of event.

The interface to create a toggle event is similar to the one
for defining an event group. Use the perf syscall with:

flags - PERF_FLAG_TOGGLE_ON or PERF_FLAG_TOGGLE_OFF
group_fd - event (or group) fd to be toggled

Created event will toggle ON(start) or OFF(stop) the event
specified via group_fd.
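
The flag handling described above can be modelled in plain C. This is only a sketch: the two bit values are the ones this patch proposes (they are not in mainline headers), and the names decode_toggle_flags, toggle_dir, TOGGLE_ON, TOGGLE_OFF and TOGGLE_MASK are hypothetical, chosen for illustration. It mirrors the rule that a toggle event is either ON or OFF, never both.

```c
#include <errno.h>

/* Bit values as proposed by this patch; not present in mainline headers. */
#define TOGGLE_ON   (1U << 3)   /* PERF_FLAG_TOGGLE_ON in the patch */
#define TOGGLE_OFF  (1U << 4)   /* PERF_FLAG_TOGGLE_OFF in the patch */
#define TOGGLE_MASK (TOGGLE_ON | TOGGLE_OFF)

/* Hypothetical stand-in for enum perf_event_toggle_flag. */
enum toggle_dir { DIR_NONE = 0, DIR_ON = 1, DIR_OFF = 2 };

/*
 * Hypothetical helper: stores the toggle direction in *dir and
 * returns 0, or returns -EINVAL when both toggle bits are set.
 * Other flag bits are ignored.
 */
static int decode_toggle_flags(unsigned long flags, enum toggle_dir *dir)
{
	unsigned long t = flags & TOGGLE_MASK;

	if (t == TOGGLE_MASK)
		return -EINVAL;

	if (t == TOGGLE_ON)
		*dir = DIR_ON;
	else if (t == TOGGLE_OFF)
		*dir = DIR_OFF;
	else
		*dir = DIR_NONE;	/* not a toggle event */

	return 0;
}
```

A caller would pass the sys_perf_event_open flags word and treat DIR_NONE as "group_fd keeps its usual group meaning".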

Because group_fd is used to name the toggled event, a toggle
event created this way cannot join a group; it can only be its
own group leader. Joining a group will be possible via the ioctl
coming in the next patch.

Original-patch-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
include/linux/perf_event.h | 9 +++
include/uapi/linux/perf_event.h | 3 +
kernel/events/core.c | 158 ++++++++++++++++++++++++++++++++++++++--
3 files changed, 164 insertions(+), 6 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 866e85c..6ede25c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -289,6 +289,12 @@ struct swevent_hlist {
struct perf_cgroup;
struct ring_buffer;

+enum perf_event_toggle_flag {
+ PERF_TOGGLE_NONE = 0,
+ PERF_TOGGLE_ON = 1,
+ PERF_TOGGLE_OFF = 2,
+};
+
/**
* struct perf_event - performance event kernel representation:
*/
@@ -414,6 +420,9 @@ struct perf_event {
int cgrp_defer_enabled;
#endif

+ struct perf_event *toggled_event;
+ enum perf_event_toggle_flag toggle_flag;
+ int paused;
#endif /* CONFIG_PERF_EVENTS */
};

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ca1d90b..ecb0474 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -694,6 +694,9 @@ enum perf_callchain_context {
#define PERF_FLAG_FD_NO_GROUP (1U << 0)
#define PERF_FLAG_FD_OUTPUT (1U << 1)
#define PERF_FLAG_PID_CGROUP (1U << 2) /* pid=cgroup id, per-cpu mode only */
+#define PERF_FLAG_TOGGLE_ON (1U << 3)
+#define PERF_FLAG_TOGGLE_OFF (1U << 4)
+

union perf_mem_data_src {
__u64 val;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e8674e4..40c792d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -44,6 +44,8 @@

#include <asm/irq_regs.h>

+#define PERF_FLAG_TOGGLE (PERF_FLAG_TOGGLE_ON | PERF_FLAG_TOGGLE_OFF)
+
struct remote_function_call {
struct task_struct *p;
int (*func)(void *info);
@@ -119,7 +121,9 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)

#define PERF_FLAG_ALL (PERF_FLAG_FD_NO_GROUP |\
PERF_FLAG_FD_OUTPUT |\
- PERF_FLAG_PID_CGROUP)
+ PERF_FLAG_PID_CGROUP |\
+ PERF_FLAG_TOGGLE_ON |\
+ PERF_FLAG_TOGGLE_OFF)

/*
* branch priv levels that need permission checks
@@ -1358,6 +1362,25 @@ out:
perf_event__header_size(tmp);
}

+static void put_event(struct perf_event *event);
+
+static void __perf_event_toggle_detach(struct perf_event *event)
+{
+ struct perf_event *toggled_event = event->toggled_event;
+
+ event->toggle_flag = PERF_TOGGLE_NONE;
+ event->overflow_handler = NULL;
+ event->toggled_event = NULL;
+
+ put_event(toggled_event);
+}
+
+static void perf_event_toggle_detach(struct perf_event *event)
+{
+ if (event->toggle_flag > PERF_TOGGLE_NONE)
+ __perf_event_toggle_detach(event);
+}
+
static inline int
event_filter_match(struct perf_event *event)
{
@@ -1646,6 +1669,7 @@ event_sched_in(struct perf_event *event,
struct perf_event_context *ctx)
{
u64 tstamp = perf_event_time(event);
+ int add_flags = PERF_EF_START;

if (event->state <= PERF_EVENT_STATE_OFF)
return 0;
@@ -1665,7 +1689,10 @@ event_sched_in(struct perf_event *event,
*/
smp_wmb();

- if (event->pmu->add(event, PERF_EF_START)) {
+ if (event->paused)
+ add_flags = 0;
+
+ if (event->pmu->add(event, add_flags)) {
event->state = PERF_EVENT_STATE_INACTIVE;
event->oncpu = -1;
return -EAGAIN;
@@ -2723,7 +2750,7 @@ static void perf_adjust_freq_unthr_context(struct perf_event_context *ctx,
event->pmu->start(event, 0);
}

- if (!event->attr.freq || !event->attr.sample_freq)
+ if (!event->attr.freq || !event->attr.sample_freq || event->paused)
continue;

/*
@@ -3240,7 +3267,7 @@ int perf_event_release_kernel(struct perf_event *event)
raw_spin_unlock_irq(&ctx->lock);
perf_remove_from_context(event);
mutex_unlock(&ctx->mutex);
-
+ perf_event_toggle_detach(event);
free_event(event);

return 0;
@@ -5231,6 +5258,72 @@ static void perf_log_throttle(struct perf_event *event, int enable)
}

/*
+ * TODO:
+ * - fix race when interrupting event_sched_in/event_sched_out
+ * - fix race against other toggler
+ * - fix race against other callers of ->stop/start (adjust period/freq)
+ */
+static void perf_event_toggle(struct perf_event *event,
+ enum perf_event_toggle_flag flag)
+{
+ unsigned long flags;
+ bool active;
+
+ /*
+ * Prevent from races against event->add/del through
+ * preempt_schedule_irq() or enable/disable IPIs
+ */
+ local_irq_save(flags);
+
+ /* Could be out of HW counter. */
+ active = event->state == PERF_EVENT_STATE_ACTIVE;
+
+ switch (flag) {
+ case PERF_TOGGLE_ON:
+ if (!event->paused)
+ break;
+ if (active)
+ event->pmu->start(event, PERF_EF_RELOAD);
+ event->paused = false;
+ break;
+ case PERF_TOGGLE_OFF:
+ if (event->paused)
+ break;
+ if (active)
+ event->pmu->stop(event, PERF_EF_UPDATE);
+ event->paused = true;
+ break;
+ case PERF_TOGGLE_NONE:
+ break;
+ }
+
+ local_irq_restore(flags);
+}
+
+static void
+perf_event_toggle_overflow(struct perf_event *event,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ struct perf_event *toggled_event;
+
+ if (!event->toggle_flag)
+ return;
+
+ toggled_event = event->toggled_event;
+
+ if (WARN_ON_ONCE(!toggled_event))
+ return;
+
+ perf_pmu_disable(toggled_event->pmu);
+
+ perf_event_toggle(toggled_event, event->toggle_flag);
+ perf_event_output(event, data, regs);
+
+ perf_pmu_enable(toggled_event->pmu);
+}
+
+/*
* Generic event overflow handling, sampling.
*/

@@ -6887,6 +6980,43 @@ out:
return ret;
}

+static enum perf_event_toggle_flag get_toggle_flag(unsigned long flags)
+{
+ if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE_ON)
+ return PERF_TOGGLE_ON;
+ else if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE_OFF)
+ return PERF_TOGGLE_OFF;
+
+ return PERF_TOGGLE_NONE;
+}
+
+static int
+perf_event_set_toggle(struct perf_event *event,
+ struct perf_event *toggled_event,
+ struct perf_event_context *ctx,
+ unsigned long flags)
+{
+ if (WARN_ON_ONCE(!(flags & PERF_FLAG_TOGGLE)))
+ return -EINVAL;
+
+ /* It's either ON or OFF. */
+ if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE)
+ return -EINVAL;
+
+ /* Allow only same cpu, */
+ if (toggled_event->cpu != event->cpu)
+ return -EINVAL;
+
+ /* or same task. */
+ if (toggled_event->ctx->task != ctx->task)
+ return -EINVAL;
+
+ event->overflow_handler = perf_event_toggle_overflow;
+ event->toggle_flag = get_toggle_flag(flags);
+ event->toggled_event = toggled_event;
+ return 0;
+}
+
/**
* sys_perf_event_open - open a performance event, associate it to a task/cpu
*
@@ -6900,6 +7030,7 @@ SYSCALL_DEFINE5(perf_event_open,
pid_t, pid, int, cpu, int, group_fd, unsigned long, flags)
{
struct perf_event *group_leader = NULL, *output_event = NULL;
+ struct perf_event *toggled_event = NULL;
struct perf_event *event, *sibling;
struct perf_event_attr attr;
struct perf_event_context *ctx;
@@ -6949,7 +7080,9 @@ SYSCALL_DEFINE5(perf_event_open,
group_leader = group.file->private_data;
if (flags & PERF_FLAG_FD_OUTPUT)
output_event = group_leader;
- if (flags & PERF_FLAG_FD_NO_GROUP)
+ if (flags & PERF_FLAG_TOGGLE)
+ toggled_event = group_leader;
+ if (flags & (PERF_FLAG_FD_NO_GROUP|PERF_FLAG_TOGGLE))
group_leader = NULL;
}

@@ -7060,10 +7193,20 @@ SYSCALL_DEFINE5(perf_event_open,
goto err_context;
}

+ if (toggled_event) {
+ err = -EINVAL;
+ if (!atomic_long_inc_not_zero(&toggled_event->refcount))
+ goto err_context;
+
+ err = perf_event_set_toggle(event, toggled_event, ctx, flags);
+ if (err)
+ goto err_toggle;
+ }
+
event_file = anon_inode_getfile("[perf_event]", &perf_fops, event, O_RDWR);
if (IS_ERR(event_file)) {
err = PTR_ERR(event_file);
- goto err_context;
+ goto err_toggle;
}

if (move_group) {
@@ -7131,6 +7274,9 @@ SYSCALL_DEFINE5(perf_event_open,
fd_install(event_fd, event_file);
return event_fd;

+err_toggle:
+ if (toggled_event)
+ put_event(toggled_event);
err_context:
perf_unpin_context(ctx);
put_ctx(ctx);
--
1.7.11.7

2013-09-25 12:56:28

by Jiri Olsa

Subject: [PATCH 04/21] perf: Move event state initialization before/behind the pmu add/del calls

Move the event state update before the pmu->del call
and after the pmu->add call.

This way the toggler can refer to event->state and
make sure the event is really scheduled before
calling pmu->stop/start in case we get interrupted
inside event_sched_in/event_sched_out functions.

Original-patch-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
kernel/events/core.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index dd236b6..e8674e4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1388,14 +1388,16 @@ event_sched_out(struct perf_event *event,
if (event->state != PERF_EVENT_STATE_ACTIVE)
return;

+ event->tstamp_stopped = tstamp;
+ event->oncpu = -1;
event->state = PERF_EVENT_STATE_INACTIVE;
+
+ event->pmu->del(event, 0);
+
if (event->pending_disable) {
event->pending_disable = 0;
event->state = PERF_EVENT_STATE_OFF;
}
- event->tstamp_stopped = tstamp;
- event->pmu->del(event, 0);
- event->oncpu = -1;

if (!is_software_event(event))
cpuctx->active_oncpu--;
@@ -1648,9 +1650,6 @@ event_sched_in(struct perf_event *event,
if (event->state <= PERF_EVENT_STATE_OFF)
return 0;

- event->state = PERF_EVENT_STATE_ACTIVE;
- event->oncpu = smp_processor_id();
-
/*
* Unthrottle events, since we scheduled we might have missed several
* ticks already, also for a heavily scheduling task there is little
@@ -1672,6 +1671,9 @@ event_sched_in(struct perf_event *event,
return -EAGAIN;
}

+ event->state = PERF_EVENT_STATE_ACTIVE;
+ event->oncpu = smp_processor_id();
+
event->tstamp_running += tstamp - event->tstamp_stopped;

perf_set_shadow_time(event, ctx, tstamp);
--
1.7.11.7

2013-09-25 12:51:11

by Jiri Olsa

Subject: [PATCH 02/21] perf tools: Separate sys_perf_event_open call into evsel_open

Separate the sys_perf_event_open call and its setup into
a new evsel_open function.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evsel.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0ce9feb..95590fe 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1000,23 +1000,40 @@ static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
return ret;
}

+static int evsel_open(struct perf_evsel *evsel,
+ struct thread_map *threads, struct cpu_map *cpus,
+ int thread, int cpu)
+{
+ int group_fd, pid = -1;
+ unsigned long flags = 0;
+
+ /* cgroup config */
+ if (evsel->cgrp) {
+ flags = PERF_FLAG_PID_CGROUP;
+ pid = evsel->cgrp->fd;
+ } else
+ pid = threads->map[thread];
+
+ /* group config */
+ group_fd = get_group_fd(evsel, cpu, thread);
+
+ pr_debug2("perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
+ pid, cpus->map[cpu], group_fd, flags);
+
+ return sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
+ group_fd, flags);
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
- int cpu, thread;
- unsigned long flags = 0;
- int pid = -1, err;
+ int cpu, thread, err;
enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;

if (evsel->fd == NULL &&
perf_evsel__alloc_fd(evsel, cpus->nr, threads->nr) < 0)
return -ENOMEM;

- if (evsel->cgrp) {
- flags = PERF_FLAG_PID_CGROUP;
- pid = evsel->cgrp->fd;
- }
-
fallback_missing_features:
if (perf_missing_features.mmap2)
evsel->attr.mmap2 = 0;
@@ -1032,20 +1049,11 @@ retry_sample_id:
for (cpu = 0; cpu < cpus->nr; cpu++) {

for (thread = 0; thread < threads->nr; thread++) {
- int group_fd;

- if (!evsel->cgrp)
- pid = threads->map[thread];
-
- group_fd = get_group_fd(evsel, cpu, thread);
retry_open:
- pr_debug2("perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
- pid, cpus->map[cpu], group_fd, flags);
-
- FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
- pid,
- cpus->map[cpu],
- group_fd, flags);
+ FD(evsel, cpu, thread) = evsel_open(evsel,
+ threads, cpus,
+ thread, cpu);
if (FD(evsel, cpu, thread) < 0) {
err = -errno;
goto try_fallback;
--
1.7.11.7

2013-09-25 19:12:20

by Andi Kleen

Subject: Re: [RFC 00/21] perf tools: Add toggling events support

On Wed, Sep 25, 2013 at 02:50:26PM +0200, Jiri Olsa wrote:
> hi,
> sending *RFC* for toggling events support.
>
> Adding perf interface that allows to create toggle events, which can
> enable or disable another event. Whenever the toggle event is triggered
> (has overflow), it toggles another event state and either starts or
> stops it.
>
> The goal is to be able to create toggling tracepoint events to enable and
> disable HW counters, but the interface is generic enough to be used for
> any kind of event.

Haven't read the patches, but frequent full event switch in/out seems
very expensive. If someone puts that switch on a common
function it would likely disturb things quite a bit.

It would be better to keep counting and just do RDPMC on
the switch points, and then subtract for counting.
For sampling could need a MSR write to enable/disable.
Still somewhat expensive, but nowhere near as bad as a full switch.

Another problem is that it may be very inexact, as
the counting will often happen in the background
and not be very synchronized with the switches.
Not fully sure how big a problem that would be.

-Andi
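
The "keep counting and subtract" idea for the counting case could look roughly like the sketch below. It is an illustration only, not code from the patchset: struct window_count, window_on and window_off are hypothetical names, and the raw counter value is passed in by the caller (for example from RDPMC or read()) rather than read here.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one free-running counter. */
struct window_count {
	uint64_t start;	/* raw counter value at the last "on" point */
	uint64_t total;	/* sum over all closed on..off windows */
	int open;	/* non-zero while inside a window */
};

/* "on" switch point: remember where the window starts. */
static void window_on(struct window_count *w, uint64_t raw)
{
	if (!w->open) {
		w->start = raw;
		w->open = 1;
	}
}

/* "off" switch point: add the delta since the matching "on". */
static void window_off(struct window_count *w, uint64_t raw)
{
	if (w->open) {
		/* unsigned subtraction stays correct modulo 2^64 */
		w->total += raw - w->start;
		w->open = 0;
	}
}
```

Repeated "on" or "off" points are ignored, which matches the idempotent ON/OFF semantics of the toggle proposal. A hardware counter narrower than 64 bits would additionally need its delta masked to the counter width.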

2013-09-25 19:45:23

by Vince Weaver

Subject: Re: [PATCH 06/21] perf: Add event toggle ioctl interface

On Wed, 25 Sep 2013, Jiri Olsa wrote:

> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -325,6 +325,7 @@ struct perf_event_attr {
> #define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
> #define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
> #define PERF_EVENT_IOC_ID _IOR('$', 7, u64 *)
> +#define PERF_EVENT_IOC_SET_TOGGLE _IOW('$', 8, u64 *)

I'm pretty sure this should be __u64 or else it won't compile
in userspace.

Vince
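
A small userspace check of the suggested spelling, as a sketch: it assumes a Linux system with the kernel uapi headers installed, and set_toggle_request is a hypothetical helper name. The request itself exists only in this series, hence the #ifndef guard.

```c
#include <linux/types.h>	/* __u64, visible to userspace */
#include <sys/ioctl.h>		/* _IOW */

/* Proposed by this series only; not defined by mainline headers. */
#ifndef PERF_EVENT_IOC_SET_TOGGLE
#define PERF_EVENT_IOC_SET_TOGGLE _IOW('$', 8, __u64 *)
#endif

/* Hypothetical helper: returns the encoded ioctl request number. */
static unsigned long set_toggle_request(void)
{
	return PERF_EVENT_IOC_SET_TOGGLE;
}
```

The size field encodes sizeof(__u64 *), so on x86_64 the value is 0x40082408, i.e. 1074275336, the same number the toggle-event-group test hard-codes as its fallback definition.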

2013-09-25 22:36:46

by Sukadev Bhattiprolu

Subject: Re: [PATCH 05/21] perf: Add event toggle sys_perf_event_open interface

Jiri Olsa [[email protected]] wrote:
| Adding perf interface that allows to create 'toggle' events,
| which can enable or disable another event. Whenever the toggle
| event is triggered (has overflow), it toggles another event
| state and either starts or stops it.

Nice idea. It would be very useful.

Will try out the patchset, but couple of small comments below.

|
| The goal is to be able to create toggling tracepoint events
| to enable and disable HW counters, but the interface is generic
| enough to be used for any kind of event.
|
| The interface to create a toggle event is similar as the one
| for defining event group. Use perf syscall with:
|
| flags - PERF_FLAG_TOGGLE_ON or PERF_FLAG_TOGGLE_OFF
| group_fd - event (or group) fd to be toggled
|
| Created event will toggle ON(start) or OFF(stop) the event
| specified via group_fd.
|
| Obviously this way it's not possible for toggle event to
| be part of group other than group leader. This will be
| possible via ioctl coming in next patch.
|
| Original-patch-by: Frederic Weisbecker <[email protected]>
| Signed-off-by: Jiri Olsa <[email protected]>
| Cc: Arnaldo Carvalho de Melo <[email protected]>
| Cc: Corey Ashford <[email protected]>
| Cc: Frederic Weisbecker <[email protected]>
| Cc: Ingo Molnar <[email protected]>
| Cc: Paul Mackerras <[email protected]>
| Cc: Peter Zijlstra <[email protected]>
| Cc: Arnaldo Carvalho de Melo <[email protected]>
| ---
| include/linux/perf_event.h | 9 +++
| include/uapi/linux/perf_event.h | 3 +
| kernel/events/core.c | 158 ++++++++++++++++++++++++++++++++++++++--
| 3 files changed, 164 insertions(+), 6 deletions(-)
|
| diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
| index 866e85c..6ede25c 100644
| --- a/include/linux/perf_event.h
| +++ b/include/linux/perf_event.h
| @@ -289,6 +289,12 @@ struct swevent_hlist {
| struct perf_cgroup;
| struct ring_buffer;
|
| +enum perf_event_toggle_flag {
| + PERF_TOGGLE_NONE = 0,
| + PERF_TOGGLE_ON = 1,
| + PERF_TOGGLE_OFF = 2,
| +};

Can we call this 'perf_event_toggle_state'? It can be confused with the
PERF_FLAG_TOGGLE* macros below, which apply to a different field.

| +
| /**
| * struct perf_event - performance event kernel representation:
| */
| @@ -414,6 +420,9 @@ struct perf_event {
| int cgrp_defer_enabled;
| #endif
|
| + struct perf_event *toggled_event;
| + enum perf_event_toggle_flag toggle_flag;

s/toggle_flag/toggle_state/ ?

| + int paused;

There is an 'event->state' field with OFF, INACTIVE, ACTIVE states.
Can we add a 'PAUSED' state to that instead ?


| #endif /* CONFIG_PERF_EVENTS */
| };
|
| diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
| index ca1d90b..ecb0474 100644
| --- a/include/uapi/linux/perf_event.h
| +++ b/include/uapi/linux/perf_event.h
| @@ -694,6 +694,9 @@ enum perf_callchain_context {
| #define PERF_FLAG_FD_NO_GROUP (1U << 0)
| #define PERF_FLAG_FD_OUTPUT (1U << 1)
| #define PERF_FLAG_PID_CGROUP (1U << 2) /* pid=cgroup id, per-cpu mode only */
| +#define PERF_FLAG_TOGGLE_ON (1U << 3)
| +#define PERF_FLAG_TOGGLE_OFF (1U << 4)
| +
|
| union perf_mem_data_src {
| __u64 val;
| diff --git a/kernel/events/core.c b/kernel/events/core.c
| index e8674e4..40c792d 100644
| --- a/kernel/events/core.c
| +++ b/kernel/events/core.c
| @@ -44,6 +44,8 @@
|
| #include <asm/irq_regs.h>
|
| +#define PERF_FLAG_TOGGLE (PERF_FLAG_TOGGLE_ON | PERF_FLAG_TOGGLE_OFF)
| +
| struct remote_function_call {
| struct task_struct *p;
| int (*func)(void *info);
| @@ -119,7 +121,9 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)
|
| #define PERF_FLAG_ALL (PERF_FLAG_FD_NO_GROUP |\
| PERF_FLAG_FD_OUTPUT |\
| - PERF_FLAG_PID_CGROUP)
| + PERF_FLAG_PID_CGROUP |\
| + PERF_FLAG_TOGGLE_ON |\
| + PERF_FLAG_TOGGLE_OFF)
|
| /*
| * branch priv levels that need permission checks
| @@ -1358,6 +1362,25 @@ out:
| perf_event__header_size(tmp);
| }
|
| +static void put_event(struct perf_event *event);
| +
| +static void __perf_event_toggle_detach(struct perf_event *event)
| +{
| + struct perf_event *toggled_event = event->toggled_event;
| +
| + event->toggle_flag = PERF_TOGGLE_NONE;
| + event->overflow_handler = NULL;
| + event->toggled_event = NULL;
| +
| + put_event(toggled_event);
| +}
| +
| +static void perf_event_toggle_detach(struct perf_event *event)
| +{
| + if (event->toggle_flag > PERF_TOGGLE_NONE)
| + __perf_event_toggle_detach(event);
| +}
| +
| static inline int
| event_filter_match(struct perf_event *event)
| {
| @@ -1646,6 +1669,7 @@ event_sched_in(struct perf_event *event,
| struct perf_event_context *ctx)
| {
| u64 tstamp = perf_event_time(event);
| + int add_flags = PERF_EF_START;
|
| if (event->state <= PERF_EVENT_STATE_OFF)
| return 0;
| @@ -1665,7 +1689,10 @@ event_sched_in(struct perf_event *event,
| */
| smp_wmb();
|
| - if (event->pmu->add(event, PERF_EF_START)) {
| + if (event->paused)
| + add_flags = 0;
| +
| + if (event->pmu->add(event, add_flags)) {
| event->state = PERF_EVENT_STATE_INACTIVE;
| event->oncpu = -1;
| return -EAGAIN;
| @@ -2723,7 +2750,7 @@ static void perf_adjust_freq_unthr_context(struct perf_event_context *ctx,
| event->pmu->start(event, 0);
| }
|
| - if (!event->attr.freq || !event->attr.sample_freq)
| + if (!event->attr.freq || !event->attr.sample_freq || event->paused)
| continue;
|
| /*
| @@ -3240,7 +3267,7 @@ int perf_event_release_kernel(struct perf_event *event)
| raw_spin_unlock_irq(&ctx->lock);
| perf_remove_from_context(event);
| mutex_unlock(&ctx->mutex);
| -
| + perf_event_toggle_detach(event);
| free_event(event);
|
| return 0;
| @@ -5231,6 +5258,72 @@ static void perf_log_throttle(struct perf_event *event, int enable)
| }
|
| /*
| + * TODO:
| + * - fix race when interrupting event_sched_in/event_sched_out
| + * - fix race against other toggler
| + * - fix race against other callers of ->stop/start (adjust period/freq)
| + */
| +static void perf_event_toggle(struct perf_event *event,
| + enum perf_event_toggle_flag flag)
| +{
| + unsigned long flags;
| + bool active;
| +
| + /*
| + * Prevent from races against event->add/del through
| + * preempt_schedule_irq() or enable/disable IPIs
| + */
| + local_irq_save(flags);
| +
| + /* Could be out of HW counter. */
| + active = event->state == PERF_EVENT_STATE_ACTIVE;
| +
| + switch (flag) {
| + case PERF_TOGGLE_ON:
| + if (!event->paused)
| + break;
| + if (active)
| + event->pmu->start(event, PERF_EF_RELOAD);
| + event->paused = false;
| + break;
| + case PERF_TOGGLE_OFF:
| + if (event->paused)
| + break;
| + if (active)
| + event->pmu->stop(event, PERF_EF_UPDATE);
| + event->paused = true;
| + break;
| + case PERF_TOGGLE_NONE:
| + break;
| + }
| +
| + local_irq_restore(flags);
| +}
| +
| +static void
| +perf_event_toggle_overflow(struct perf_event *event,
| + struct perf_sample_data *data,
| + struct pt_regs *regs)
| +{
| + struct perf_event *toggled_event;
| +
| + if (!event->toggle_flag)
| + return;
| +
| + toggled_event = event->toggled_event;
| +
| + if (WARN_ON_ONCE(!toggled_event))
| + return;
| +
| + perf_pmu_disable(toggled_event->pmu);
| +
| + perf_event_toggle(toggled_event, event->toggle_flag);
| + perf_event_output(event, data, regs);
| +
| + perf_pmu_enable(toggled_event->pmu);
| +}
| +
| +/*
| * Generic event overflow handling, sampling.
| */
|
| @@ -6887,6 +6980,43 @@ out:
| return ret;
| }
|
| +static enum perf_event_toggle_flag get_toggle_flag(unsigned long flags)
| +{
| + if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE_ON)
| + return PERF_TOGGLE_ON;
| + else if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE_OFF)
| + return PERF_TOGGLE_OFF;
| +
| + return PERF_TOGGLE_NONE;
| +}
| +
| +static int
| +perf_event_set_toggle(struct perf_event *event,
| + struct perf_event *toggled_event,
| + struct perf_event_context *ctx,
| + unsigned long flags)
| +{
| + if (WARN_ON_ONCE(!(flags & PERF_FLAG_TOGGLE)))
| + return -EINVAL;
| +
| + /* It's either ON or OFF. */
| + if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE)
| + return -EINVAL;
| +
| + /* Allow only same cpu, */
| + if (toggled_event->cpu != event->cpu)
| + return -EINVAL;
| +
| + /* or same task. */

nit: s/or/and/

| + if (toggled_event->ctx->task != ctx->task)
| + return -EINVAL;
| +
| + event->overflow_handler = perf_event_toggle_overflow;
| + event->toggle_flag = get_toggle_flag(flags);
| + event->toggled_event = toggled_event;
| + return 0;
| +}
| +
| /**
| * sys_perf_event_open - open a performance event, associate it to a task/cpu
| *
| @@ -6900,6 +7030,7 @@ SYSCALL_DEFINE5(perf_event_open,
| pid_t, pid, int, cpu, int, group_fd, unsigned long, flags)
| {
| struct perf_event *group_leader = NULL, *output_event = NULL;
| + struct perf_event *toggled_event = NULL;
| struct perf_event *event, *sibling;
| struct perf_event_attr attr;
| struct perf_event_context *ctx;
| @@ -6949,7 +7080,9 @@ SYSCALL_DEFINE5(perf_event_open,
| group_leader = group.file->private_data;
| if (flags & PERF_FLAG_FD_OUTPUT)
| output_event = group_leader;
| - if (flags & PERF_FLAG_FD_NO_GROUP)
| + if (flags & PERF_FLAG_TOGGLE)
| + toggled_event = group_leader;
| + if (flags & (PERF_FLAG_FD_NO_GROUP|PERF_FLAG_TOGGLE))
| group_leader = NULL;
| }
|
| @@ -7060,10 +7193,20 @@ SYSCALL_DEFINE5(perf_event_open,
| goto err_context;
| }
|
| + if (toggled_event) {
| + err = -EINVAL;
| + if (!atomic_long_inc_not_zero(&toggled_event->refcount))
| + goto err_context;
| +
| + err = perf_event_set_toggle(event, toggled_event, ctx, flags);
| + if (err)
| + goto err_toggle;
| + }
| +
| event_file = anon_inode_getfile("[perf_event]", &perf_fops, event, O_RDWR);
| if (IS_ERR(event_file)) {
| err = PTR_ERR(event_file);
| - goto err_context;
| + goto err_toggle;
| }
|
| if (move_group) {
| @@ -7131,6 +7274,9 @@ SYSCALL_DEFINE5(perf_event_open,
| fd_install(event_fd, event_file);
| return event_fd;
|
| +err_toggle:
| + if (toggled_event)
| + put_event(toggled_event);
| err_context:
| perf_unpin_context(ctx);
| put_ctx(ctx);
| --
| 1.7.11.7
|

2013-09-25 22:42:07

by Sukadev Bhattiprolu

Subject: Re: [PATCH 08/21] perf: Add new 'paused' attribute

Jiri Olsa [[email protected]] wrote:
| Adding new 'paused' perf_event_attr attribute bit. It sets
| the initial event state as paused, so it wont get started
| until it's triggered by toggling mechanism.

There is an attr->disabled field which also leaves the event off by
default. Would some comments distinguishing the two states help?

Sukadev

2013-09-26 07:03:12

by Ingo Molnar

Subject: Re: [RFC 00/21] perf tools: Add toggling events support


* Andi Kleen <[email protected]> wrote:

> On Wed, Sep 25, 2013 at 02:50:26PM +0200, Jiri Olsa wrote:
> > hi,
> > sending *RFC* for toggling events support.
> >
> > Adding perf interface that allows to create toggle events, which can
> > enable or disable another event. Whenever the toggle event is triggered
> > (has overflow), it toggles another event state and either starts or
> > stops it.
> >
> > The goal is to be able to create toggling tracepoint events to enable and
> > disable HW counters, but the interface is generic enough to be used for
> > any kind of event.
>
> Haven't read the patches, but frequent full event switch in/out seems
> very expensive. If someone puts that switch on a common function it
> would likely disturb things quite a bit.
>
> It would be better to keep counting and just do RDPMC on the switch
> points, and then subtract for counting. For sampling could need a MSR
> write to enable/disable. Still somewhat expensive, but nowhere near as
> bad as a full switch.

This is essentially an optimized event switch and should probably be done
on a higher level so that other instances of event/context switching
benefit as well.

Thanks,

Ingo

2013-09-26 11:31:10

by Stephane Eranian

Subject: Re: [RFC 00/21] perf tools: Add toggling events support

Jiri,

On Wed, Sep 25, 2013 at 2:50 PM, Jiri Olsa <[email protected]> wrote:
> hi,
> sending *RFC* for toggling events support.
>
> Adding perf interface that allows to create toggle events, which can
> enable or disable another event. Whenever the toggle event is triggered
> (has overflow), it toggles another event state and either starts or
> stops it.
>
> The goal is to be able to create toggling tracepoint events to enable and
> disable HW counters, but the interface is generic enough to be used for
> any kind of event.
>
> It's based on the Frederic's patchset:
> https://lkml.org/lkml/2011/3/14/346
>
> Most of the changelogs info is on wiki:
> https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event
>
> In a nutshell:
> The interface is added to the sys_perf_event_open syscall
> and new ioctl was added for completeness, check:
> perf: Add event toggle sys_perf_event_open interface
> perf: Add event toggle ioctl interface
>
> The perf tool interface is pretty rough at the moment. We use
> 'on' and 'off' terms to specify the toggling event, like:
> -e 'cycles,irq_entry/on=cycles/,irq_exit/off=cycles/'
>
> Meaning:
> - irq_entry toggles on (starts) cycles, and irq_exit toggled off (stops) cycles.
> - cycles is started as paused
>
> Looking forward to some ideas for better interface in here ;-)
>
> The patchset is available at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/core_toggle
>
> thanks for comments,

Such an interface is indeed desirable. I received several requests for
this feature internally and externally.
I think it would need to be generalized for user-level code also: same
user API, different trigger points.
I would assume they would be uprobes instead of tracepoints.

And I agree with you, the current cmdline interface is not good
enough. How about something more aligned with the current syntax:
-e cpu/event=0x3c,trigger_on=irq_entry,trigger_off=irq_exit/,

Thanks.

> Example:
> Define toggle(on/off) events:
> # perf probe -a fork_entry=do_fork
> # perf probe -a fork_exit=do_fork%return
>
> Following record session samples only within do_fork function:
> # perf record -g -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
> perf bench sched messaging
>
> Following stat session measure cycles within do_fork function:
> # perf stat -e '{cycles,cache-misses}:k,probe:fork_entry/on=cycles/,probe:fork_exit/off=cycles/' \
> perf bench sched messaging
>
> # Running sched/messaging benchmark...
> # 20 sender and receiver processes per group
> # 1 groups == 40 processes run
>
> Total time: 0.073 [sec]
>
> Performance counter stats for './perf bench sched messaging -g 1':
>
> 20,935,464 cycles # 0.000 GHz
> 18,897 cache-misses
> 40 probe:fork_entry
> 40 probe:fork_exit
>
> 0.086319682 seconds time elapsed
>
> Example:
> Measure interrupts cycles:
> # ./perf stat -e 'cycles,cycles/name=cycles_irq/,irq:irq_handler_entry/on=cycles_irq/,irq:irq_handler_exit/off=cycles_irq/' -a sleep 10
>
> Performance counter stats for 'sleep 10':
>
> 50,680,084,994 cycles # 0.000 GHz [100.00%]
> 652,690 cycles_irq # 0.000 GHz
> 33 irq:irq_handler_entry [100.00%]
> 33 irq:irq_handler_exit
>
> 10.002084400 seconds time elapsed
>
> Check uprobes example at:
> https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event#Example_-_using_u.28ret.29probes
>
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Corey Ashford <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Don Zickus <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Adrian Hunter <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> ---
> Frederic Weisbecker (2):
> perf: Be more specific on pmu related event init naming
> perf: Split allocation and initialization code
>
> Jiri Olsa (19):
> perf tools: Introduce perf_evlist__wait_workload function
> perf tools: Separate sys_perf_event_open call into evsel_open
> perf x86: Update event count properly for read syscall
> perf: Move event state initialization before/behind the pmu add/del calls
> perf: Add event toggle sys_perf_event_open interface
> perf: Add event toggle ioctl interface
> perf: Toggle whole group in toggle event overflow
> perf: Add new 'paused' attribute
> perf: Account toggle masters for toggled event
> perf: Support event inheritance for toggle feature
> perf tests: Adding event simple toggling test
> perf tests: Adding event group toggling test
> perf tests: Adding event inherit toggling test
> perf tools: Allow numeric event to change name via name term
> perf tools: Add event_config_optional parsing rule
> perf tools: Rename term related parsing function/variable properly
> perf tools: Carry term string value for symbols events
> perf tools: Add support to parse event on/off toggle terms
> perf tools: Add record/stat support for toggling events
>
> arch/x86/kernel/cpu/perf_event.c | 6 +-
> include/linux/perf_event.h | 12 +++
> include/uapi/linux/perf_event.h | 7 +-
> kernel/events/core.c | 396 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
> tools/perf/Makefile | 6 ++
> tools/perf/arch/x86/tests/toggle-event-raw-64.S | 28 ++++++
> tools/perf/builtin-record.c | 7 ++
> tools/perf/builtin-stat.c | 12 +++
> tools/perf/tests/builtin-test.c | 12 +++
> tools/perf/tests/perf-record.c | 1 +
> tools/perf/tests/task-exit.c | 5 ++
> tools/perf/tests/tests.h | 3 +
> tools/perf/tests/toggle-event-group.c | 195 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/tests/toggle-event-inherit.c | 132 ++++++++++++++++++++++++++++
> tools/perf/tests/toggle-event-raw.c | 106 ++++++++++++++++++++++
> tools/perf/util/evlist.c | 97 +++++++++++++++++++++
> tools/perf/util/evlist.h | 3 +
> tools/perf/util/evsel.c | 53 ++++++-----
> tools/perf/util/evsel.h | 4 +
> tools/perf/util/parse-events.c | 131 +++++++++++++++++++---------
> tools/perf/util/parse-events.h | 9 +-
> tools/perf/util/parse-events.l | 6 +-
> tools/perf/util/parse-events.y | 68 +++++++++------
> tools/perf/util/record.c | 2 +
> 24 files changed, 1167 insertions(+), 134 deletions(-)
> create mode 100644 tools/perf/arch/x86/tests/toggle-event-raw-64.S
> create mode 100644 tools/perf/tests/toggle-event-group.c
> create mode 100644 tools/perf/tests/toggle-event-inherit.c
> create mode 100644 tools/perf/tests/toggle-event-raw.c

2013-09-26 12:12:21

by Jiri Olsa

Subject: Re: [RFC 00/21] perf tools: Add toggling events support

On Wed, Sep 25, 2013 at 12:12:16PM -0700, Andi Kleen wrote:
> On Wed, Sep 25, 2013 at 02:50:26PM +0200, Jiri Olsa wrote:
> > hi,
> > sending *RFC* for toggling events support.
> >
> > Adding perf interface that allows to create toggle events, which can
> > enable or disable another event. Whenever the toggle event is triggered
> > (has overflow), it toggles another event state and either starts or
> > stops it.
> >
> > The goal is to be able to create toggling tracepoint events to enable and
> > disable HW counters, but the interface is generic enough to be used for
> > any kind of event.
>
> Haven't read the patches, but frequent full event switch in/out seems
> very expensive. If someone puts that switch on a common
> function it would likely disturb things quite a bit.

We don't do a full sched in/out. The toggled event
is scheduled in the 'paused' state, which means that
it's not started. Once the trigger is hit, pmu
start/stop is executed.

>
> It would be better to keep counting and just do RDPMC on
> the switch points, and then subtract for counting.
> For sampling could need a MSR write to enable/disable.
> Still somewhat expensive, but nowhere near as bad as a full switch.

I'll check on that

>
> Another problem is that it may be very inexact, as
> the counting will often happen in the background
> and not be very synchronized with the switches.
> Not fully sure how big a problem that would be.

the toggling overflow function (perf_event_toggle_overflow)
does the following:

- disable the pmu of the toggled event
- start/stop the toggled event
- store a sample for the trigger event
- enable the pmu of the toggled event

so the overhead (extra count) is:
- the code from pmu enable until the return from the overflow processing
- the trigger event's overflow processing up to the pmu disable code

and no overhead for user space events ;-)

thanks,
jirka
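
The start/stop step in the list above can be modeled in a few lines. This is a simplified userspace sketch, not the patch itself: `struct model_event` and `model_toggle()` are hypothetical stand-ins for `struct perf_event` and `perf_event_toggle()`, and the PMU start/stop calls are reduced to flipping a `counting` flag.

```c
#include <stdbool.h>

enum toggle_state { TOGGLE_NONE, TOGGLE_ON, TOGGLE_OFF };

/* Hypothetical stand-in for the toggled event. */
struct model_event {
	bool active;	/* scheduled in on a counter */
	bool paused;	/* not allowed to count until toggled on */
	bool counting;	/* models the effect of pmu start/stop */
};

/* Only touch the (modeled) PMU when the event is currently active. */
void model_toggle(struct model_event *ev, enum toggle_state st)
{
	switch (st) {
	case TOGGLE_ON:
		if (!ev->paused)
			break;		/* already running, nothing to do */
		if (ev->active)
			ev->counting = true;
		ev->paused = false;
		break;
	case TOGGLE_OFF:
		if (ev->paused)
			break;		/* already paused */
		if (ev->active)
			ev->counting = false;
		ev->paused = true;
		break;
	case TOGGLE_NONE:
		break;
	}
}
```

In this model a toggle on an event that is not scheduled in only updates the `paused` flag, so the new state takes effect the next time the event gets a counter.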

2013-09-26 12:21:16

by Jiri Olsa

Subject: Re: [RFC 00/21] perf tools: Add toggling events support

On Thu, Sep 26, 2013 at 01:31:06PM +0200, Stephane Eranian wrote:
> Jiri,
>
> On Wed, Sep 25, 2013 at 2:50 PM, Jiri Olsa <[email protected]> wrote:
> > hi,
> > sending *RFC* for toggling events support.
> >
> > Adding perf interface that allows to create toggle events, which can
> > enable or disable another event. Whenever the toggle event is triggered
> > (has overflow), it toggles another event state and either starts or
> > stops it.
> >
> > The goal is to be able to create toggling tracepoint events to enable and
> > disable HW counters, but the interface is generic enough to be used for
> > any kind of event.
> >
> > It's based on the Frederic's patchset:
> > https://lkml.org/lkml/2011/3/14/346
> >
> > Most of the changelogs info is on wiki:
> > https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event
> >
> > In a nutshell:
> > The interface is added to the sys_perf_event_open syscall
> > and new ioctl was added for completeness, check:
> > perf: Add event toggle sys_perf_event_open interface
> > perf: Add event toggle ioctl interface
> >
> > The perf tool interface is pretty rough at the moment. We use
> > 'on' and 'off' terms to specify the toggling event, like:
> > -e 'cycles,irq_entry/on=cycles/,irq_exit/off=cycles/'
> >
> > Meaning:
> > - irq_entry toggles on (starts) cycles, and irq_exit toggled off (stops) cycles.
> > - cycles is started as paused
> >
> > Looking forward to some ideas for better interface in here ;-)
> >
> > The patchset is available at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > perf/core_toggle
> >
> > thanks for comments,
>
> Such interface is indeed desirable. I received several requests for
> this feature internally and externally.
> I think it would need to be generalized for user-level code also, same
> user API, different trigger points.
> I would assume they would be uprobes instead of tracepoints.

not sure what you mean, this code works with uprobes:
https://perf.wiki.kernel.org/index.php/Jolsa_Features_Togle_Event#Example_-_using_u.28ret.29probes

>
> And I agree with you, the current cmdline interface is not good
> enough, how about something more
> aligned with the current syntax:
> -e cpu/event=0x3c,trigger_on=irq_entry,trigger_off=irq_exit/,
>

that's basically the opposite of what we have now, your example would be:
-e 'cpu/event=0x3c,name=cycles/,irq_entry/on=cycles/,irq_exit/off=cycles/'

hm, the thing is that you also want to customize trigger events,
for example how would you set a filter for irq_entry/irq_exit
if needed?

maybe allowing an extra reference to the triggers:
-e 'cpu/event=0x3c,trigger_on=irq_entry,trigger_off=irq_exit/' -e 'irq_exit' --filter ...

jirka

2013-09-26 12:24:54

by Jiri Olsa

Subject: Re: [PATCH 08/21] perf: Add new 'paused' attribute

On Wed, Sep 25, 2013 at 03:41:47PM -0700, Sukadev Bhattiprolu wrote:
> Jiri Olsa [[email protected]] wrote:
> | Adding new 'paused' perf_event_attr attribute bit. It sets
> | the initial event state as paused, so it wont get started
> | until it's triggered by toggling mechanism.
>
> There is a attr->disabled field which also leaves the event off by
> default. Some comments distinguishing the two states will help ?

right, so 'disabled' says the event is created as disabled

'paused' says the event is enabled: it gets scheduled in/out
but it does not get started; the toggle event will do that

I'll put more info on that into the doc

thanks,
jirka
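
The difference between the two attributes can be summarized as a small decision function. This is only a sketch of the behaviour described in this message, with hypothetical names (`enum sched_result`, `model_sched_in()`); it does not reproduce the kernel's scheduling code.

```c
#include <stdbool.h>

enum sched_result {
	NOT_SCHEDULED,		/* disabled: never placed on a counter */
	SCHEDULED_STOPPED,	/* paused: on a counter, waiting for a toggle */
	SCHEDULED_RUNNING,	/* neither: counting as soon as scheduled in */
};

/* What happens to an event when the scheduler tries to put it on a PMU. */
enum sched_result model_sched_in(bool disabled, bool paused)
{
	if (disabled)
		return NOT_SCHEDULED;
	return paused ? SCHEDULED_STOPPED : SCHEDULED_RUNNING;
}
```

A paused event therefore still occupies a counter while it waits, which is what makes the later start/stop cheap compared with a full schedule in/out.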

2013-09-26 12:27:21

by Jiri Olsa

Subject: Re: [PATCH 05/21] perf: Add event toggle sys_perf_event_open interface

On Wed, Sep 25, 2013 at 03:36:29PM -0700, Sukadev Bhattiprolu wrote:
> Jiri Olsa [[email protected]] wrote:

SNIP

> | diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> | index 866e85c..6ede25c 100644
> | --- a/include/linux/perf_event.h
> | +++ b/include/linux/perf_event.h
> | @@ -289,6 +289,12 @@ struct swevent_hlist {
> | struct perf_cgroup;
> | struct ring_buffer;
> |
> | +enum perf_event_toggle_flag {
> | + PERF_TOGGLE_NONE = 0,
> | + PERF_TOGGLE_ON = 1,
> | + PERF_TOGGLE_OFF = 2,
> | +};
>
> Can we call this 'perf_event_toggle_state' ? it can be confusing with
> PERF_FLAG_TOGGLE* macros below which apply to a different field.

right, 'state' is probably better

>
> | +
> | /**
> | * struct perf_event - performance event kernel representation:
> | */
> | @@ -414,6 +420,9 @@ struct perf_event {
> | int cgrp_defer_enabled;
> | #endif
> |
> | + struct perf_event *toggled_event;
> | + enum perf_event_toggle_flag toggle_flag;
>
> s/toggle_flag/toggle_state/ ?
>
> | + int paused;
>
> There is an 'event->state' field with OFF, INACTIVE, ACTIVE states.
> Can we add a 'PAUSED' state to that instead ?

good idea, I think that's possible

>
>
> | #endif /* CONFIG_PERF_EVENTS */
> | };
> |
> | diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> | index ca1d90b..ecb0474 100644
> | --- a/include/uapi/linux/perf_event.h
> | +++ b/include/uapi/linux/perf_event.h
> | @@ -694,6 +694,9 @@ enum perf_callchain_context {
> | #define PERF_FLAG_FD_NO_GROUP (1U << 0)
> | #define PERF_FLAG_FD_OUTPUT (1U << 1)
> | #define PERF_FLAG_PID_CGROUP (1U << 2) /* pid=cgroup id, per-cpu mode only */
> | +#define PERF_FLAG_TOGGLE_ON (1U << 3)
> | +#define PERF_FLAG_TOGGLE_OFF (1U << 4)
> | +
> |

SNIP

> | + /* It's either ON or OFF. */
> | + if ((flags & PERF_FLAG_TOGGLE) == PERF_FLAG_TOGGLE)
> | + return -EINVAL;
> | +
> | + /* Allow only same cpu, */
> | + if (toggled_event->cpu != event->cpu)
> | + return -EINVAL;
> | +
> | + /* or same task. */
>
> nit: s/or/and/
>

ok

thanks,
jirka
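
The checks discussed in this review (exactly one of ON/OFF, and same cpu and same task) can be sketched as a standalone function. The `EX_`-prefixed names and `ex_check_toggle()` are hypothetical. The flag values copy the proposed bits only for illustration, and the cpu/task comparisons are reduced to two booleans.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative copies of the proposed flag bits. */
#define EX_FLAG_TOGGLE_ON	(1U << 3)
#define EX_FLAG_TOGGLE_OFF	(1U << 4)
#define EX_FLAG_TOGGLE		(EX_FLAG_TOGGLE_ON | EX_FLAG_TOGGLE_OFF)

enum ex_toggle_state {
	EX_TOGGLE_NONE = 0,
	EX_TOGGLE_ON   = 1,
	EX_TOGGLE_OFF  = 2,
};

/*
 * Hypothetical helper: validates a toggle request and decodes its state.
 * Returns 0 on success, -EINVAL if the flags carry neither or both
 * toggle bits, or if the two events differ in cpu or in task.
 */
int ex_check_toggle(unsigned long flags, bool same_cpu, bool same_task,
		    enum ex_toggle_state *state)
{
	unsigned long t = flags & EX_FLAG_TOGGLE;

	/* It's either ON or OFF, never both and never neither. */
	if (t == 0 || t == EX_FLAG_TOGGLE)
		return -EINVAL;

	/* Both conditions must hold: same cpu and same task. */
	if (!same_cpu || !same_task)
		return -EINVAL;

	*state = (t == EX_FLAG_TOGGLE_ON) ? EX_TOGGLE_ON : EX_TOGGLE_OFF;
	return 0;
}
```

Keeping the syscall flag bits and the decoded enum as separate types is the distinction the suggested rename to 'state' is meant to make visible.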

2013-09-26 12:31:18

by Jiri Olsa

Subject: Re: [PATCH 06/21] perf: Add event toggle ioctl interface

On Wed, Sep 25, 2013 at 03:46:58PM -0400, Vince Weaver wrote:
> On Wed, 25 Sep 2013, Jiri Olsa wrote:
>
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -325,6 +325,7 @@ struct perf_event_attr {
> > #define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
> > #define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
> > #define PERF_EVENT_IOC_ID _IOR('$', 7, u64 *)
> > +#define PERF_EVENT_IOC_SET_TOGGLE _IOW('$', 8, u64 *)
>
> I'm pretty sure this should be __u64 or else it won't compile
> in userspace.

hum, we define u64 in perf 'util/types.h', that's why I missed that..
I'll send a fix for the PERF_EVENT_IOC_ID as well

thanks,
jirka

2013-09-26 13:02:23

by Vince Weaver

Subject: Re: [PATCH 06/21] perf: Add event toggle ioctl interface

On Thu, 26 Sep 2013, Jiri Olsa wrote:

> On Wed, Sep 25, 2013 at 03:46:58PM -0400, Vince Weaver wrote:
> > On Wed, 25 Sep 2013, Jiri Olsa wrote:
> >
> > > --- a/include/uapi/linux/perf_event.h
> > > +++ b/include/uapi/linux/perf_event.h
> > > @@ -325,6 +325,7 @@ struct perf_event_attr {
> > > #define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
> > > #define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
> > > #define PERF_EVENT_IOC_ID _IOR('$', 7, u64 *)
> > > +#define PERF_EVENT_IOC_SET_TOGGLE _IOW('$', 8, u64 *)
> >
> > I'm pretty sure this should be __u64 or else it won't compile
> > in userspace.
>
> hum, we define u64 in perf 'util/types.h', that's why I missed that..
> I'll send a fix for the PERF_EVENT_IOC_ID as well

I sent in a fix for that recently and I think it's upstream already.

That's why I noticed the issue when reviewing the perf_event.h changes.

Vince

2013-09-26 15:45:22

by Andi Kleen

Subject: Re: [RFC 00/21] perf tools: Add toggling events support

> > It would be better to keep counting and just do RDPMC on the switch
> > points, and then subtract for counting. For sampling could need a MSR
> > write to enable/disable. Still somewhat expensive, but nowhere near as
> > bad as a full switch.
>
> This is essentially an optimized event switch and should probably be done
> on a higher level so that other instances of event/context switching
> benefit as well.

Ok. Need to be a bit careful and it cannot be done in all cases.

For non-global counters this only works if RDPMC is disabled, otherwise
it leaks counter information between unrelated processes, which
may be a security problem.

-Andi