From: Robert Richter <[email protected]>
This patch set implements out of the box support of perf tools for
persistent events. For this the kernel must provide necessary
information about existing persistent events via sysfs to userland.
Persistent events are provided by the kernel with readonly event
buffers. To allow the independent usage of the event buffers by any
process without limiting other processes, multiple users of a single
event must be handled too.
The basic concept is to use a pmu as an event container for persistent
events. The pmu registers events in sysfs and provides format and
event information for the userland. The persistent event framework
requires to add events to the pmu dynamically.
With the information in sysfs userland knows about how to setup the
perf_event attribute of a persistent event. Since a persistent event
always has the persistent flag set, a way is needed to express this in
sysfs. A new syntax is used for this. With 'attr<num>:<mask>' any bit
in the attribute structure may be set in a similar way as using
'config<num>', but <num> is an index that points to the u64 value to
change within the attribute.
For persistent events the persistent flag (bit 23 of flag field in
struct perf_event_attr) needs to be set which is expressed in sysfs
with "attr5:23". E.g. the mce_record event is described in sysfs as
follows:
/sys/bus/event_source/devices/persistent/events/mce_record:persistent,config=106
/sys/bus/event_source/devices/persistent/format/persistent:attr5:23
Note that perf tools need to support the 'attr<num>' syntax that is
added in a separate patch set. With it we are able to run perf tool
commands to read persistent events, e.g.:
# perf record -e persistent/mce_record/ sleep 10
# perf top -e persistent/mce_record/
In general the new syntax is flexible to describe with sysfs any event
to be setup by perf tools.
First patches contain also fixes and reworks made after reviewing
code. Could be applied separately.
Patches base on Boris' patches which I have rebased to latest
tip/perf/core. All patches can be found here:
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent
-Robert
Robert Richter (16):
perf, persistent: Fix build error for no-tracepoints configs
perf, persistent: Fix attr size
perf, persistent: Setting default buffer size to 512k as in perf tools
perf, persistent: Print error code on failure when adding events
perf, persistent: Return resonable error code
perf, persistent: Return -EACCES if mapped buffers must be readonly
perf, persistent: Rework struct pers_event_desc
perf, persistent: Remove rb_put()
perf, persistent: Introduce get_persistent_event()
perf, persistent: Reworking perf_get_persistent_event_fd()
perf, persistent: Protect event lists with mutex
perf, persistent: Avoid adding identical events
perf, persistent: Implementing a persistent pmu
perf, persistent: Name each persistent event
perf, persistent: Exposing persistent events using sysfs
perf, persistent: Allow multiple users for an event
arch/x86/kernel/cpu/mcheck/mce.c | 7 +-
include/linux/perf_event.h | 11 +-
kernel/events/core.c | 6 +-
kernel/events/internal.h | 1 -
kernel/events/persistent.c | 292 +++++++++++++++++++++++++++++----------
5 files changed, 229 insertions(+), 88 deletions(-)
--
1.8.1.1
From: Robert Richter <[email protected]>
The mce_record tracepoint needs tracepoints to be enabled. Fixing
build error for no-tracepoints configs.
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 0ecf4a2..d421937 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1987,6 +1987,8 @@ int __init mcheck_init(void)
return 0;
}
+#ifdef CONFIG_TRACEPOINTS
+
int __init mcheck_init_tp(void)
{
if (perf_add_persistent_event_by_id(event_mce_record.event.type)) {
@@ -2001,6 +2003,8 @@ int __init mcheck_init_tp(void)
*/
fs_initcall_sync(mcheck_init_tp);
+#endif /* CONFIG_TRACEPOINTS */
+
/*
* mce_syscore: PM support
*/
--
1.8.1.1
From: Robert Richter <[email protected]>
Returning -ENODEV if no file descriptor is found. An error code of -1
(-EPERM) is misleading in this case.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 6612eb77..190aa75 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -165,7 +165,7 @@ int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
if (desc->attr->config == attr->config)
return __alloc_persistent_event_fd(desc);
- return -1;
+ return -ENODEV;
}
--
1.8.1.1
From: Robert Richter <[email protected]>
mmap should return EACCES error if fd can not be opened writable.
This error code also helps userland to map buffers readonly on
failure.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0730f36..a9b6470 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3761,7 +3761,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
return -EINVAL;
if (event->attr.persistent && (vma->vm_flags & VM_WRITE))
- return -EINVAL;
+ return -EACCES;
vma_size = vma->vm_end - vma->vm_start;
nr_pages = (vma_size / PAGE_SIZE) - 1;
--
1.8.1.1
From: Robert Richter <[email protected]>
Struct pers_event_desc is only used in kernel/events/persistent.c.
Moving it there. Also, removing attr member as this is a copy of
event->attr.
Signed-off-by: Robert Richter <[email protected]>
---
include/linux/perf_event.h | 7 -------
kernel/events/persistent.c | 12 ++++++++----
2 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d2a42b7..dc72c93 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -518,13 +518,6 @@ struct perf_output_handle {
int page;
};
-struct pers_event_desc {
- struct perf_event_attr *attr;
- struct perf_event *event;
- struct list_head plist;
- int fd;
-};
-
#ifdef CONFIG_PERF_EVENTS
extern int perf_pmu_register(struct pmu *pmu, char *name, int type);
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 190aa75..22297e5 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -8,6 +8,12 @@
/* 512 kiB: default perf tools memory size, see perf_evlist__mmap() */
#define CPU_BUFFER_NR_PAGES ((512 * 1024) / PAGE_SIZE)
+struct pers_event_desc {
+ struct perf_event *event;
+ struct list_head plist;
+ int fd;
+};
+
static DEFINE_PER_CPU(struct list_head, pers_events);
static struct perf_event *
@@ -33,7 +39,6 @@ add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
rcu_assign_pointer(event->rb, buf);
desc->event = event;
- desc->attr = attr;
INIT_LIST_HEAD(&desc->plist);
list_add_tail(&desc->plist, &per_cpu(pers_events, cpu));
@@ -59,7 +64,7 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
struct perf_event *event = NULL;
list_for_each_entry_safe(desc, tmp, &per_cpu(pers_events, cpu), plist) {
- if (desc->attr->config == attr->config) {
+ if (desc->event->attr.config == attr->config) {
event = desc->event;
break;
}
@@ -78,7 +83,6 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
perf_event_release_kernel(event);
put_unused_fd(desc->fd);
- kfree(desc->attr);
kfree(desc);
}
@@ -162,7 +166,7 @@ int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
struct pers_event_desc *desc;
list_for_each_entry(desc, &per_cpu(pers_events, cpu), plist)
- if (desc->attr->config == attr->config)
+ if (desc->event->attr.config == attr->config)
return __alloc_persistent_event_fd(desc);
return -ENODEV;
--
1.8.1.1
From: Robert Richter <[email protected]>
rb_put() is called already in perf_event_release_kernel(), so no need
to do the same in del_persistent_event(). We also don't need it in
add_persistent_event_on_cpu() after a rework. Since there are no users
of rb_put() anymore, we can make it private again to only
events/core.c.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/core.c | 4 +++-
kernel/events/internal.h | 1 -
kernel/events/persistent.c | 29 +++++++++++------------------
3 files changed, 14 insertions(+), 20 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a9b6470..8f85caa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3021,6 +3021,8 @@ static void free_event_rcu(struct rcu_head *head)
kfree(event);
}
+static void rb_put(struct ring_buffer *rb);
+
static void free_event(struct perf_event *event)
{
irq_work_sync(&event->pending);
@@ -3680,7 +3682,7 @@ static struct ring_buffer *rb_get(struct perf_event *event)
return rb;
}
-void rb_put(struct ring_buffer *rb)
+static void rb_put(struct ring_buffer *rb)
{
struct perf_event *event, *n;
unsigned long flags;
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 3b481be..3647289 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -38,7 +38,6 @@ struct ring_buffer {
extern void rb_free(struct ring_buffer *rb);
extern struct ring_buffer *
rb_alloc(int nr_pages, long watermark, int cpu, int flags);
-extern void rb_put(struct ring_buffer *rb);
extern void perf_event_wakeup(struct perf_event *event);
extern void
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 22297e5..ff1ce3b 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -20,22 +20,22 @@ static struct perf_event *
add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
unsigned nr_pages)
{
- struct perf_event *event = ERR_PTR(-ENOMEM);
+ struct perf_event *event;
struct pers_event_desc *desc;
struct ring_buffer *buf;
desc = kzalloc(sizeof(*desc), GFP_KERNEL);
if (!desc)
- goto out;
-
- buf = rb_alloc(nr_pages, 0, cpu, 0);
- if (!buf)
- goto err_rb;
+ return ERR_PTR(-ENOMEM);
event = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
if (IS_ERR(event))
goto err_event;
+ buf = rb_alloc(nr_pages, 0, cpu, 0);
+ if (!buf)
+ goto err_rb;
+
rcu_assign_pointer(event->rb, buf);
desc->event = event;
@@ -47,14 +47,12 @@ add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
perf_event_enable(event);
goto out;
-
- err_event:
- rb_put(buf);
-
- err_rb:
+err_rb:
+ perf_event_release_kernel(event);
+ event = ERR_PTR(-ENOMEM);
+err_event:
kfree(desc);
-
- out:
+out:
return event;
}
@@ -76,11 +74,6 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
list_del(&desc->plist);
perf_event_disable(event);
- if (event->rb) {
- rb_put(event->rb);
- rcu_assign_pointer(event->rb, NULL);
- }
-
perf_event_release_kernel(event);
put_unused_fd(desc->fd);
kfree(desc);
--
1.8.1.1
From: Robert Richter <[email protected]>
Usually a fd close leads to the release of the event too. For
persistent events this is different as the events should be
permanently enabled in the system. Using reference counting to avoid
releasing an event during a fd close. This also allows it to have
multiple users (open file descriptors) for a single persistent event.
While at this, we don't need desc->fd any longer. The fd is attached
to a task and reference counting keeps the event. Removing desc->fd.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 46 ++++++++++++++++++++++++++++++++++++----------
1 file changed, 36 insertions(+), 10 deletions(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index a764144..4920702 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -11,7 +11,6 @@
struct pers_event_desc {
struct perf_event *event;
struct list_head plist;
- int fd;
};
struct pers_event {
@@ -88,6 +87,18 @@ out:
return event;
}
+static void detach_persistent_event(struct pers_event_desc *desc)
+{
+ list_del(&desc->plist);
+ kfree(desc);
+}
+
+static void release_persistent_event(struct perf_event *event)
+{
+ perf_event_disable(event);
+ perf_event_release_kernel(event);
+}
+
static void del_persistent_event(int cpu, struct perf_event_attr *attr)
{
struct pers_event_desc *desc;
@@ -100,12 +111,14 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
goto out;
event = desc->event;
- list_del(&desc->plist);
-
- perf_event_disable(event);
- perf_event_release_kernel(event);
- put_unused_fd(desc->fd);
- kfree(desc);
+ /*
+ * We primarily want to remove desc from the list. If there
+ * are no open files, the refcount is 0 and we need to release
+ * the event too.
+ */
+ detach_persistent_event(desc);
+ if (atomic_long_dec_and_test(&event->refcount))
+ release_persistent_event(event);
out:
mutex_unlock(&per_cpu(pers_events_lock, cpu));
}
@@ -182,18 +195,31 @@ fail:
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
{
struct pers_event_desc *desc;
+ struct perf_event *event;
int event_fd = -ENODEV;
mutex_lock(&per_cpu(pers_events_lock, cpu));
desc = get_persistent_event(cpu, attr);
- if (!desc)
+
+ /* Increment refcount to keep event on put_event() */
+ if (!desc || !atomic_long_inc_not_zero(&desc->event->refcount))
goto out;
event_fd = anon_inode_getfd("[pers_event]", &perf_fops,
desc->event, O_RDONLY);
- if (event_fd >= 0)
- desc->fd = event_fd;
+
+ if (event_fd < 0) {
+ event = desc->event;
+ if (WARN_ON(atomic_long_dec_and_test(&event->refcount))) {
+ /*
+ * May not happen since decrementing refcount is
+ * protected by pers_events_lock.
+ */
+ detach_persistent_event(desc);
+ release_persistent_event(event);
+ }
+ }
out:
mutex_unlock(&per_cpu(pers_events_lock, cpu));
--
1.8.1.1
From: Robert Richter <[email protected]>
Expose persistent events in the system to userland using sysfs. Perf
tools are able to read existing pmu events from sysfs. Now we use a
persistent pmu as an event container containing all registered
persistent events of the system. This patch adds dynamically
registration of persistent events to sysfs. E.g. something like this:
/sys/bus/event_source/devices/persistent/events/mce_record:persistent,config=106
/sys/bus/event_source/devices/persistent/format/persistent:attr5:23
Perf tools need to support the attr<num> syntax that is added in a
separate patch set. With it we are able to run perf tool commands to
read persistent events, e.g.:
# perf record -e persistent/mce_record/ sleep 10
# perf top -e persistent/mce_record/
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 55 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index d5093a3..a764144 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -17,8 +17,10 @@ struct pers_event_desc {
struct pers_event {
char *name;
struct perf_event_attr attr;
+ struct perf_pmu_events_attr sysfs;
};
+static struct pmu persistent_pmu;
static DEFINE_PER_CPU(struct list_head, pers_events);
static DEFINE_PER_CPU(struct mutex, pers_events_lock);
@@ -137,6 +139,8 @@ unwind:
return PTR_ERR(event);
}
+static int pers_event_sysfs_register(struct pers_event *event);
+
int perf_add_persistent_event_by_id(char* name, int id)
{
struct pers_event *event;
@@ -150,6 +154,8 @@ int perf_add_persistent_event_by_id(char* name, int id)
if (!event->name)
goto fail;
+ event->sysfs.id = id;
+
attr = &event->attr;
attr->sample_period = 1;
attr->wakeup_events = 1;
@@ -163,6 +169,8 @@ int perf_add_persistent_event_by_id(char* name, int id)
if (ret)
goto fail;
+ pers_event_sysfs_register(event);
+
return 0;
fail:
kfree(event->name);
@@ -204,12 +212,57 @@ static struct attribute_group persistent_format_group = {
.attrs = persistent_format_attrs,
};
+#define MAX_EVENTS 16
+
+static struct attribute *persistent_events_attr[MAX_EVENTS + 1] = { };
+
+static struct attribute_group persistent_events_group = {
+ .name = "events",
+ .attrs = persistent_events_attr,
+};
+
static const struct attribute_group *persistent_attr_groups[] = {
&persistent_format_group,
+ NULL, /* placeholder: &persistent_events_group */
NULL,
};
+#define EVENTS_GROUP (persistent_attr_groups[1])
-static struct pmu persistent_pmu;
+static ssize_t pers_event_sysfs_show(struct device *dev,
+ struct device_attribute *__attr, char *page)
+{
+ struct perf_pmu_events_attr *attr =
+ container_of(__attr, struct perf_pmu_events_attr, attr);
+ return sprintf(page, "persistent,config=%lld",
+ (unsigned long long)attr->id);
+}
+
+static int pers_event_sysfs_register(struct pers_event *event)
+{
+ struct device_attribute *attr = &event->sysfs.attr;
+ int idx;
+
+ *attr = (struct device_attribute)__ATTR(, 0444, pers_event_sysfs_show,
+ NULL);
+ attr->attr.name = event->name;
+
+ /* add sysfs attr to events: */
+ for (idx = 0; idx < MAX_EVENTS; idx++) {
+ if (!cmpxchg(persistent_events_attr + idx, NULL, &attr->attr))
+ break;
+ }
+
+ if (idx >= MAX_EVENTS)
+ return -ENOSPC;
+ if (!idx)
+ EVENTS_GROUP = &persistent_events_group;
+ if (!persistent_pmu.dev)
+ return 0; /* sysfs not yet initialized */
+ if (idx)
+ return sysfs_update_group(&persistent_pmu.dev->kobj,
+ EVENTS_GROUP);
+ return sysfs_create_group(&persistent_pmu.dev->kobj, EVENTS_GROUP);
+}
static int persistent_pmu_init(struct perf_event *event)
{
--
1.8.1.1
From: Robert Richter <[email protected]>
For later adding persistent events to sysfs we need a name for each
event. Adding a name to each persistent event.
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
include/linux/perf_event.h | 4 ++--
kernel/events/persistent.c | 30 +++++++++++++++++++++++++-----
3 files changed, 29 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index d421937..833eb7a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1991,7 +1991,8 @@ int __init mcheck_init(void)
int __init mcheck_init_tp(void)
{
- if (perf_add_persistent_event_by_id(event_mce_record.event.type)) {
+ if (perf_add_persistent_event_by_id(event_mce_record.name,
+ event_mce_record.event.type)) {
pr_err("Error adding MCE persistent event.\n");
return -EINVAL;
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index dc72c93..06b4357b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -751,7 +751,7 @@ extern void perf_event_disable(struct perf_event *event);
extern int __perf_event_disable(void *info);
extern void perf_event_task_tick(void);
extern int perf_add_persistent_event(struct perf_event_attr *, unsigned);
-extern int perf_add_persistent_event_by_id(int id);
+extern int perf_add_persistent_event_by_id(char *name, int id);
#else /* !CONFIG_PERF_EVENTS */
static inline void
perf_event_task_sched_in(struct task_struct *prev,
@@ -794,7 +794,7 @@ static inline int __perf_event_disable(void *info) { return -1; }
static inline void perf_event_task_tick(void) { }
static inline int perf_add_persistent_event(struct perf_event_attr *attr,
unsigned nr_pages) { return -EINVAL; }
-static inline int perf_add_persistent_event_by_id(int id) { return -EINVAL; }
+static inline int perf_add_persistent_event_by_id(char *name, int id) { return -EINVAL; }
#endif /* !CONFIG_PERF_EVENTS */
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_NO_HZ_FULL)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 95d6943..d5093a3 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -14,6 +14,11 @@ struct pers_event_desc {
int fd;
};
+struct pers_event {
+ char *name;
+ struct perf_event_attr attr;
+};
+
static DEFINE_PER_CPU(struct list_head, pers_events);
static DEFINE_PER_CPU(struct mutex, pers_events_lock);
@@ -132,14 +137,20 @@ unwind:
return PTR_ERR(event);
}
-int perf_add_persistent_event_by_id(int id)
+int perf_add_persistent_event_by_id(char* name, int id)
{
- struct perf_event_attr *attr;
+ struct pers_event *event;
+ struct perf_event_attr *attr;
+ int ret = -ENOMEM;
- attr = kzalloc(sizeof(*attr), GFP_KERNEL);
- if (!attr)
+ event = kzalloc(sizeof(*event), GFP_KERNEL);
+ if (!event)
return -ENOMEM;
+ event->name = kstrdup(name, GFP_KERNEL);
+ if (!event->name)
+ goto fail;
+ attr = &event->attr;
attr->sample_period = 1;
attr->wakeup_events = 1;
attr->sample_type = PERF_SAMPLE_RAW;
@@ -148,7 +159,16 @@ int perf_add_persistent_event_by_id(int id)
attr->type = PERF_TYPE_TRACEPOINT;
attr->size = sizeof(*attr);
- return perf_add_persistent_event(attr, CPU_BUFFER_NR_PAGES);
+ ret = perf_add_persistent_event(attr, CPU_BUFFER_NR_PAGES);
+ if (ret)
+ goto fail;
+
+ return 0;
+fail:
+ kfree(event->name);
+ kfree(event);
+
+ return ret;
}
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
--
1.8.1.1
From: Robert Richter <[email protected]>
We want to use the kernel's pmu design to later expose persistent
events via sysfs to userland. Initially implement a persistent pmu.
The format syntax is introduced allowing to set bits anywhere in
struct perf_event_attr. This is used in this case to set the
persistent flag (attr5:23). The syntax is attr<num> where num is the
index of the u64 array in struct perf_event_attr. Otherwise syntax is
same as for config<num>.
Patches that implement this functionality for perf tools are sent in a
separate patchset.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 1e93b51..95d6943 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -172,10 +172,45 @@ out:
return event_fd;
}
+PMU_FORMAT_ATTR(persistent, "attr5:23");
+
+static struct attribute *persistent_format_attrs[] = {
+ &format_attr_persistent.attr,
+ NULL,
+};
+
+static struct attribute_group persistent_format_group = {
+ .name = "format",
+ .attrs = persistent_format_attrs,
+};
+
+static const struct attribute_group *persistent_attr_groups[] = {
+ &persistent_format_group,
+ NULL,
+};
+
+static struct pmu persistent_pmu;
+
+static int persistent_pmu_init(struct perf_event *event)
+{
+ if (persistent_pmu.type != event->attr.type)
+ return -ENOENT;
+
+ /* Not a persistent event. */
+ return -EFAULT;
+}
+
+static struct pmu persistent_pmu = {
+ .event_init = persistent_pmu_init,
+ .attr_groups = persistent_attr_groups,
+};
+
void __init persistent_events_init(void)
{
int cpu;
+ perf_pmu_register(&persistent_pmu, "persistent", -1);
+
for_each_possible_cpu(cpu) {
INIT_LIST_HEAD(&per_cpu(pers_events, cpu));
mutex_init(&per_cpu(pers_events_lock, cpu));
--
1.8.1.1
From: Robert Richter <[email protected]>
Check if an event already exists before adding it.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index ebef089..1e93b51 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -40,6 +40,12 @@ add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
mutex_lock(&per_cpu(pers_events_lock, cpu));
+ desc = get_persistent_event(cpu, attr);
+ if (desc) {
+ event = ERR_PTR(-EEXIST);
+ goto out;
+ }
+
desc = kzalloc(sizeof(*desc), GFP_KERNEL);
if (!desc) {
event = ERR_PTR(-ENOMEM);
--
1.8.1.1
From: Robert Richter <[email protected]>
Protect esp. access to struct pers_event_desc *desc. There are race
conditions possible where the descriptor could be removed from list
while it is used.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 4ecbdab..ebef089 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -15,6 +15,7 @@ struct pers_event_desc {
};
static DEFINE_PER_CPU(struct list_head, pers_events);
+static DEFINE_PER_CPU(struct mutex, pers_events_lock);
static struct pers_event_desc
*get_persistent_event(int cpu, struct perf_event_attr *attr)
@@ -37,9 +38,13 @@ add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
struct pers_event_desc *desc;
struct ring_buffer *buf;
+ mutex_lock(&per_cpu(pers_events_lock, cpu));
+
desc = kzalloc(sizeof(*desc), GFP_KERNEL);
- if (!desc)
- return ERR_PTR(-ENOMEM);
+ if (!desc) {
+ event = ERR_PTR(-ENOMEM);
+ goto out;
+ }
event = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
if (IS_ERR(event))
@@ -66,6 +71,7 @@ err_rb:
err_event:
kfree(desc);
out:
+ mutex_unlock(&per_cpu(pers_events_lock, cpu));
return event;
}
@@ -74,9 +80,11 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
struct pers_event_desc *desc;
struct perf_event *event;
+ mutex_lock(&per_cpu(pers_events_lock, cpu));
+
desc = get_persistent_event(cpu, attr);
if (!desc)
- return;
+ goto out;
event = desc->event;
list_del(&desc->plist);
@@ -85,6 +93,8 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
perf_event_release_kernel(event);
put_unused_fd(desc->fd);
kfree(desc);
+out:
+ mutex_unlock(&per_cpu(pers_events_lock, cpu));
}
/*
@@ -137,25 +147,31 @@ int perf_add_persistent_event_by_id(int id)
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
{
- struct pers_event_desc *desc = get_persistent_event(cpu, attr);
- int event_fd;
+ struct pers_event_desc *desc;
+ int event_fd = -ENODEV;
+ mutex_lock(&per_cpu(pers_events_lock, cpu));
+
+ desc = get_persistent_event(cpu, attr);
if (!desc)
- return -ENODEV;
+ goto out;
event_fd = anon_inode_getfd("[pers_event]", &perf_fops,
desc->event, O_RDONLY);
if (event_fd >= 0)
desc->fd = event_fd;
+out:
+ mutex_unlock(&per_cpu(pers_events_lock, cpu));
return event_fd;
}
-
void __init persistent_events_init(void)
{
- int i;
+ int cpu;
- for_each_possible_cpu(i)
- INIT_LIST_HEAD(&per_cpu(pers_events, i));
+ for_each_possible_cpu(cpu) {
+ INIT_LIST_HEAD(&per_cpu(pers_events, cpu));
+ mutex_init(&per_cpu(pers_events_lock, cpu));
+ }
}
--
1.8.1.1
From: Robert Richter <[email protected]>
There is already a function anon_inode_getfd() that does already all
the work. Reworking and simplifying code.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 35 +++++++----------------------------
1 file changed, 7 insertions(+), 28 deletions(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 7d1aff3..4ecbdab 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -87,33 +87,6 @@ static void del_persistent_event(int cpu, struct perf_event_attr *attr)
kfree(desc);
}
-static int __alloc_persistent_event_fd(struct pers_event_desc *desc)
-{
- struct file *event_file = NULL;
- int event_fd = -1;
-
- event_fd = get_unused_fd();
- if (event_fd < 0)
- goto out;
-
- event_file = anon_inode_getfile("[pers_event]", &perf_fops,
- desc->event, O_RDONLY);
- if (IS_ERR(event_file))
- goto err_event_file;
-
- desc->fd = event_fd;
- fd_install(event_fd, event_file);
-
- goto out;
-
-
- err_event_file:
- put_unused_fd(event_fd);
-
- out:
- return event_fd;
-}
-
/*
* Create and enable the persistent version of the perf event described by
* @attr.
@@ -165,11 +138,17 @@ int perf_add_persistent_event_by_id(int id)
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
{
struct pers_event_desc *desc = get_persistent_event(cpu, attr);
+ int event_fd;
if (!desc)
return -ENODEV;
- return __alloc_persistent_event_fd(desc);
+ event_fd = anon_inode_getfd("[pers_event]", &perf_fops,
+ desc->event, O_RDONLY);
+ if (event_fd >= 0)
+ desc->fd = event_fd;
+
+ return event_fd;
}
--
1.8.1.1
From: Robert Richter <[email protected]>
Reducing duplicate code by introducing function get_persistent_event()
to get the descriptor of an persistent event.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index ff1ce3b..7d1aff3 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -16,6 +16,19 @@ struct pers_event_desc {
static DEFINE_PER_CPU(struct list_head, pers_events);
+static struct pers_event_desc
+*get_persistent_event(int cpu, struct perf_event_attr *attr)
+{
+ struct pers_event_desc *desc;
+
+ list_for_each_entry(desc, &per_cpu(pers_events, cpu), plist) {
+ if (desc->event->attr.config == attr->config)
+ return desc;
+ }
+
+ return NULL;
+}
+
static struct perf_event *
add_persistent_event_on_cpu(unsigned int cpu, struct perf_event_attr *attr,
unsigned nr_pages)
@@ -58,18 +71,13 @@ out:
static void del_persistent_event(int cpu, struct perf_event_attr *attr)
{
- struct pers_event_desc *desc, *tmp;
- struct perf_event *event = NULL;
-
- list_for_each_entry_safe(desc, tmp, &per_cpu(pers_events, cpu), plist) {
- if (desc->event->attr.config == attr->config) {
- event = desc->event;
- break;
- }
- }
+ struct pers_event_desc *desc;
+ struct perf_event *event;
- if (!event)
+ desc = get_persistent_event(cpu, attr);
+ if (!desc)
return;
+ event = desc->event;
list_del(&desc->plist);
@@ -156,13 +164,12 @@ int perf_add_persistent_event_by_id(int id)
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
{
- struct pers_event_desc *desc;
+ struct pers_event_desc *desc = get_persistent_event(cpu, attr);
- list_for_each_entry(desc, &per_cpu(pers_events, cpu), plist)
- if (desc->event->attr.config == attr->config)
- return __alloc_persistent_event_fd(desc);
+ if (!desc)
+ return -ENODEV;
- return -ENODEV;
+ return __alloc_persistent_event_fd(desc);
}
--
1.8.1.1
From: Robert Richter <[email protected]>
The default buffer size used to setup event buffers with perf tools is
512k. Using the same buffer size for persistent events. This also
avoids failed mmap calls due to different buffer sizes.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 9075164..1e6c03a 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -5,6 +5,9 @@
#include "internal.h"
+/* 512 kiB: default perf tools memory size, see perf_evlist__mmap() */
+#define CPU_BUFFER_NR_PAGES ((512 * 1024) / PAGE_SIZE)
+
static DEFINE_PER_CPU(struct list_head, pers_events);
static struct perf_event *
@@ -151,7 +154,7 @@ int perf_add_persistent_event_by_id(int id)
attr->type = PERF_TYPE_TRACEPOINT;
attr->size = sizeof(*attr);
- return perf_add_persistent_event(attr, 4);
+ return perf_add_persistent_event(attr, CPU_BUFFER_NR_PAGES);
}
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
--
1.8.1.1
From: Robert Richter <[email protected]>
Improve error reporting.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 1e6c03a..6612eb77 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -123,15 +123,15 @@ int perf_add_persistent_event(struct perf_event_attr *attr, unsigned nr_pages)
for_each_possible_cpu(i) {
event = add_persistent_event_on_cpu(i, attr, nr_pages);
- if (IS_ERR(event)) {
- pr_err("%s: Error adding persistent event on cpu %d\n",
- __func__, i);
+ if (IS_ERR(event))
goto unwind;
- }
}
return 0;
unwind:
+ pr_err("%s: Error adding persistent event on cpu %d: %ld\n",
+ __func__, i, PTR_ERR(event));
+
while (--i >= 0)
del_persistent_event(i, attr);
--
1.8.1.1
From: Robert Richter <[email protected]>
Fixing wrongly determined attribute size.
Signed-off-by: Robert Richter <[email protected]>
---
kernel/events/persistent.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
index 5335644..9075164 100644
--- a/kernel/events/persistent.c
+++ b/kernel/events/persistent.c
@@ -149,7 +149,7 @@ int perf_add_persistent_event_by_id(int id)
attr->persistent = 1;
attr->config = id;
attr->type = PERF_TYPE_TRACEPOINT;
- attr->size = sizeof(attr);
+ attr->size = sizeof(*attr);
return perf_add_persistent_event(attr, 4);
}
--
1.8.1.1
On Fri, May 31, 2013 at 10:47:21AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> The mce_record tracepoint needs tracepoints to be enabled. Fixing
> build error for no-tracepoints configs.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 0ecf4a2..d421937 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1987,6 +1987,8 @@ int __init mcheck_init(void)
> return 0;
> }
>
> +#ifdef CONFIG_TRACEPOINTS
> +
> int __init mcheck_init_tp(void)
> {
> if (perf_add_persistent_event_by_id(event_mce_record.event.type)) {
> @@ -2001,6 +2003,8 @@ int __init mcheck_init_tp(void)
> */
> fs_initcall_sync(mcheck_init_tp);
>
> +#endif /* CONFIG_TRACEPOINTS */
> +
Right, since my persistent patches are in no tree yet, we should merge
this one with, I think, the last one in my set and keep them all
together.
Thanks.
On Fri, May 31, 2013 at 10:47:22AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> Fixing wrongly determined attribute size.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> kernel/events/persistent.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
> index 5335644..9075164 100644
> --- a/kernel/events/persistent.c
> +++ b/kernel/events/persistent.c
> @@ -149,7 +149,7 @@ int perf_add_persistent_event_by_id(int id)
> attr->persistent = 1;
> attr->config = id;
> attr->type = PERF_TYPE_TRACEPOINT;
> - attr->size = sizeof(attr);
> + attr->size = sizeof(*attr);
Ditto for this one.
On Fri, May 31, 2013 at 10:47:23AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> The default buffer size used to setup event buffers with perf tools is
> 512k. Using the same buffer size for persistent events. This also
> avoids failed mmap calls due to different buffer sizes.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> kernel/events/persistent.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
> index 9075164..1e6c03a 100644
> --- a/kernel/events/persistent.c
> +++ b/kernel/events/persistent.c
> @@ -5,6 +5,9 @@
>
> #include "internal.h"
>
> +/* 512 kiB: default perf tools memory size, see perf_evlist__mmap() */
> +#define CPU_BUFFER_NR_PAGES ((512 * 1024) / PAGE_SIZE)
> +
> static DEFINE_PER_CPU(struct list_head, pers_events);
>
> static struct perf_event *
> @@ -151,7 +154,7 @@ int perf_add_persistent_event_by_id(int id)
> attr->type = PERF_TYPE_TRACEPOINT;
> attr->size = sizeof(*attr);
>
> - return perf_add_persistent_event(attr, 4);
> + return perf_add_persistent_event(attr, CPU_BUFFER_NR_PAGES);
This one too - the 4 pages I chose then was purely arbitrary for I had
no idea what size to pick. This clearly makes more sense.
Thanks.
On Fri, May 31, 2013 at 10:47:24AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> Improve error reporting.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> kernel/events/persistent.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
> index 1e6c03a..6612eb77 100644
> --- a/kernel/events/persistent.c
> +++ b/kernel/events/persistent.c
> @@ -123,15 +123,15 @@ int perf_add_persistent_event(struct perf_event_attr *attr, unsigned nr_pages)
>
> for_each_possible_cpu(i) {
> event = add_persistent_event_on_cpu(i, attr, nr_pages);
> - if (IS_ERR(event)) {
> - pr_err("%s: Error adding persistent event on cpu %d\n",
> - __func__, i);
> + if (IS_ERR(event))
> goto unwind;
> - }
> }
> return 0;
>
> unwind:
> + pr_err("%s: Error adding persistent event on cpu %d: %ld\n",
> + __func__, i, PTR_ERR(event));
> +
> while (--i >= 0)
> del_persistent_event(i, attr);
Should also be merged.
Thanks.
On Fri, May 31, 2013 at 10:47:25AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> Returning -ENODEV if no file descriptor is found. An error code of -1
> (-EPERM) is misleading in this case.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> kernel/events/persistent.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/events/persistent.c b/kernel/events/persistent.c
> index 6612eb77..190aa75 100644
> --- a/kernel/events/persistent.c
> +++ b/kernel/events/persistent.c
> @@ -165,7 +165,7 @@ int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
> if (desc->attr->config == attr->config)
> return __alloc_persistent_event_fd(desc);
>
> - return -1;
> + return -ENODEV;
Ditto.
On Fri, May 31, 2013 at 10:47:26AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
>
> mmap should return EACCES error if fd can not be opened writable.
> This error code also helps userland to map buffers readonly on
> failure.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> kernel/events/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 0730f36..a9b6470 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3761,7 +3761,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
> return -EINVAL;
>
> if (event->attr.persistent && (vma->vm_flags & VM_WRITE))
> - return -EINVAL;
> + return -EACCES;
>
> vma_size = vma->vm_end - vma->vm_start;
> nr_pages = (vma_size / PAGE_SIZE) - 1;
Also merge.
On Fri, May 31, 2013 at 10:47:20AM +0200, Robert Richter wrote:
> Patches base on Boris' patches which I have rebased to latest
> tip/perf/core. All patches can be found here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent
Right, we should merge the fixes/changes to the persistent patches with
the original ones since they are still WIP anyway.
Feel free to rework them as you see fit, or, I could do that when I get
back.
Thanks.
On 31.05.13 11:15:40, Borislav Petkov wrote:
> On Fri, May 31, 2013 at 10:47:20AM +0200, Robert Richter wrote:
> > Patches base on Boris' patches which I have rebased to latest
> > tip/perf/core. All patches can be found here:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent
>
> Right, we should merge the fixes/changes to the persistent patches with
> the original ones since they are still WIP anyway.
>
> Feel free to rework them as you see fit, or, I could do that when I get
> back.
Hmm, since the changes in the onliner patches are either hard effort
to find in reviewing/testing or more or less related to the new
implementation, I better prefer to keep authorship as well to document
the code development (don't blame me about patch count ;)). We better
should add a branch (preferable at topic branch in tip?) as soon as
possible for this.
-Robert
On Fri, May 31, 2013 at 11:32:10AM +0200, Robert Richter wrote:
> Hmm, since the changes in the onliner patches are either hard effort
> to find in reviewing/testing or more or less related to the new
> implementation, I better prefer to keep authorship as well to document
> the code development (don't blame me about patch count ;)). We better
> should add a branch (preferable at topic branch in tip?) as soon as
> possible for this.
Yes, you should definitely keep authorship - simply state this in the
commit message, add your SOB, etc. However, I don't want to have messy
history for patches which haven't been reviewed yet. This complicates
review needlessly and makes absolutely no sense for later when you stare
at the history.
Thanks.
On 31.05.13 14:21:36, Borislav Petkov wrote:
> On Fri, May 31, 2013 at 11:32:10AM +0200, Robert Richter wrote:
> > Hmm, since the changes in the onliner patches are either hard effort
> > to find in reviewing/testing or more or less related to the new
> > implementation, I better prefer to keep authorship as well to document
> > the code development (don't blame me about patch count ;)). We better
> > should add a branch (preferable at topic branch in tip?) as soon as
> > possible for this.
>
> Yes, you should definitely keep authorship - simply state this in the
> commit message, add your SOB, etc. However, I don't want to have messy
> history for patches which haven't been reviewed yet. This complicates
> review needlessly and makes absolutely no sense for later when you stare
> at the history.
No, it's not about authorship in the sense of copyright, it's just
about keeping track of changes. My changes weren't related to a patch
review. In that case it would be totally fine to instantly merge the
changes.
Instead I reviewed the code while developing it with certain goals in
mind. Most changes I found necessary while building and running a
modified version during development. That never could be found in a
patch review. These changes are what we actually want to see in git
history.
And your argument that changes should be merged to reduce review
effort would actually mean to drop all the code you introduce which is
later removed in my patches (see below for the diff stats).
I also don't think we need to re-review your patches. Most of it has
been reviewed and should also not change much to avoid rebase
conflicts. In my point of view they are fine to be applied to a perf
topic branch. Ingo, would this be ok? There is no messy history if we
later just apply my patches on top. So no, I don't agree with you here
to merge some of my patches.
-Robert
$ git diff tip/master...ras --stat kernel/
kernel/events/Makefile | 2 +-
kernel/events/core.c | 56 +++++++++++++++++++----------------
kernel/events/internal.h | 4 +++
kernel/events/persistent.c | 175 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
kernel/events/ring_buffer.c | 7 ++---
5 files changed, 214 insertions(+), 30 deletions(-)
$ git diff ras...persistent --stat kernel/
kernel/events/core.c | 6 ++-
kernel/events/internal.h | 1 -
kernel/events/persistent.c | 292 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
3 files changed, 221 insertions(+), 78 deletions(-)
On Sat, Jun 01, 2013 at 06:15:55PM +0200, Robert Richter wrote:
> Instead I reviewed the code while developing it with certain goals in
> mind. Most changes I found necessary while building and running a
> modified version during development. That never could be found in a
> patch review. These changes are what we actually want to see in git
> history.
No, we don't. Not if my patches were never a baseline to begin with.
> And your argument that changes should be merged to reduce review
> effort would actually mean to drop all the code you introduce which is
> later removed in my patches (see below for the diff stats).
Yes, absolutely, if we have to. My version is an initial functionality
proposal and nothing more. It is Work-In-Progress, and, as such, is
very much a subject to change, or even a complete rewrite - whatever it
takes.
> I also don't think we need to re-review your patches. Most of it has
> been reviewed and should also not change much to avoid rebase
No, my patches haven't really been reviewed since they've been a moving
target from the very beginning. IOW, you simply stating that fixing
bugs against code which hasn't been the base line and agreed upon
functionality in the first place, is simply unnecessary churn.
If you want to document the goals you had in mind, just write that as
a comment in the code if it is very important or put it in a commit
message.
IOW, my patches were never ready but RFC. Nothing more.
> conflicts. In my point of view they are fine to be applied to a perf
> topic branch. Ingo, would this be ok? There is no messy history if we
> later just apply my patches on top. So no, I don't agree with you here
> to merge some of my patches.
Again, my patches are simply RFC and a moving target functionality-wise.
If you feel that something was done not in the way it should be done,
then pieces of maybe even them all should be thrown away and done right.
Simply redoing them or asking me to change them is all I'm saying.
Thanks.
On Fri, May 31, 2013 at 10:47:36AM +0200, Robert Richter wrote:
> From: Robert Richter <[email protected]>
SNIP
> out:
> mutex_unlock(&per_cpu(pers_events_lock, cpu));
> }
> @@ -182,18 +195,31 @@ fail:
> int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
> {
> struct pers_event_desc *desc;
> + struct perf_event *event;
> int event_fd = -ENODEV;
>
> mutex_lock(&per_cpu(pers_events_lock, cpu));
maybe check for valid cpu, since perf_get_persistent_event_fd is
called directly from syscall allowing anything in cpu
(unrelated to this patch, but I couldn't find the original
patch that adds perf_get_persistent_event_fd)
jirka
On Mon, Jun 03, 2013 at 03:49:25PM +0200, Jiri Olsa wrote:
> maybe check for valid cpu, since perf_get_persistent_event_fd is
> called directly from syscall allowing anything in cpu
That should be fine - we're traversing a per-cpu list of events there.
> (unrelated to this patch, but I couldn't find the original
> patch that adds perf_get_persistent_event_fd)
Try this:
http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/log/?h=persistent5.1
:-)
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Tue, Jun 04, 2013 at 11:19:21AM +0200, Jiri Olsa wrote:
> hum, can't see it:
>
> perf SYSCALL:
> ...
> return perf_get_persistent_event_fd(cpu, &attr);
>
> ...
> int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
> {
> struct pers_event_desc *desc;
> struct perf_event *event;
> int event_fd = -ENODEV;
>
> mutex_lock(&per_cpu(pers_events_lock, cpu));
> ...
>
> 'cpu' did not pass ay check here..
Oh, I see.
You mean something like
if (cpu < NR_CPUS && cpu_online(cpu))
?
I guess that should cover your concerns...
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Tue, Jun 04, 2013 at 10:20:42AM +0200, Borislav Petkov wrote:
> On Mon, Jun 03, 2013 at 03:49:25PM +0200, Jiri Olsa wrote:
> > maybe check for valid cpu, since perf_get_persistent_event_fd is
> > called directly from syscall allowing anything in cpu
>
> That should be fine - we're traversing a per-cpu list of events there.
hum, can't see it:
perf SYSCALL:
...
return perf_get_persistent_event_fd(cpu, &attr);
...
int perf_get_persistent_event_fd(unsigned cpu, struct perf_event_attr *attr)
{
struct pers_event_desc *desc;
struct perf_event *event;
int event_fd = -ENODEV;
mutex_lock(&per_cpu(pers_events_lock, cpu));
...
'cpu' did not pass ay check here..
jirka
On 04.06.13 11:35:22, Borislav Petkov wrote:
> On Tue, Jun 04, 2013 at 11:19:21AM +0200, Jiri Olsa wrote:
> > 'cpu' did not pass ay check here..
>
> Oh, I see.
>
> You mean something like
>
> if (cpu < NR_CPUS && cpu_online(cpu))
In perf_event_alloc() there is:
if ((unsigned)cpu >= nr_cpu_ids) {
if (!task || cpu != -1)
return ERR_PTR(-EINVAL);
}
So this should be sufficient:
if ((unsigned)cpu >= nr_cpu_ids)
...
-Robert