2012-02-14 04:56:20

by Luigi Semenzato

[permalink] [raw]
Subject: [PATCH] Perf: bug fix: distinguish between rename and exec

This makes it possible for "perf report" to distinguish between an exec and
a rename (for instance from prctl(PR_SET_NAME)). Currently a similar COMM
record is produced for the two events. Perf report assumes all COMM records
are execs and discards the old mappings. Without mappings, perf report
can no longer correlate sampled IPs to the functions containing them,
and collapses all samples into a single bucket.

This change maintains backward compatibility in both directions, i.e. old
version of perf will continue to work on new perf.data files in the same
way, and new versions of perf on old data files.

Another solution would be to not send COMM records for renames. Although
it seems reasonable, it might break existing setups.

Signed-off-by: Luigi Semenzato <[email protected]>
---
fs/exec.c | 6 +++---
fs/proc/base.c | 2 +-
include/linux/perf_event.h | 9 +++++++--
include/linux/sched.h | 2 +-
kernel/events/core.c | 4 ++--
kernel/kthread.c | 2 +-
kernel/sys.c | 2 +-
tools/perf/util/event.c | 4 +++-
tools/perf/util/session.c | 2 +-
tools/perf/util/thread.c | 11 ++++++-----
tools/perf/util/thread.h | 2 +-
11 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 92ce83a..fc2792d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1052,7 +1052,7 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
}
EXPORT_SYMBOL_GPL(get_task_comm);

-void set_task_comm(struct task_struct *tsk, char *buf)
+void set_task_comm(struct task_struct *tsk, char *buf, bool is_rename)
{
task_lock(tsk);

@@ -1068,7 +1068,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
wmb();
strlcpy(tsk->comm, buf, sizeof(tsk->comm));
task_unlock(tsk);
- perf_event_comm(tsk);
+ perf_event_comm(tsk, is_rename);
}

static void filename_to_taskname(char *tcomm, const char *fn, unsigned int len)
@@ -1142,7 +1142,7 @@ void setup_new_exec(struct linux_binprm * bprm)
else
set_dumpable(current->mm, suid_dumpable);

- set_task_comm(current, bprm->tcomm);
+ set_task_comm(current, bprm->tcomm, false);

/* Set the new mm task size. We have to do that late because it may
* depend on TIF_32BIT which is only updated in flush_thread() on
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d4548dd..cd2f609 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1361,7 +1361,7 @@ static ssize_t comm_write(struct file *file, const char __user *buf,
return -ESRCH;

if (same_thread_group(current, p))
- set_task_comm(p, buffer);
+ set_task_comm(p, buffer, true);
else
count = -EINVAL;

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index abb2776..3007ef0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -328,6 +328,11 @@ struct perf_event_mmap_page {
*/
#define PERF_RECORD_MISC_EXACT_IP (1 << 14)
/*
+ * Indicates that a PERF_RECORD_COMM is a rename (prctl) rather than an exec.
+ * Note: this bit position has different meanings for different record types.
+ */
+#define PERF_RECORD_MISC_RENAME (1 << 14)
+/*
* Reserve the last bit to indicate some extended misc field
*/
#define PERF_RECORD_MISC_EXT_RESERVED (1 << 15)
@@ -1089,7 +1094,7 @@ extern struct perf_guest_info_callbacks *perf_guest_cbs;
extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);

-extern void perf_event_comm(struct task_struct *tsk);
+extern void perf_event_comm(struct task_struct *tsk, bool is_rename);
extern void perf_event_fork(struct task_struct *tsk);

/* Callchains */
@@ -1179,7 +1184,7 @@ static inline int perf_unregister_guest_info_callbacks
(struct perf_guest_info_callbacks *callbacks) { return 0; }

static inline void perf_event_mmap(struct vm_area_struct *vma) { }
-static inline void perf_event_comm(struct task_struct *tsk) { }
+static inline void perf_event_comm(struct task_struct *tsk, bool is_rename) { }
static inline void perf_event_fork(struct task_struct *tsk) { }
static inline void perf_event_init(void) { }
static inline int perf_swevent_get_recursion_context(void) { return -1; }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7d379a6..0283de4 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2293,7 +2293,7 @@ extern int do_execve(const char *,
extern long do_fork(unsigned long, unsigned long, struct pt_regs *, unsigned long, int __user *, int __user *);
struct task_struct *fork_idle(int);

-extern void set_task_comm(struct task_struct *tsk, char *from);
+extern void set_task_comm(struct task_struct *tsk, char *from, bool is_rename);
extern char *get_task_comm(char *to, struct task_struct *tsk);

#ifdef CONFIG_SMP
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ba36013..879d7a9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4250,7 +4250,7 @@ next:
rcu_read_unlock();
}

-void perf_event_comm(struct task_struct *task)
+void perf_event_comm(struct task_struct *task, bool is_rename)
{
struct perf_comm_event comm_event;
struct perf_event_context *ctx;
@@ -4274,7 +4274,7 @@ void perf_event_comm(struct task_struct *task)
.event_id = {
.header = {
.type = PERF_RECORD_COMM,
- .misc = 0,
+ .misc = is_rename ? PERF_RECORD_MISC_RENAME : 0,
/* .size */
},
/* .pid */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3d3de63..cacda74 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -277,7 +277,7 @@ int kthreadd(void *unused)
struct task_struct *tsk = current;

/* Setup a clean context for our children to inherit. */
- set_task_comm(tsk, "kthreadd");
+ set_task_comm(tsk, "kthreadd", false);
ignore_signals(tsk);
set_cpus_allowed_ptr(tsk, cpu_all_mask);
set_mems_allowed(node_states[N_HIGH_MEMORY]);
diff --git a/kernel/sys.c b/kernel/sys.c
index 4070153..a0f42a4 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1879,7 +1879,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
if (strncpy_from_user(comm, (char __user *)arg2,
sizeof(me->comm) - 1) < 0)
return -EFAULT;
- set_task_comm(me, comm);
+ set_task_comm(me, comm, true);
proc_comm_connector(me);
return 0;
case PR_GET_NAME:
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 73ddaf0..ba21a63 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -502,11 +502,13 @@ int perf_event__process_comm(struct perf_tool *tool __used,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, event->comm.tid);
+ bool is_rename = event->header.misc & PERF_EVENT_MISC_RENAME;

if (dump_trace)
perf_event__fprintf_comm(event, stdout);

- if (thread == NULL || thread__set_comm(thread, event->comm.comm)) {
+ if (thread == NULL || thread__set_comm(thread, event->comm.comm,
+ is_rename)) {
dump_printf("problem processing PERF_RECORD_COMM, skipping event.\n");
return -1;
}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b5ca255..f263875 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -924,7 +924,7 @@ static struct thread *perf_session__register_idle_thread(struct perf_session *se
{
struct thread *thread = perf_session__findnew(self, 0);

- if (thread == NULL || thread__set_comm(thread, "swapper")) {
+ if (thread == NULL || thread__set_comm(thread, "swapper", false)) {
pr_err("problem inserting idle task.\n");
thread = NULL;
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index fb4b7ea..be8063c 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -29,7 +29,7 @@ void thread__delete(struct thread *self)
free(self);
}

-int thread__set_comm(struct thread *self, const char *comm)
+int thread__set_comm(struct thread *self, const char *comm, bool is_rename)
{
int err;

@@ -37,11 +37,12 @@ int thread__set_comm(struct thread *self, const char *comm)
free(self->comm);
self->comm = strdup(comm);
err = self->comm == NULL ? -ENOMEM : 0;
- if (!err) {
- self->comm_set = true;
+ if (err)
+ return err;
+ self->comm_set = true;
+ if (!is_rename)
map_groups__flush(&self->mg);
- }
- return err;
+ return 0;
}

int thread__comm_len(struct thread *self)
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 70c2c13..758df3a 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -22,7 +22,7 @@ struct machine;

void thread__delete(struct thread *self);

-int thread__set_comm(struct thread *self, const char *comm);
+int thread__set_comm(struct thread *self, const char *comm, bool is_rename);
int thread__comm_len(struct thread *self);
void thread__insert_map(struct thread *self, struct map *map);
int thread__fork(struct thread *self, struct thread *parent);
--
1.7.7.3


2012-02-15 12:49:08

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On Mon, 2012-02-13 at 20:56 -0800, Luigi Semenzato wrote:
> This makes it possible for "perf report" to distinguish between an exec and
> a rename (for instance from prctl(PR_SET_NAME)). Currently a similar COMM
> record is produced for the two events. Perf report assumes all COMM records
> are execs and discards the old mappings. Without mappings, perf report
> can no longer correlate sampled IPs to the functions containing them,
> and collapses all samples into a single bucket.
>
> This change maintains backward compatibility in both directions, i.e. old
> version of perf will continue to work on new perf.data files in the same
> way, and new versions of perf on old data files.
>
> Another solution would be to not send COMM records for renames. Although
> it seems reasonable, it might break existing setups.

Uhm, didn't you argue its already broken?

> +++ b/fs/exec.c
> @@ -1052,7 +1052,7 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
> }
> EXPORT_SYMBOL_GPL(get_task_comm);
>
> -void set_task_comm(struct task_struct *tsk, char *buf)
> +void set_task_comm(struct task_struct *tsk, char *buf, bool is_rename)
> {
> task_lock(tsk);
>
> @@ -1068,7 +1068,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
> wmb();
> strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> task_unlock(tsk);
> - perf_event_comm(tsk);
> + perf_event_comm(tsk, is_rename);
> }

I really dislike changing generic code purely for the purpose of
instrumentation like this. Better to pull perf_event_comm() out of here
if you want to change semantics.

Personally I couldn't care less about renames, I think they're a waste
of time, so I'm ok with the simple patch moving the perf_event_comm()
into setup_new_exec() and possibly renaming it to perf_event_exec().

Acme, do you care about renames?

2012-02-15 13:47:47

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

Em Wed, Feb 15, 2012 at 01:48:33PM +0100, Peter Zijlstra escreveu:
> I really dislike changing generic code purely for the purpose of
> instrumentation like this. Better to pull perf_event_comm() out of here
> if you want to change semantics.
>
> Personally I couldn't care less about renames, I think they're a waste
> of time, so I'm ok with the simple patch moving the perf_event_comm()
> into setup_new_exec() and possibly renaming it to perf_event_exec().
>
> Acme, do you care about renames?

I like your idea of keeping the semantics of PERF_RECORD_COMM and
introducing a PERF_RECORD_EXEC, just have to think about how to handle
that in a way that the tools detect that we have PERF_RECORD_EXEC...

Humm, will be yet another fallback for setting an perf_event_attr bit,
just like with .sample_id_all and .exclude_{guest,host}...

That together with the per class errnos + __strerror() method will allow
to move all the event creation finally to perf_evlist__open() where all
this gets nicely hidden away from poor tools.

We can then even have an ui__evlist_perror() method that does all the
ui__warning calls, etc.

So, yes, from a tooling perspective, I want to be notified of renames
and being able to stop relying on PERF_RECORD_COMM to call
map_groups__flush and instead do it at PERF_RECORD_EXEC seems a
bonus.

- Arnaldo

2012-02-15 16:57:40

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On 2/15/12 5:48 AM, Peter Zijlstra wrote:
> On Mon, 2012-02-13 at 20:56 -0800, Luigi Semenzato wrote:
>> This makes it possible for "perf report" to distinguish between an exec and
>> a rename (for instance from prctl(PR_SET_NAME)). Currently a similar COMM
>> record is produced for the two events. Perf report assumes all COMM records
>> are execs and discards the old mappings. Without mappings, perf report
>> can no longer correlate sampled IPs to the functions containing them,
>> and collapses all samples into a single bucket.
>>
>> This change maintains backward compatibility in both directions, i.e. old
>> version of perf will continue to work on new perf.data files in the same
>> way, and new versions of perf on old data files.
>>
>> Another solution would be to not send COMM records for renames. Although
>> it seems reasonable, it might break existing setups.
>
> Uhm, didn't you argue its already broken?
>
>> +++ b/fs/exec.c
>> @@ -1052,7 +1052,7 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
>> }
>> EXPORT_SYMBOL_GPL(get_task_comm);
>>
>> -void set_task_comm(struct task_struct *tsk, char *buf)
>> +void set_task_comm(struct task_struct *tsk, char *buf, bool is_rename)
>> {
>> task_lock(tsk);
>>
>> @@ -1068,7 +1068,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
>> wmb();
>> strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>> task_unlock(tsk);
>> - perf_event_comm(tsk);
>> + perf_event_comm(tsk, is_rename);
>> }
>
> I really dislike changing generic code purely for the purpose of
> instrumentation like this. Better to pull perf_event_comm() out of here
> if you want to change semantics.
>
> Personally I couldn't care less about renames, I think they're a waste
> of time, so I'm ok with the simple patch moving the perf_event_comm()
> into setup_new_exec() and possibly renaming it to perf_event_exec().
>
> Acme, do you care about renames?
>

I'm not Acme, but I do care. We use a lot of processes with named
threads that give users an idea about the function of a particular thread.

I do not understand the use case targeted by the patch -- what kind of
processes run for some amount of time and then decide to change task names?

David

2012-02-15 17:08:09

by Luigi Semenzato

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On Wed, Feb 15, 2012 at 5:47 AM, Arnaldo Carvalho de Melo
<[email protected]> wrote:
> Em Wed, Feb 15, 2012 at 01:48:33PM +0100, Peter Zijlstra escreveu:
>> I really dislike changing generic code purely for the purpose of
>> instrumentation like this. Better to pull perf_event_comm() out of here
>> if you want to change semantics.
>>
>> Personally I couldn't care less about renames, I think they're a waste
>> of time, so I'm ok with the simple patch moving the perf_event_comm()
>> into setup_new_exec() and possibly renaming it to perf_event_exec().
>>
>> Acme, do you care about renames?
>
> I like your idea of keeping the semantics of PERF_RECORD_COMM and
> introducing a PERF_RECORD_EXEC, just have to think about how to handle
> that in a way that the tools detect that we have PERF_RECORD_EXEC...

I considered this but I don't know how important it is to be backward
compatible. Adding a new record type makes old "perf report" fail to
parse new perf.data files. (Unless we pad the new record to a
multiple of 8 bytes, but I don't think we want to go down that path).

If looking forward is more important, I agree a new new record type is
best. We might want to consider adding a PERF_RECORD_RENAME for
renames, and leaving the COMM record to its historical meaning (exec),
possibly renaming it to PERF_RECORD_EXEC for clarity. And yes, the
perf instrumentation should not be in set_task_comm(), that's why the
bug exists in the first place.

We might also want to change the parsing of perf.data so that in the
future it is more tolerant of new record types.

>
> Humm, will be yet another fallback for setting an perf_event_attr bit,
> just like with .sample_id_all and .exclude_{guest,host}...
>
> That together with the per class errnos + __strerror() method will allow
> to move all the event creation finally to perf_evlist__open() where all
> this gets nicely hidden away from poor tools.
>
> We can then even have an ui__evlist_perror() method that does all the
> ui__warning calls, etc.
>
> So, yes, from a tooling perspective, I want to be notified of renames
> and being able to stop relying on PERF_RECORD_COMM to call
> map_groups__flush and instead do it at PERF_RECORD_EXEC seems a
> bonus.
>
> - Arnaldo

2012-02-15 17:22:54

by Luigi Semenzato

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On Wed, Feb 15, 2012 at 8:57 AM, David Ahern <[email protected]> wrote:
> On 2/15/12 5:48 AM, Peter Zijlstra wrote:
>>
>> On Mon, 2012-02-13 at 20:56 -0800, Luigi Semenzato wrote:
>>>
>>> This makes it possible for "perf report" to distinguish between an exec
>>> and
>>> a rename (for instance from prctl(PR_SET_NAME)). ?Currently a similar
>>> COMM
>>> record is produced for the two events. ?Perf report assumes all COMM
>>> records
>>> are execs and discards the old mappings. ?Without mappings, perf report
>>> can no longer correlate sampled IPs to the functions containing them,
>>> and collapses all samples into a single bucket.
>>>
>>> This change maintains backward compatibility in both directions, i.e. old
>>> version of perf will continue to work on new perf.data files in the same
>>> way, and new versions of perf on old data files.
>>>
>>> Another solution would be to not send COMM records for renames. ?Although
>>> it seems reasonable, it might break existing setups.
>>
>>
>> Uhm, didn't you argue its already broken?
>>
>>> +++ b/fs/exec.c
>>> @@ -1052,7 +1052,7 @@ char *get_task_comm(char *buf, struct task_struct
>>> *tsk)
>>> ?}
>>> ?EXPORT_SYMBOL_GPL(get_task_comm);
>>>
>>> -void set_task_comm(struct task_struct *tsk, char *buf)
>>> +void set_task_comm(struct task_struct *tsk, char *buf, bool is_rename)
>>> ?{
>>> ? ? ? ?task_lock(tsk);
>>>
>>> @@ -1068,7 +1068,7 @@ void set_task_comm(struct task_struct *tsk, char
>>> *buf)
>>> ? ? ? ?wmb();
>>> ? ? ? ?strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>>> ? ? ? ?task_unlock(tsk);
>>> - ? ? ? perf_event_comm(tsk);
>>> + ? ? ? perf_event_comm(tsk, is_rename);
>>> ?}
>>
>>
>> I really dislike changing generic code purely for the purpose of
>> instrumentation like this. Better to pull perf_event_comm() out of here
>> if you want to change semantics.
>>
>> Personally I couldn't care less about renames, I think they're a waste
>> of time, so I'm ok with the simple patch moving the perf_event_comm()
>> into setup_new_exec() and possibly renaming it to perf_event_exec().
>>
>> Acme, do you care about renames?
>>
>
> I'm not Acme, but I do care. We use a lot of processes with named threads
> that give users an idea about the function of a particular thread.
>
> I do not understand the use case targeted by the patch -- what kind of
> processes run for some amount of time and then decide to change task names?

Any process that makes a prctl(PR_SET_NAME) call loses its mappings,
no matter when makes the call. The perf records for that process look
like this:

COMM (for the initial exec)
MMAP (the executable)
MMAP (1st dll)
MMAP (2nd dll)
...
COMM (for the prctl)

The second COMM flushes the old mappings, and all samples from then on
cannot be classified. This is easily reproducible with a small
program, which I would be happy to send.

I found this while trying to use perf with Chrome on Chrome OS.
Chrome forks and execs all the time, and calls prctl() in the thread
library.

2012-02-15 17:30:18

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On 2/15/12 10:22 AM, Luigi Semenzato wrote:
>
> Any process that makes a prctl(PR_SET_NAME) call loses its mappings,
> no matter when makes the call. The perf records for that process look
> like this:
>
> COMM (for the initial exec)
> MMAP (the executable)
> MMAP (1st dll)
> MMAP (2nd dll)
> ...
> COMM (for the prctl)
>
> The second COMM flushes the old mappings, and all samples from then on
> cannot be classified. This is easily reproducible with a small
> program, which I would be happy to send.

Ok, I see that now.

Thanks,
David

>
> I found this while trying to use perf with Chrome on Chrome OS.
> Chrome forks and execs all the time, and calls prctl() in the thread
> library.

2012-02-15 17:31:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On Wed, 2012-02-15 at 09:57 -0700, David Ahern wrote:
>
> I'm not Acme, but I do care. We use a lot of processes with named
> threads that give users an idea about the function of a particular
> thread.
>
But why would they care? If you're debugging its easy enough to see from
the backtrace and if you're not, most tools like top/ps don't even show
threads (by default).

So who cares what threads are called.

I realize I'm not going to convince anybody, but I genuinely don't see
the point of naming threads.

2012-02-15 17:48:03

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

Em Wed, Feb 15, 2012 at 09:07:47AM -0800, Luigi Semenzato escreveu:
> On Wed, Feb 15, 2012 at 5:47 AM, Arnaldo Carvalho de Melo
> <[email protected]> wrote:
> > Em Wed, Feb 15, 2012 at 01:48:33PM +0100, Peter Zijlstra escreveu:
> >> I really dislike changing generic code purely for the purpose of
> >> instrumentation like this. Better to pull perf_event_comm() out of here
> >> if you want to change semantics.
> >>
> >> Personally I couldn't care less about renames, I think they're a waste
> >> of time, so I'm ok with the simple patch moving the perf_event_comm()
> >> into setup_new_exec() and possibly renaming it to perf_event_exec().
> >>
> >> Acme, do you care about renames?
> >
> > I like your idea of keeping the semantics of PERF_RECORD_COMM and
> > introducing a PERF_RECORD_EXEC, just have to think about how to handle
> > that in a way that the tools detect that we have PERF_RECORD_EXEC...
>
> I considered this but I don't know how important it is to be backward
> compatible. Adding a new record type makes old "perf report" fail to
> parse new perf.data files. (Unless we pad the new record to a

Hey, old perf record would still see the PERF_RECORD_COMM, i.e. we would
continue asking for PERF_RECORD_COMM in new versions. Together with
PERF_RECORD_EXEC.

> multiple of 8 bytes, but I don't think we want to go down that path).
>
> If looking forward is more important, I agree a new new record type is
> best. We might want to consider adding a PERF_RECORD_RENAME for

PERF_RECORD_COMM is good enough, well, it always was confusing for most
people that asked "hey, that means an EXEC, right?"

First thing pople think is "hey, this is when it sets the thread COMM,
right?"

> renames, and leaving the COMM record to its historical meaning (exec),
> possibly renaming it to PERF_RECORD_EXEC for clarity. And yes, the
> perf instrumentation should not be in set_task_comm(), that's why the
> bug exists in the first place.
>
> We might also want to change the parsing of perf.data so that in the
> future it is more tolerant of new record types.

Yes, what is the behaviour now? Lemme see... Well, difficult, I'm barely
reading this, just after magnifying it, dilated pupils two hours ago,
grrr

Wiĺl check later, but IIRC it just warns and skips the record, right?

> > Humm, will be yet another fallback for setting an perf_event_attr bit,
> > just like with .sample_id_all and .exclude_{guest,host}...
> >
> > That together with the per class errnos + __strerror() method will allow
> > to move all the event creation finally to perf_evlist__open() where all
> > this gets nicely hidden away from poor tools.
> >
> > We can then even have an ui__evlist_perror() method that does all the
> > ui__warning calls, etc.
> >
> > So, yes, from a tooling perspective, I want to be notified of renames
> > and being able to stop relying on PERF_RECORD_COMM to call
> > map_groups__flush and instead do it at PERF_RECORD_EXEC seems a
> > bonus.
> >
> > - Arnaldo

2012-02-15 18:07:38

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

Em Wed, Feb 15, 2012 at 06:59:38PM +0100, Stephane Eranian escreveu:
> On Feb 15, 2012 1:48 PM, "Peter Zijlstra" <[email protected]> wrote:
> > I really dislike changing generic code purely for the purpose of
> > instrumentation like this. Better to pull perf_event_comm() out of here
> > if you want to change semantics.
> >
> > Personally I couldn't care less about renames, I think they're a waste
> > of time, so I'm ok with the simple patch moving the perf_event_comm()
> > into setup_new_exec() and possibly renaming it to perf_event_exec().

> I tend to agree with you on this.
> Is there anything that depends on renames
> In perf?

Look at thread__set_comm, not intuitive, goes back to the original
problem description by Luigi.

- Arnaldo

2012-02-15 18:55:58

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH] Perf: bug fix: distinguish between rename and exec

On 2/15/12 10:30 AM, Peter Zijlstra wrote:
> On Wed, 2012-02-15 at 09:57 -0700, David Ahern wrote:
>>
>> I'm not Acme, but I do care. We use a lot of processes with named
>> threads that give users an idea about the function of a particular
>> thread.
>>
> But why would they care? If you're debugging its easy enough to see from
> the backtrace and if you're not, most tools like top/ps don't even show
> threads (by default).
>
> So who cares what threads are called.
>
> I realize I'm not going to convince anybody, but I genuinely don't see
> the point of naming threads.

Very subjective. How fast do you want your users/tech
staff/developers/QA testers to make sense of what is going on? Which is
more informative (process and thread names sanitized)

$ top -d5

top - 18:39:49 up 23:46, 1 user, load average: 0.75, 0.37, 0.24
Cpu(s): 11.0%us, 19.0%sy, 0.0%ni, 47.8%id, 0.0%wa, 0.3%hi, 21.8%si,
0.0%st

PID USER S %CPU %MEM TIME+ COMMAND
3830 root S 62.4 1.1 1:16.68 myapp


Wow, myapp is sucking up CPU. I wonder what it's doing. 'H'.


With non-named threads:

top - 18:40:36 up 23:47, 1 user, load average: 0.89, 0.47, 0.28
Cpu(s): 15.5%us, 22.4%sy, 0.0%ni, 45.7%id, 0.0%wa, 0.6%hi, 15.9%si,
0.0%st

PID USER S %CPU %MEM TIME+ COMMAND
3891 root S 44.8 1.1 0:12.39 myapp
3879 root R 14.6 1.1 0:26.07 myapp
3893 root S 8.7 1.1 0:08.62 myapp

Ok, so 3 threads are dominating the CPU. I guess I need to hook up gdb
to find out which functional areas those threads are running. Or, for
one product I worked on you have to know that the thread map is dumped
to a file, get shell access, read it and mentally correlate lwps.

Or you can name the threads:

top - 18:40:36 up 23:47, 1 user, load average: 0.89, 0.47, 0.28
Cpu(s): 15.5%us, 22.4%sy, 0.0%ni, 45.7%id, 0.0%wa, 0.6%hi, 15.9%si,
0.0%st

PID USER S %CPU %MEM TIME+ COMMAND
3891 root S 44.8 1.1 0:12.39 myapp:dispatch
3879 root R 14.6 1.1 0:26.07 myapp:my-mgr
3893 root S 8.7 1.1 0:08.62 myapp:worker

Sure process knowledge is needed but top gives a lot more information
for named threads.

Even with perf having named threads makes the event dumps 100x more
understandable because the comm name gives you a hint about what the
task is does.

David