Ok. Tried to take in the feedback on the first patch and see what I
could do. The comm is now set via /proc/pid/comm
or /proc/pid/task/tid/comm, per Andi's request.
Also I think the reference issues Andrew pointed out are fixed.
As for the unlocked self-access of current->comm without the
task_lock(), I've sort of hacked something in here, and I'm curious how
horrible people feel it is.
Basically we just carefully write to the string, so unlocked access see
either the old name, the new name, or an empty string. The empty string
is very transient and would only be visible while the write was
occurring. This avoids the issue of a short name being overwritten by a
long name causing a reader overruns if it reads right as the short
name's null is overwritten but before the long name's null is written.
Although, I need to figure out if a memory barrier is needed to make
that really correct.
So yea. Its a little gross. But figured I'd throw it out there.
If there's any other issues I've missed, please let me know.
thanks
-john
========================
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
However, there may be situations where it would be advantageous for a
thread dispatcher to be naming the threads its managing, rather then
having the threads self-describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via proc/pid/comm and
proc/pid/task/tid/comm interfaces, and allows thread siblings to write
to these values.
Signed-off-by: John Stultz <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index d49be6b..dc4bc87 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,7 +926,17 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
void set_task_comm(struct task_struct *tsk, char *buf)
{
task_lock(tsk);
- strlcpy(tsk->comm, buf, sizeof(tsk->comm));
+
+ /*
+ * Threads may access current->comm without holding
+ * the task lock, so write the string carefully
+ * to avoid non-terminating reads. Readers without a lock
+ * with get the oldname, the newname or an empty string.
+ */
+ tsk->comm[0] = NULL;
+ /* XXX - Need an mb() here?*/
+ strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
+ tsk->comm[0] = buf[0];
task_unlock(tsk);
perf_event_comm(tsk);
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..bc79061 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1265,6 +1265,79 @@ static const struct file_operations proc_pid_sched_operations = {
#endif
+
+
+static ssize_t
+comm_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ struct task_struct *p;
+ char buffer[TASK_COMM_LEN];
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ if (same_thread_group(current, p))
+ set_task_comm(p, buffer);
+ else
+ count = -EINVAL;
+
+ put_task_struct(p);
+
+ return count;
+}
+
+
+static int comm_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct task_struct *p;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ task_lock(p);
+ seq_printf(m, "%s\n", p->comm);
+ task_unlock(p);
+
+ put_task_struct(p);
+
+ return 0;
+}
+
+static int comm_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = single_open(filp, comm_show, NULL);
+ if (!ret) {
+ struct seq_file *m = filp->private_data;
+
+ m->private = inode;
+ }
+ return ret;
+}
+
+
+static const struct file_operations proc_pid_set_comm_operations = {
+ .open = comm_open,
+ .read = seq_read,
+ .write = comm_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+
/*
* We added or removed a vma mapping the executable. The vmas are only mapped
* during exec and are not mapped with the mmap system call.
@@ -2504,6 +2577,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
@@ -2839,6 +2913,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
Hi John,
Just a couple nitpics really, looks pretty good to me - other than the
need for the wmb()s below.
john stultz wrote:
> This patch exports a task's comm via proc/pid/comm and
> proc/pid/task/tid/comm interfaces, and allows thread siblings to write
> to these values.
And the parent I presume?
> + /*
> + * Threads may access current->comm without holding
> + * the task lock, so write the string carefully
> + * to avoid non-terminating reads. Readers without a lock
> + * with get the oldname, the newname or an empty string.
s/with/will/
s/oldname/old name/ (it isn't a variable right?)
s/newname/new name/ (it isn't a variable right?)
> + */
> + tsk->comm[0] = NULL;
> + /* XXX - Need an mb() here?*/
I believe you do, yes. Now, which one... hrm... checking... You only
care about ensuring the the comm[0] store occurs BEFORE the strlcpy.
But, if no lock is held here, you can be preempted, so this is important
for both UP and SMP. I believe what you need here is:
wmb()
Memory barrier experts, please enlighten us if I am missing something.
> + strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
And one more here I should think, otherwise that could effectively undo
the previous one :-)
wmb()
> + tsk->comm[0] = buf[0];
> task_unlock(tsk);
To be clear, we hold the lock to prevent other threads from changing
this at the same time as us - any other thread but the target thread
that is?
> +static ssize_t
> +comm_write(struct file *file, const char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct inode *inode = file->f_path.dentry->d_inode;
> + struct task_struct *p;
> + char buffer[TASK_COMM_LEN];
> +
> + memset(buffer, 0, sizeof(buffer));
What purpose does zeroing this entire buffer serve?
> + if (count > sizeof(buffer) - 1)
> + count = sizeof(buffer) - 1;
> + if (copy_from_user(buffer, buf, count))
> + return -EFAULT;
> +
Extra whitespace
> +
> + p = get_proc_task(inode);
> + if (!p)
> + return -ESRCH;
> +
> + if (same_thread_group(current, p))
> + set_task_comm(p, buffer);
> + else
> + count = -EINVAL;
> +
> + put_task_struct(p);
> +
> + return count;
> +}
Thanks,
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
On Fri, 2009-10-23 at 20:45 -0700, Darren Hart wrote:
> Hi John,
>
> Just a couple nitpics really, looks pretty good to me - other than the
> need for the wmb()s below.
>
> john stultz wrote:
> > + /*
> > + * Threads may access current->comm without holding
> > + * the task lock, so write the string carefully
> > + * to avoid non-terminating reads. Readers without a lock
> > + * with get the oldname, the newname or an empty string.
>
> s/with/will/
> s/oldname/old name/ (it isn't a variable right?)
> s/newname/new name/ (it isn't a variable right?)
Fixed. Thanks
> > + */
> > + tsk->comm[0] = NULL;
> > + /* XXX - Need an mb() here?*/
>
> I believe you do, yes. Now, which one... hrm... checking... You only
> care about ensuring the the comm[0] store occurs BEFORE the strlcpy.
> But, if no lock is held here, you can be preempted, so this is important
> for both UP and SMP. I believe what you need here is:
>
> wmb()
>
> Memory barrier experts, please enlighten us if I am missing something.
>
> > + strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
>
> And one more here I should think, otherwise that could effectively undo
> the previous one :-)
>
> wmb()
Cool. Added. If anyone sees anything incorrect here, please let me know.
> > + tsk->comm[0] = buf[0];
> > task_unlock(tsk);
>
> To be clear, we hold the lock to prevent other threads from changing
> this at the same time as us - any other thread but the target thread
> that is?
Right, so in order to change the comm, you have to hold the task_lock
(even if your changing your own). The issue I'm trying to address is
the threads self-referencing their comm without holding the task_lock.
We don't want to slow them down by adding additional locking around
every current->comm access, but we want to allow other threads to modify
the comm.
> > +static ssize_t
> > +comm_write(struct file *file, const char __user *buf,
> > + size_t count, loff_t *offset)
> > +{
> > + struct inode *inode = file->f_path.dentry->d_inode;
> > + struct task_struct *p;
> > + char buffer[TASK_COMM_LEN];
> > +
> > + memset(buffer, 0, sizeof(buffer));
>
> What purpose does zeroing this entire buffer serve?
Just make sure we always terminate with a null.
> > + if (count > sizeof(buffer) - 1)
> > + count = sizeof(buffer) - 1;
> > + if (copy_from_user(buffer, buf, count))
> > + return -EFAULT;
> > +
>
> Extra whitespace
fixed.
Thanks for the review. I'll send a new version out shortly.
-john
Ok, third attempt at this.
I've added some minor tweaks from Darren's review, and I think the
memory barrier bits are in the right place.
Please let me know if you have any comments or feedback! If not I'll
probably send this on to -mm next week.
thanks
-john
===============
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
However, there may be situations where it would be advantageous for a
thread dispatcher to be naming the threads its managing, rather then
having the threads self-describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via proc/pid/comm and
proc/pid/task/tid/comm interfaces, and allows thread siblings to write
to these values.
Signed-off-by: John Stultz <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index d49be6b..ce70e55 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,7 +926,18 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
void set_task_comm(struct task_struct *tsk, char *buf)
{
task_lock(tsk);
- strlcpy(tsk->comm, buf, sizeof(tsk->comm));
+
+ /*
+ * Threads may access current->comm without holding
+ * the task lock, so write the string carefully
+ * to avoid non-terminating reads. Readers without a lock
+ * will get the oldname, the newname or an empty string.
+ */
+ tsk->comm[0] = 0;
+ wmb();
+ strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
+ wmb();
+ tsk->comm[0] = buf[0];
task_unlock(tsk);
perf_event_comm(tsk);
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..7f59af1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
#endif
+
+
+static ssize_t
+comm_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ struct task_struct *p;
+ char buffer[TASK_COMM_LEN];
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ if (same_thread_group(current, p))
+ set_task_comm(p, buffer);
+ else
+ count = -EINVAL;
+
+ put_task_struct(p);
+
+ return count;
+}
+
+
+static int comm_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct task_struct *p;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ task_lock(p);
+ seq_printf(m, "%s\n", p->comm);
+ task_unlock(p);
+
+ put_task_struct(p);
+
+ return 0;
+}
+
+static int comm_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = single_open(filp, comm_show, NULL);
+ if (!ret) {
+ struct seq_file *m = filp->private_data;
+
+ m->private = inode;
+ }
+ return ret;
+}
+
+
+static const struct file_operations proc_pid_set_comm_operations = {
+ .open = comm_open,
+ .read = seq_read,
+ .write = comm_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+
/*
* We added or removed a vma mapping the executable. The vmas are only mapped
* during exec and are not mapped with the mmap system call.
@@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
@@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
Andrew: Would you consider this for testing in -mm?
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
However, there may be situations where it would be advantageous for a
thread dispatcher to be naming the threads its managing, rather then
having the threads self-describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via proc/pid/comm and
proc/pid/task/tid/comm interfaces, and allows thread siblings to write
to these values.
Signed-off-by: John Stultz <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index d49be6b..ce70e55 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,7 +926,18 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
void set_task_comm(struct task_struct *tsk, char *buf)
{
task_lock(tsk);
- strlcpy(tsk->comm, buf, sizeof(tsk->comm));
+
+ /*
+ * Threads may access current->comm without holding
+ * the task lock, so write the string carefully
+ * to avoid non-terminating reads. Readers without a lock
+ * will get the oldname, the newname or an empty string.
+ */
+ tsk->comm[0] = 0;
+ wmb();
+ strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
+ wmb();
+ tsk->comm[0] = buf[0];
task_unlock(tsk);
perf_event_comm(tsk);
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..7f59af1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
#endif
+
+
+static ssize_t
+comm_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ struct task_struct *p;
+ char buffer[TASK_COMM_LEN];
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ if (same_thread_group(current, p))
+ set_task_comm(p, buffer);
+ else
+ count = -EINVAL;
+
+ put_task_struct(p);
+
+ return count;
+}
+
+
+static int comm_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct task_struct *p;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ task_lock(p);
+ seq_printf(m, "%s\n", p->comm);
+ task_unlock(p);
+
+ put_task_struct(p);
+
+ return 0;
+}
+
+static int comm_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = single_open(filp, comm_show, NULL);
+ if (!ret) {
+ struct seq_file *m = filp->private_data;
+
+ m->private = inode;
+ }
+ return ret;
+}
+
+
+static const struct file_operations proc_pid_set_comm_operations = {
+ .open = comm_open,
+ .read = seq_read,
+ .write = comm_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+
/*
* We added or removed a vma mapping the executable. The vmas are only mapped
* during exec and are not mapped with the mmap system call.
@@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
@@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
john stultz <[email protected]> writes:
> - strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> +
> + /*
> + * Threads may access current->comm without holding
> + * the task lock, so write the string carefully
> + * to avoid non-terminating reads. Readers without a lock
> + * will get the oldname, the newname or an empty string.
> + */
> + tsk->comm[0] = 0;
> + wmb();
> + strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
> + wmb();
> + tsk->comm[0] = buf[0];
Is this really safe?
reader writer
read comm[0]
set comm[0] to 0
overwrites comm[1]
read comm[1]
read comm[2]
writes comm[2] to 0
read comm[3]
...
goes beyond the end
Better way probably is to replace tsk->comm with a pointer
and exchange that using xchg. Drawback: 4-8 bytes more per task.
Or perhaps make comm one byte longer and make sure the last
byte is always 0, but the drawback is that a reader can
read random (but at least safe) junk then.
-Andi
--
[email protected] -- Speaking for myself only.
On Sat, 07 Nov 2009 23:52:20 +0100
Andi Kleen <[email protected]> wrote:
> john stultz <[email protected]> writes:
> > - strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> > +
> > + /*
> > + * Threads may access current->comm without holding
> > + * the task lock, so write the string carefully
> > + * to avoid non-terminating reads. Readers without a lock
> > + * will get the oldname, the newname or an empty string.
> > + */
> > + tsk->comm[0] = 0;
> > + wmb();
> > + strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
> > + wmb();
> > + tsk->comm[0] = buf[0];
>
> Is this really safe?
>
> reader writer
>
>
> read comm[0]
> set comm[0] to 0
> overwrites comm[1]
> read comm[1]
> read comm[2]
> writes comm[2] to 0
> read comm[3]
>
> ...
> goes beyond the end
>
> Better way probably is to replace tsk->comm with a pointer
> and exchange that using xchg. Drawback: 4-8 bytes more per task.
>
> Or perhaps make comm one byte longer and make sure the last
> byte is always 0, but the drawback is that a reader can
> read random (but at least safe) junk then.
another option is to memset the whole thing to 0's.
might sound like overkill, but we're talking 16 bytes here... cheap
enough to do for such a rare case.
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
On Sat, 2009-11-07 at 23:52 +0100, Andi Kleen wrote:
> john stultz <[email protected]> writes:
> > - strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> > +
> > + /*
> > + * Threads may access current->comm without holding
> > + * the task lock, so write the string carefully
> > + * to avoid non-terminating reads. Readers without a lock
> > + * will get the oldname, the newname or an empty string.
> > + */
> > + tsk->comm[0] = 0;
> > + wmb();
> > + strlcpy(tsk->comm+1, buf+1, sizeof(tsk->comm)-1);
> > + wmb();
> > + tsk->comm[0] = buf[0];
>
> Is this really safe?
>
> reader writer
>
>
> read comm[0]
> set comm[0] to 0
> overwrites comm[1]
> read comm[1]
> read comm[2]
> writes comm[2] to 0
> read comm[3]
>
> ...
> goes beyond the end
Ah. Thanks for catching that.
Here's a reworked patch using Arjan's suggestion of memsetting the whole
string. Does this look ok to you?
thanks
-john
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
However, there may be situations where it would be advantageous for a
thread dispatcher to be naming the threads its managing, rather then
having the threads self-describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via proc/pid/comm and
proc/pid/task/tid/comm interfaces, and allows thread siblings to write
to these values.
Signed-off-by: John Stultz <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index d49be6b..90003f8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,6 +926,15 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
void set_task_comm(struct task_struct *tsk, char *buf)
{
task_lock(tsk);
+
+ /*
+ * Threads may access current->comm without holding
+ * the task lock, so write the string carefully.
+ * Readers without a lock may see incomplete new
+ * names but are safe from non-terminating string reads.
+ */
+ memset(tsk->comm, 0, TASK_COMM_LEN);
+ wmb();
strlcpy(tsk->comm, buf, sizeof(tsk->comm));
task_unlock(tsk);
perf_event_comm(tsk);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..7f59af1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
#endif
+
+
+static ssize_t
+comm_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ struct task_struct *p;
+ char buffer[TASK_COMM_LEN];
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ if (same_thread_group(current, p))
+ set_task_comm(p, buffer);
+ else
+ count = -EINVAL;
+
+ put_task_struct(p);
+
+ return count;
+}
+
+
+static int comm_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct task_struct *p;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ task_lock(p);
+ seq_printf(m, "%s\n", p->comm);
+ task_unlock(p);
+
+ put_task_struct(p);
+
+ return 0;
+}
+
+static int comm_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = single_open(filp, comm_show, NULL);
+ if (!ret) {
+ struct seq_file *m = filp->private_data;
+
+ m->private = inode;
+ }
+ return ret;
+}
+
+
+static const struct file_operations proc_pid_set_comm_operations = {
+ .open = comm_open,
+ .read = seq_read,
+ .write = comm_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+
/*
* We added or removed a vma mapping the executable. The vmas are only mapped
* during exec and are not mapped with the mmap system call.
@@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
@@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
Hey Andrew,
So far I've not had any objections to this version of the patch, so I
wanted to resend this for consideration for inclusion in -mm.
Thanks to Andi and Arjan for catching and suggesting a fix to the safe
lockless string write issue.
thanks
-john
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
However, there may be situations where it would be advantageous for a
thread dispatcher to be naming the threads its managing, rather then
having the threads self-describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via proc/pid/comm and
proc/pid/task/tid/comm interfaces, and allows thread siblings to write
to these values.
Signed-off-by: John Stultz <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index d49be6b..90003f8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,6 +926,15 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
void set_task_comm(struct task_struct *tsk, char *buf)
{
task_lock(tsk);
+
+ /*
+ * Threads may access current->comm without holding
+ * the task lock, so write the string carefully.
+ * Readers without a lock may see incomplete new
+ * names but are safe from non-terminating string reads.
+ */
+ memset(tsk->comm, 0, TASK_COMM_LEN);
+ wmb();
strlcpy(tsk->comm, buf, sizeof(tsk->comm));
task_unlock(tsk);
perf_event_comm(tsk);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..7f59af1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
#endif
+
+
+static ssize_t
+comm_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ struct task_struct *p;
+ char buffer[TASK_COMM_LEN];
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ if (same_thread_group(current, p))
+ set_task_comm(p, buffer);
+ else
+ count = -EINVAL;
+
+ put_task_struct(p);
+
+ return count;
+}
+
+
+static int comm_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct task_struct *p;
+
+ p = get_proc_task(inode);
+ if (!p)
+ return -ESRCH;
+
+ task_lock(p);
+ seq_printf(m, "%s\n", p->comm);
+ task_unlock(p);
+
+ put_task_struct(p);
+
+ return 0;
+}
+
+static int comm_open(struct inode *inode, struct file *filp)
+{
+ int ret;
+
+ ret = single_open(filp, comm_show, NULL);
+ if (!ret) {
+ struct seq_file *m = filp->private_data;
+
+ m->private = inode;
+ }
+ return ret;
+}
+
+
+static const struct file_operations proc_pid_set_comm_operations = {
+ .open = comm_open,
+ .read = seq_read,
+ .write = comm_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+
/*
* We added or removed a vma mapping the executable. The vmas are only mapped
* during exec and are not mapped with the mmap system call.
@@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
@@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
+ REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
On Mon, 16 Nov 2009 13:11:07 -0800
john stultz <[email protected]> wrote:
> Setting a thread's comm to be something unique is a very useful ability
> and is helpful for debugging complicated threaded applications. However
> currently the only way to set a thread name is for the thread to name
> itself via the PR_SET_NAME prctl.
>
> However, there may be situations where it would be advantageous for a
> thread dispatcher to be naming the threads its managing, rather then
> having the threads self-describe themselves. This sort of behavior is
> available on other systems via the pthread_setname_np() interface.
>
> This patch exports a task's comm via proc/pid/comm and
> proc/pid/task/tid/comm interfaces, and allows thread siblings to write
> to these values.
>
Would be nice to document the new userspace interface.
Documentation/filesystems/proc.txt, perhaps.
>
> diff --git a/fs/exec.c b/fs/exec.c
> index d49be6b..90003f8 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -926,6 +926,15 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
> void set_task_comm(struct task_struct *tsk, char *buf)
> {
> task_lock(tsk);
> +
> + /*
> + * Threads may access current->comm without holding
> + * the task lock, so write the string carefully.
> + * Readers without a lock may see incomplete new
> + * names but are safe from non-terminating string reads.
> + */
> + memset(tsk->comm, 0, TASK_COMM_LEN);
> + wmb();
OK.
> strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> task_unlock(tsk);
> perf_event_comm(tsk);
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 837469a..7f59af1 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
>
> #endif
>
> +
> +
> +static ssize_t
> +comm_write(struct file *file, const char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct inode *inode = file->f_path.dentry->d_inode;
> + struct task_struct *p;
> + char buffer[TASK_COMM_LEN];
> +
> + memset(buffer, 0, sizeof(buffer));
> + if (count > sizeof(buffer) - 1)
> + count = sizeof(buffer) - 1;
Is this the best policy? If userspace tries to apply a too-long name
to a thread, the kernel will silently truncate (ie: corrupt) it? I'd
have thought that returning an error would be more robust?
> + if (copy_from_user(buffer, buf, count))
> + return -EFAULT;
> +
> + p = get_proc_task(inode);
> + if (!p)
> + return -ESRCH;
> +
> + if (same_thread_group(current, p))
> + set_task_comm(p, buffer);
> + else
> + count = -EINVAL;
> +
> + put_task_struct(p);
> +
> + return count;
> +}
Is same_thread_group() sufficient? Are any security/permission-related
checks appropriate here, for example?
The restriction to a separate thread group seems a bit arbitrary,
really. There's no reason I can see why we cannot permit unrelated
(but suitably authorised) processes to do this.
This patch makes task->comm inconsistent with /prod/pid/cmdline. What
are the implications of that for userspace? None, I guess, given that
this can already be done.
> +
> +static int comm_show(struct seq_file *m, void *v)
> +{
> + struct inode *inode = m->private;
> + struct task_struct *p;
> +
> + p = get_proc_task(inode);
> + if (!p)
> + return -ESRCH;
> +
> + task_lock(p);
> + seq_printf(m, "%s\n", p->comm);
> + task_unlock(p);
> +
> + put_task_struct(p);
> +
> + return 0;
> +}
> +
> +static int comm_open(struct inode *inode, struct file *filp)
> +{
> + int ret;
> +
> + ret = single_open(filp, comm_show, NULL);
> + if (!ret) {
> + struct seq_file *m = filp->private_data;
> +
> + m->private = inode;
> + }
> + return ret;
> +}
> +
> +
The patch has a seemingly-random inexplicable mixture of \n and \n\n.
> +static const struct file_operations proc_pid_set_comm_operations = {
> + .open = comm_open,
> + .read = seq_read,
> + .write = comm_write,
> + .llseek = seq_lseek,
> + .release = single_release,
> +};
> +
> +
> /*
> * We added or removed a vma mapping the executable. The vmas are only mapped
> * during exec and are not mapped with the mmap system call.
> @@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> #ifdef CONFIG_SCHED_DEBUG
> REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
> #endif
> + REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
> #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> INF("syscall", S_IRUSR, proc_pid_syscall),
> #endif
> @@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
> #ifdef CONFIG_SCHED_DEBUG
> REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
> #endif
> + REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
> #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> INF("syscall", S_IRUSR, proc_pid_syscall),
> #endif
(cc to linux-security-module and James)
> On Mon, 16 Nov 2009 13:11:07 -0800
> john stultz <[email protected]> wrote:
>
> > Setting a thread's comm to be something unique is a very useful ability
> > and is helpful for debugging complicated threaded applications. However
> > currently the only way to set a thread name is for the thread to name
> > itself via the PR_SET_NAME prctl.
> >
> > However, there may be situations where it would be advantageous for a
> > thread dispatcher to be naming the threads its managing, rather then
> > having the threads self-describe themselves. This sort of behavior is
> > available on other systems via the pthread_setname_np() interface.
> >
> > This patch exports a task's comm via proc/pid/comm and
> > proc/pid/task/tid/comm interfaces, and allows thread siblings to write
> > to these values.
> >
>
> Would be nice to document the new userspace interface.
> Documentation/filesystems/proc.txt, perhaps.
>
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index d49be6b..90003f8 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -926,6 +926,15 @@ char *get_task_comm(char *buf, struct task_struct *tsk)
> > void set_task_comm(struct task_struct *tsk, char *buf)
> > {
> > task_lock(tsk);
> > +
> > + /*
> > + * Threads may access current->comm without holding
> > + * the task lock, so write the string carefully.
> > + * Readers without a lock may see incomplete new
> > + * names but are safe from non-terminating string reads.
> > + */
> > + memset(tsk->comm, 0, TASK_COMM_LEN);
> > + wmb();
>
> OK.
Hmm, I don't like mix TASK_COMM_LEN and sizeof(tsk->comm).
John, Is there any reason?
>
> > strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> > task_unlock(tsk);
> > perf_event_comm(tsk);
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 837469a..7f59af1 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1265,6 +1265,78 @@ static const struct file_operations proc_pid_sched_operations = {
> >
> > #endif
> >
> > +
> > +
> > +static ssize_t
> > +comm_write(struct file *file, const char __user *buf,
> > + size_t count, loff_t *offset)
> > +{
> > + struct inode *inode = file->f_path.dentry->d_inode;
> > + struct task_struct *p;
> > + char buffer[TASK_COMM_LEN];
> > +
> > + memset(buffer, 0, sizeof(buffer));
> > + if (count > sizeof(buffer) - 1)
> > + count = sizeof(buffer) - 1;
>
> Is this the best policy? If userspace tries to apply a too-long name
> to a thread, the kernel will silently truncate (ie: corrupt) it? I'd
> have thought that returning an error would be more robust?
>
> > + if (copy_from_user(buffer, buf, count))
> > + return -EFAULT;
> > +
> > + p = get_proc_task(inode);
> > + if (!p)
> > + return -ESRCH;
> > +
> > + if (same_thread_group(current, p))
> > + set_task_comm(p, buffer);
> > + else
> > + count = -EINVAL;
> > +
> > + put_task_struct(p);
> > +
> > + return count;
> > +}
>
> Is same_thread_group() sufficient? Are any security/permission-related
> checks appropriate here, for example?
>
> The restriction to a separate thread group seems a bit arbitrary,
> really. There's no reason I can see why we cannot permit unrelated
> (but suitably authorised) processes to do this.
At least, currently /proc/pid/cmdline read the process stack. A stack
can be rewritten without any security check by the same group thread.
I guess we can't make consist check of task name change.
Plus, now we don't have any LSM hook of task name change nor security
capability. I guess all security module don't need task name.
I hope security folks correct me if I misunderstood.
> This patch makes task->comm inconsistent with /prod/pid/cmdline. What
> are the implications of that for userspace? None, I guess, given that
> this can already be done.
ditto.
Process stack isn't guarded now. we can't make reasonable protection.
- kosaki
>
> > +
> > +static int comm_show(struct seq_file *m, void *v)
> > +{
> > + struct inode *inode = m->private;
> > + struct task_struct *p;
> > +
> > + p = get_proc_task(inode);
> > + if (!p)
> > + return -ESRCH;
> > +
> > + task_lock(p);
> > + seq_printf(m, "%s\n", p->comm);
> > + task_unlock(p);
> > +
> > + put_task_struct(p);
> > +
> > + return 0;
> > +}
> > +
> > +static int comm_open(struct inode *inode, struct file *filp)
> > +{
> > + int ret;
> > +
> > + ret = single_open(filp, comm_show, NULL);
> > + if (!ret) {
> > + struct seq_file *m = filp->private_data;
> > +
> > + m->private = inode;
> > + }
> > + return ret;
> > +}
> > +
> > +
>
> The patch has a seemingly-random inexplicable mixture of \n and \n\n.
>
> > +static const struct file_operations proc_pid_set_comm_operations = {
> > + .open = comm_open,
> > + .read = seq_read,
> > + .write = comm_write,
> > + .llseek = seq_lseek,
> > + .release = single_release,
> > +};
> > +
> > +
> > /*
> > * We added or removed a vma mapping the executable. The vmas are only mapped
> > * during exec and are not mapped with the mmap system call.
> > @@ -2504,6 +2576,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> > #ifdef CONFIG_SCHED_DEBUG
> > REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
> > #endif
> > + REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
> > #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> > INF("syscall", S_IRUSR, proc_pid_syscall),
> > #endif
> > @@ -2839,6 +2912,7 @@ static const struct pid_entry tid_base_stuff[] = {
> > #ifdef CONFIG_SCHED_DEBUG
> > REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
> > #endif
> > + REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
> > #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> > INF("syscall", S_IRUSR, proc_pid_syscall),
> > #endif
>
On Wed, 2009-11-18 at 13:54 -0800, Andrew Morton wrote:
> On Mon, 16 Nov 2009 13:11:07 -0800
> john stultz <[email protected]> wrote:
>
> > Setting a thread's comm to be something unique is a very useful ability
> > and is helpful for debugging complicated threaded applications. However
> > currently the only way to set a thread name is for the thread to name
> > itself via the PR_SET_NAME prctl.
> >
> > However, there may be situations where it would be advantageous for a
> > thread dispatcher to be naming the threads its managing, rather then
> > having the threads self-describe themselves. This sort of behavior is
> > available on other systems via the pthread_setname_np() interface.
> >
> > This patch exports a task's comm via proc/pid/comm and
> > proc/pid/task/tid/comm interfaces, and allows thread siblings to write
> > to these values.
> >
>
> Would be nice to document the new userspace interface.
> Documentation/filesystems/proc.txt, perhaps.
Initial swipe provided below. I'm not much of a writer, so let me know
if you'd like to see more explanation.
> > +
> > +
> > +static ssize_t
> > +comm_write(struct file *file, const char __user *buf,
> > + size_t count, loff_t *offset)
> > +{
> > + struct inode *inode = file->f_path.dentry->d_inode;
> > + struct task_struct *p;
> > + char buffer[TASK_COMM_LEN];
> > +
> > + memset(buffer, 0, sizeof(buffer));
> > + if (count > sizeof(buffer) - 1)
> > + count = sizeof(buffer) - 1;
>
> Is this the best policy? If userspace tries to apply a too-long name
> to a thread, the kernel will silently truncate (ie: corrupt) it? I'd
> have thought that returning an error would be more robust?
The comm is usually a truncated version of cmdline. If you look at most
long-named gnome apps, their comms are cut off, so I think the
truncation is somewhat expected.
For the implementation above, I followed how the PR_SET_NAME prctrl
behaves, where the same argument would apply.
Personally, I'm not picky, and would ok doing it either way if you or
anyone else feels strongly.
> > + if (copy_from_user(buffer, buf, count))
> > + return -EFAULT;
> > +
> > + p = get_proc_task(inode);
> > + if (!p)
> > + return -ESRCH;
> > +
> > + if (same_thread_group(current, p))
> > + set_task_comm(p, buffer);
> > + else
> > + count = -EINVAL;
> > +
> > + put_task_struct(p);
> > +
> > + return count;
> > +}
>
> Is same_thread_group() sufficient? Are any security/permission-related
> checks appropriate here, for example?
>
> The restriction to a separate thread group seems a bit arbitrary,
> really. There's no reason I can see why we cannot permit unrelated
> (but suitably authorised) processes to do this.
To me it seemed the most sane safe behavior. Applications may not expect
to have their comms changed under them, so excluding it to sibling
threads ensures any changes are expected internally by the application.
> This patch makes task->comm inconsistent with /prod/pid/cmdline. What
> are the implications of that for userspace? None, I guess, given that
> this can already be done.
Agreed. PR_SET_NAME would be the precedent.
> > +static int comm_open(struct inode *inode, struct file *filp)
> > +{
> > + int ret;
> > +
> > + ret = single_open(filp, comm_show, NULL);
> > + if (!ret) {
> > + struct seq_file *m = filp->private_data;
> > +
> > + m->private = inode;
> > + }
> > + return ret;
> > +}
> > +
> > +
>
> The patch has a seemingly-random inexplicable mixture of \n and \n\n.
Sorry about that, I'll have to fix the programming robot. ;)
thanks
-john
Add some documentation about /proc/<pid>/task/<tid>/comm interface.
Signed-off-by: John Stultz <[email protected]>
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2c48f94..4e5470d 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -38,6 +38,7 @@ Table of Contents
3.3 /proc/<pid>/io - Display the IO accounting fields
3.4 /proc/<pid>/coredump_filter - Core dump filtering settings
3.5 /proc/<pid>/mountinfo - Information about mounts
+ 3.6 /proc/<pid>/comm & /proc/<pid>/task/<tid>/comm
------------------------------------------------------------------------------
@@ -1408,3 +1409,11 @@ For more information on mount propagation see:
Documentation/filesystems/sharedsubtree.txt
+
+3.6 /proc/<pid>/comm & /proc/<pid>/task/<tid>/comm
+--------------------------------------------------------
+These files provide a method to access a tasks comm value. It also allows for
+a task to set its own or one of its thread siblings comm value. The comm value
+is limited in size compared to the cmdline value, so writing anything longer
+then the kernel's TASK_COMM_LEN (currently 16 chars) will result in a truncated
+comm value.