2005-05-19 18:02:33

by Olivier Croquette

Subject: Thread and process identifiers (CPU affinity, kill)


It seems that the thread ids in Linux are unique within the complete
operating system, and not only within their corresponding processes.
This is explicitly allowed by the POSIX standard (it also states that
applications shall not rely on it).

Apparently some system calls which normally require a process id also
work with thread ids.

- a system call requiring a PID can have the same effect if a thread id
of the same process was given.
Example: kill(tid,SIGTERM) will kill the entire process the thread
belongs to. I think that this is not POSIX compliant. It shall trigger
ESRCH!

Sometimes, the system call has another effect, potentially providing
additional functionality.
Example: sched_setaffinity(). The man page and the prototype (which
requires a pid_t) both show that a process id is required. Nothing
indicates that it works with threads, and AFAIK there is no documented
way to set affinity for a specific thread.
But if you give a TID to sched_setaffinity, it will put the *thread* on
the given cpu set.
If you give a PID to sched_setaffinity, it will put the *main thread* on
the given cpu set. The other threads won't be impacted.
Even if sched_setaffinity() is not standardized, I find it confusing that
it takes a pid_t yet affects only a single thread!


Some open questions:

- is it a guaranteed behaviour within Linux that thread ids and process
ids do not overlap? is it documented anywhere?

- is there a real confusion at API level within Linux between threads
and processes or are kill() and sched_setaffinity() isolated examples?
or bugs?

- is Linux kill() POSIX compliant in this regard?

- do we want to limit the sched_setaffinity() functionality to
correspond to its documentation, or do we want to update the
documentation so that it covers all the functionality?


Regards


Olivier


2005-05-19 18:23:15

by Lennart Sorensen

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On Thu, May 19, 2005 at 08:00:56PM +0200, Olivier Croquette wrote:
> It seems that the thread ids in Linux are unique within the complete
> operating system, and not only within their corresponding processes.
> This is explicitly allowed by the POSIX standard (it also states that
> applications shall not rely on it).
>
> Apparently some system calls which normally require a process id also
> work with thread ids.

They work with numbers, this is C, there is no real serious type checking.
Don't rely on it and just pretend it isn't that way.

> - a system call requiring a PID can have the same effect if a thread id
> of the same process was given.
> Example: kill(tid,SIGTERM) will kill the entire process the thread
> belongs to. I think that this is not POSIX compliant. It shall trigger
> ESRCH!

How should kill know if you are sending a threadid or processid? If the
integer matches a running pid then it should kill that one. If there is
no such process id it should return ESRCH. Just because every threadid
is a processid too doesn't mean it is broken, and you aren't supposed to
pass threadids to kill anyhow.

> Sometimes, the system call has another effect, potentially providing
> additional functionality.
> Example: sched_setaffinity(). The man page and the prototype (which
> requires a pid_t) both show that a process id is required. Nothing
> indicates that it works with threads, and AFAIK there is no documented
> way to set affinity for a specific thread.
> But if you give a TID to sched_setaffinity, it will put the *thread* on
> the given cpu set.
> If you give a PID to sched_setaffinity, it will put the *main thread* on
> the given cpu set. The other threads won't be impacted.
> Even if sched_setaffinity() is not standardized, I find it confusing that
> it takes a pid_t yet affects only a single thread!
>
>
> Some open questions:
>
> - is it a guaranteed behaviour within Linux that thread ids and process
> ids do not overlap? is it documented anywhere?
>
> - is there a real confusion at API level within Linux between threads
> and processes or are kill() and sched_setaffinity() isolated examples?
> or bugs?

Just pretend they are different things even if currently they are
implemented so threads are processes and your code should be safe.

> - is Linux kill() POSIX compliant in this regard?

Does POSIX say that a process can't be allocated multiple PIDs? What
should kill do when sent to any one of the PIDs belonging to a process?
It should probably do the same thing as if it was sent to the PID of the
main thread (whatever main thread means in a threaded program). I would
think it is POSIX compliant even if it isn't the common way to represent
threads on POSIX compliant systems.

> - do we want to limit the sched_setaffinity() functionality to
> correspond to its documentation, or do we want to update the
> documentation so that it covers all the functionality?

I believe Linux currently implements threads as separate processes (at
least top and ps see them that way). Of course I would never recommend
assuming things will always work that way, since after all someone is
perfectly allowed to implement threading in user space with a POSIX
thread compliant interface and link a program against their own thread
library which doesn't use the kernel to manage the threads (using
linuxthreads). A safe programmer makes no assumptions about anything if
it isn't documented in the specs to work a specific way. If it is
stated as undefined, expect the behaviour to potentially change.

Now given Linux runs threads as separate processes, it makes sense that
thread ids and process ids are the same thing and hence currently
unique, and that kill would work on any thread's pid within a given
process. I am not sure how a signal sent to a thread's pid is handled,
although I would think the kernel probably knows it is a thread and
passes the signal to the parent process instead.

Doesn't sched_setaffinity do what it says it will? Since each thread is
treated as a process then sched_setaffinity should work on it I would
think since it is a process after all as far as the scheduler is
concerned.

Len Sorensen

2005-05-19 19:47:13

by Chris Friesen

Subject: Re: Thread and process identifiers (CPU affinity, kill)

Lennart Sorensen wrote:
> On Thu, May 19, 2005 at 08:00:56PM +0200, Olivier Croquette wrote:

>>- a system call requiring a PID can have the same effect if a thread id
>>of the same process was given.
>>Example: kill(tid,SIGTERM) will kill the entire process the thread
>>belongs to. I think that this is not POSIX compliant. It shall trigger
>>ESRCH!
>
>
> How should kill know if you are sending a threadid or processid?

Doesn't matter. From a userspace point of view there is no process with
that PID, so kill() should return ESRCH. In the kernel, I think this
means that kill() should actually be looking up tgids rather than pids.

>>- is Linux kill() POSIX compliant in this regard?

> Does posix say that a process can't be allocated multiple PIDs?

PID="process ID"

You have one PID per process.

>>- do we want to limit the sched_setaffinity() functionality to
>>correspond to its documentation, or do we want to update the
>>documentation so that it covers all the functionality?

> I believe Linux currently implements threads as separate processes

No, they are implemented as separately schedulable entities with lots of
shared state. "process" and "thread" are POSIX terms that don't really
mean anything in the kernel.

> Now given Linux runs threads as separate processes, it makes sense that
> thread ids and process ids are the same thing and hence currently
> unique, and that kill would work on any thread's pid within a given
> process.

Pthreads define signal handling. Signals are delivered to the process
as a whole, not to any particular thread. If you specify a TID that is
not a valid PID, then the kernel should return an error.

> Doesn't sched_setaffinity do what it says it will? Since each thread is
> treated as a process then sched_setaffinity should work on it I would
> think since it is a process after all as far as the scheduler is
> concerned.

If the syscall is supposed to operate on processes, it should operate on
all threads within a process. It would be nice to have a way to specify
affinity for threads. POSIX doesn't define one though.

Chris

2005-05-20 12:55:22

by Lennart Sorensen

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On Thu, May 19, 2005 at 01:46:20PM -0600, Chris Friesen wrote:
> Doesn't matter. From a userspace point of view there is no process with
> that PID, so kill() should return ESRCH. In the kernel, I think this
> means that kill() should actually be looking up tgids rather than pids.

If you look in the list of processes running, you WILL see that PID in
the list. ESRCH should only be returned if you ask for a thread that
either never existed or doesn't exist anymore. Since a thread is a
process to the kernel (at least as far as scheduling and PIDs are
concerned) you can send a kill to the thread, which will probably be
sent to the parent process id instead.

> PID="process ID"
>
> You have one PID per process.

No, you have at least one PID per process. I have never heard anyone
claim before that implementing threads as extra processes in the kernel
is violating posix. It sure makes the scheduler simpler to implement.
Much more efficient than user space threading.

> No, they are implemented as separately schedulable entities with lots of
> shared state. "process" and "thread" are POSIX terms that don't really
> mean anything in the kernel.

Certainly process and thread do have meanings in the kernel.

> Pthreads define signal handling. Signals are delivered to the process
> as a whole, not to any particular thread. If you specify a TID that is
> not a valid PID, then the kernel should return an error.

Well, as long as the kernel delivers a signal sent to any of the PIDs of
a multithreaded process to that process, that seems fine to me.

> If the syscall is supposed to operate on processes, it should operate on
> all threads within a process. It would be nice to have a way to specify
> affinity for threads. POSIX doesn't define one though.

Hmm, well I guess the current way it works you can set the affinity per
thread since you had a PID per thread to operate on. If you want to do
it for the whole process, perhaps if you set it on the starting thread
before it creates more threads they would probably inherit the affinity
of the original thread.

Have you tried NPTL (native posix threading library) which is supposed
to become the threading standard on linux in the future (if it works
out)? I was under the impression amd64 systems with 2.6 kernels at
least tend to use that by default, but I might be remembering something
else unrelated. I wonder if NPTL doesn't do more the way you want than
the way linuxthreads have worked so far.

Len Sorensen

2005-05-20 15:00:48

by Olivier Croquette

Subject: Re: Thread and process identifiers (CPU affinity, kill)

Lennart Sorensen wrote:

> I believe Linux currently implements threads as separate processes (at
> least top and ps see them that way).

> Have you tried NPTL (native posix threading library) which is supposed
> to become the threading standard on linux in the future (if it works
> out)?

Lennart,

From the beginning we are talking about present GNU/Linux systems,
which already use NPTL by default. NPTL is no future standard, it is
the present standard.

This means basically that 50% of your assertions (like the above) are
wrong, and your conclusions "suffer" from that :)

The point is that if you run ps on a decent Linux based system, you
will *NOT* see one process for each thread. Nor do they appear in /proc.

This means they are *NOT* userland processes.

And therefore, you shall *NOT* be able to reference them as such where a
process ID is required.

2005-05-20 16:53:14

by Lennart Sorensen

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On Fri, May 20, 2005 at 04:51:10PM +0200, Olivier Croquette wrote:
> From the beginning we are talking about present GNU/Linux systems,
> which already use NPTL by default. NPTL is no future standard, it is
> the present standard.

Hmm, I just noticed the page I found about NPTL had a 2005 date in one
place and 2002 in another. Yeah that is what people are using already.

Most current i386 systems do NOT use NPTL yet by default since it only
works on 2.6 kernels, and even then probably only if glibc was compiled
that way.

> This means basically that 50% of your assertions (like the above) are
> wrong, and your conclusions "suffer" from that :)
>
> The point is that if you run ps on a decent Linux based system, you
> will *NOT* see one process for each thread. Nor do they appear in /proc.
>
> This means they are *NOT* userland processes.
>
> And therefore, you shall *NOT* be able to reference them as such where a
> process ID is required.

Hmm, I just ran a pthread program and every thread shows up as a process
in ps. Of course that is on a system with a 2.4 kernel compatible
glibc, so that is probably not a valid test. Running on 2.6.11 on an
amd64 with glibc compiled for 2.6 kernels only, I only see one pid for
the multithreaded program.

Doing kill on the threadid on the amd64 does return ESRCH.

Make sure your test is on a 2.6 kernel system with glibc compiled to
only use nptl (so not 2.4 kernel compatible at all). It appears to work
as you want it to now.

Len Sorensen

2005-05-20 18:13:50

by Miquel van Smoorenburg

Subject: Re: Thread and process identifiers (CPU affinity, kill)

Lennart Sorensen wrote:
>On Fri, May 20, 2005 at 04:51:10PM +0200, Olivier Croquette wrote:
>> From the beginning we are talking about present GNU/Linux systems,
>> which already use NPTL by default. NPTL is no future standard, it is
>> the present standard.
>
>Hmm, I just noticed the page I found about NPTL had a 2005 date in one
>place and 2002 in another. Yeah that is what people are using already.
>
>Most current i386 systems do NOT use NPTL yet by default since it only
>works on 2.6 kernels, and even then probably only if glibc was compiled
>that way.

No. On modern systems, glibc contains both LinuxThreads and NPTL.
They have the same ABI. At runtime one of the two is selected,
depending on whether the currently running kernel supports NPTL.
You can also force it by setting the LD_ASSUME_KERNEL environment
variable to 2.4 or 2.6.

Mike.

2005-05-20 20:13:00

by Lennart Sorensen

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On Fri, May 20, 2005 at 06:13:48PM +0000, Miquel van Smoorenburg wrote:
> No. On modern systems, glibc contains both LinuxThreads and NPTL.
> They have the same ABI. At runtime one of the two is selected,
> depending on whether the currently running kernel supports NPTL.
> You can also force it by setting the LD_ASSUME_KERNEL environment
> variable to 2.4 or 2.6.

Well so far my tests show that glibc 2.3.2.ds1-21 on Debian Sarge when
running 2.6.11 kernel on i386 uses LinuxThreads, while on amd64 version
of Sarge it uses NPTL (and won't run with 2.4 kernel at all either).

Maybe Debian compiled their glibc to not do NPTL on i386 yet. Not sure.

Hmm, after checking, it turns out if you declare errno yourself in your
program, it drops to linuxthreads, while using #include <errno.h> makes
it able to use NPTL when using a 2.6 kernel. Now my program works the
same on i386 as on amd64 (I had to fix the errno to make it run on amd64
so that does make some sense). Well I learned something new.

Len Sorensen

2005-05-20 20:19:39

by Olivier Croquette

Subject: Re: Thread and process identifiers (CPU affinity, kill)

Miquel van Smoorenburg wrote:

> No. On modern systems, glibc contains both LinuxThreads and NPTL.
> They have the same ABI. At runtime one of the two is selected,
> depending on whether the currently running kernel supports NPTL.
> You can also force it by setting the LD_ASSUME_KERNEL environment
> variable to 2.4 or 2.6.

More info about that from:
http://linuxdevices.com/articles/AT6753699732.html

Some Linux vendors, such as later versions of Red Hat Linux, have
backported NPTL to earlier kernels and have even made the threading
environment for specific processes selectable through an environment
variable (LD_ASSUME_KERNEL). On systems that support this feature, the
variable is set via a command such as the following:
# export LD_ASSUME_KERNEL=2.4.1
This is a clever way to enable some existing applications that rely on
LinuxThreads to continue to work in an NPTL environment, but is a
short-term solution. To make the most of the design and performance
benefits provided by NPTL, you should update the code for any existing
applications that use threading.


[...]

Simply using a 2.6 based kernel does not mean that you are automatically
using the NPTL. To determine the threading library that a system uses,
you can execute the getconf command (part of the glibc package), to
examine the GNU_LIBPTHREAD_VERSION configuration variable, as in the
following command example:
# getconf GNU_LIBPTHREAD_VERSION
linuxthreads-0.10
If your system uses the NPTL, the command would return the value of NPTL
that your system was using, as in the following example:
# getconf GNU_LIBPTHREAD_VERSION
nptl-0.60

2005-05-20 20:39:06

by Lee Revell

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On Fri, 2005-05-20 at 22:17 +0200, Olivier Croquette wrote:
> # export LD_ASSUME_KERNEL=2.4.1
> This is a clever way to enable some existing applications that rely on
> LinuxThreads to continue to work in an NPTL environment, but is a
> short-term solution. To make the most of the design and performance
> benefits provided by NPTL, you should update the code for any existing
> applications that use threading.

Applications that rely on linuxthreads, heh, that's a good one.

The most common use of LD_ASSUME_KERNEL is to force Linuxthreads to be
used in order to work around a bad bug in NPTL 0.60, often present on
Debian systems. Ubuntu still reports NPTL 0.60, but they at least fixed
the bug for the Hoary release. The Debian people refuse to.

The issue is very well known to JACK users.

Lee

2005-05-23 12:57:10

by Nix

Subject: Re: Thread and process identifiers (CPU affinity, kill)

On 20 May 2005, Lennart Sorensen prattled cheerily:
> Maybe Debian compiled their glibc to not do NPTL on i386 yet. Not sure.

This is not the case. Proof from ps -FT output:

mysql 8473 8473 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8475 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8476 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8477 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8478 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8479 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8480 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8481 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld
mysql 8473 8482 8472 0 29110 14056 0 May22 pts/1 /usr/sbin/mysqld

> Hmm, after checking, it turns out if you declare errno yourself in your
> program, it drops to linuxthreads, while using #include <errno.h> makes
> it able to use NPTL when using a 2.6 kernel.

This is a distribution-specific patch. glibc as shipped by the FSF simply
refuses to run programs that reference the errno symbol: errno is no
longer an exported symbol at all. (This is reasonable, as such programs
would fail to work correctly in a multithreaded NPTL process in any case.)

The only valid way to gain access to the errno symbol is to
#include <errno.h>. This has been true for as long as glibc2 has existed.

--
`Once again, I must remark on the far-reaching extent of my
ladylike nature.' --- Rosie Taylor