2004-10-05 20:59:43

by Patryk Jakubowski

Subject: Invisible threads in 2.6.9

Hi.

I've been experimenting with process/thread accounting in 2.6.9-rc3 (and
2.6.8), and found this strange situation: if the leader thread of a
multi-threaded process terminates, the other threads become
undetectable. After the main thread becomes a zombie, opening
/proc/PID/task fails with ENOENT. If you happen to know a TID, you can
access /proc/TID/* directly, but otherwise there is no way to observe
the remaining threads, as far as I can see. Consider this program, for
example:

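#include <pthread.h>

/* Thread body: spin forever, burning CPU. */
void *run(void *arg)
{
    for (;;)
        ;
}

int main()
{
    pthread_t t;
    int i;

    /* Start 10 busy threads, then exit only the leader thread. */
    for (i = 0; i < 10; ++i)
        pthread_create(&t, NULL, run, NULL);
    pthread_exit(NULL);
}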

When I run it, the system (predictably) goes to ~100% CPU utilization,
but there seems to be no way to find out who is hogging the CPU with
top(1), ps(1), or anything else. All they can show is the main thread in
zombie state, consuming 0% CPU.
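
The ENOENT described above can also be checked from a program rather than from the shell. The following is only a minimal sketch, assuming a Linux system with procfs mounted at /proc; list_task_tids is a hypothetical helper name chosen for illustration, not something from procps or the kernel.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: collect the thread IDs listed in /proc/<pid>/task.
 * Stores at most max IDs in tids[] and returns the number of numeric
 * entries found (which may exceed max), or -1 with errno set by
 * opendir() if the directory cannot be opened. */
int list_task_tids(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    struct dirent *de;
    DIR *dir;
    int n = 0;

    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (!dir)
        return -1;              /* e.g. ENOENT, as reported in this thread */

    while ((de = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);

        if (end == de->d_name || *end != '\0')
            continue;           /* skip "." and ".." */
        if (n < max)
            tids[n] = (pid_t)tid;
        n++;
    }
    closedir(dir);
    return n;
}
```

Calling list_task_tids() on the PID of the test program before and after the leader exits should show the difference: a list of TIDs first, then -1 with errno set to ENOENT if the kernel behaves as described here.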

Is this correct behaviour for Linux?
Wouldn't this allow user-space programs to hide running threads?
This could perhaps be an opportunity for spyware to infect a machine and
hide itself. I hope I'm wrong here!

If this is a bug in the kernel (procfs?), I can send you my configuration
and the resulting behaviour.

Sorry for my bad English.



----------------------------------------------------------------------
The INTERIA.PL portal invites you... >>> http://link.interia.pl/f17cb


2004-10-06 03:08:25

by Nuno Silva

Subject: Re: Invisible threads in 2.6.9

Patryk Jakubowski wrote:

...

> When I run it, the system (predictably) goes to ~100% CPU utilization,
> but there seems to be no way to find out who is hogging the CPU with
> top(1), ps(1), or anything else. All they can show is the main thread in
> zombie state, consuming 0% CPU.
>
> Is this correct behaviour of linux?
> Would not this allow user space programs to hide running executions?
> This could be an opportunity for spyware to infect the machine and hide
> itself perhaps? Hope I'm wrong here!
>
> If this is the bug in kernel (procfs?) I can give you my configuration
> and resulting behaviour.

Yes, that's the new method trojans are using to hide tasks... No need to
install complicated kernel modules anymore :-)

More seriously: that's a problem with the current procps utils... They just
don't show them. I can't complain too much because I'm not writing any
code, but it would be nice to have a working top...

As a workaround, to at least see the threads without inspecting /proc
directly, you can use the 'm' and 'H' flags to ps, e.g.

$ ps auwxH

Regards,
Nuno Silva

2004-10-06 07:52:01

by Michal Schmidt

Subject: Re: Invisible threads in 2.6.9

Nuno Silva wrote:
> Yes, that's the new method trojans are using to hide tasks... No need to
> install complicated kernel modules anymore :-)
>
> More seriously: That's a problem with current procps utils... They just
> don't show them. I can't complain too much because I'm not doing any
> code, but it would be nice to have a working top...
>
> As a workaround, to at least see the threads without inspecting /proc
> directly, you can use the 'm' and 'H' flags to ps, i.e.
>
> $ ps auwxH
>

It doesn't work for me:

# ps aux | grep [t]hread
michich 7447 0.0 0.0 0 0 pts/1 Zl+ 09:43 0:00 [threadbug] <defunct>
# ps auwxH | grep [t]hread
#

And I can't inspect /proc directly:
# cd /proc/7447
# ls -l
ls: cannot read symbolic link cwd: No such file or directory
ls: cannot read symbolic link root: No such file or directory
ls: cannot read symbolic link exe: No such file or directory
total 0
dr-xr-xr-x 2 root root 0 Oct 6 09:48 attr
-r-------- 1 root root 0 Oct 6 09:48 auxv
-r--r--r-- 1 root root 0 Oct 6 09:44 cmdline
lrwxrwxrwx 1 root root 0 Oct 6 09:48 cwd
-r-------- 1 root root 0 Oct 6 09:48 environ
lrwxrwxrwx 1 root root 0 Oct 6 09:48 exe
dr-x------ 2 root root 0 Oct 6 09:44 fd
-r--r--r-- 1 root root 0 Oct 6 09:48 maps
-rw------- 1 root root 0 Oct 6 09:48 mem
-r--r--r-- 1 root root 0 Oct 6 09:48 mounts
lrwxrwxrwx 1 root root 0 Oct 6 09:48 root
-r--r--r-- 1 root root 0 Oct 6 09:44 stat
-r--r--r-- 1 root root 0 Oct 6 09:48 statm
-r--r--r-- 1 root root 0 Oct 6 09:44 status
dr-xr-xr-x 3 root root 0 Oct 6 09:45 task
-r--r--r-- 1 root root 0 Oct 6 09:48 wchan
# cd task
bash: cd: task: No such file or directory
# ls -l task
ls: task: No such file or directory
# ls -l | grep task
ls: cannot read symbolic link cwd: No such file or directory
ls: cannot read symbolic link root: No such file or directory
ls: cannot read symbolic link exe: No such file or directory
dr-xr-xr-x 3 root root 0 Oct 6 09:45 task

Isn't it strange?

Michal Schmidt

2004-10-06 10:15:10

by Patryk Jakubowski

Subject: Re: Invisible threads in 2.6.9

Michal Schmidt wrote:

>
> It doesn't work for me:
>
> # ps aux | grep [t]hread
> michich 7447 0.0 0.0 0 0 pts/1 Zl+ 09:43 0:00 [threadbug] <defunct>
> # ps auwxH | grep [t]hread
> #
>
> And I can't inspect /proc directly:
>
> Isn't it strange?
>
> Michal Schmidt
>

Yes, it is very strange to me. I don't think it is a bug in procps:
top and ps just scan the /proc tree for information. Mind you, I'm only
an amateur programmer, not a kernel programmer.

Say I run this example program (threadbug) and its PID is 6024. Its
first subthread could then be PID 6025, for example. The leader thread
exits, and /proc/6024/task becomes inaccessible; ls shows nothing for it.
/proc/6024/task/6025 should exist. However, cat /proc/6025/cmdline
returns ./threadbug, while ls -d /proc/602? does not show /proc/6025.
Anyway, I'm almost sure that /proc/6024/task/* should be visible. It is
either a bug in the procfs filesystem or a buggy strategy for showing
tasks in /proc.
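
Since /proc/<tid> can still be opened directly even when it is not listed, the hidden threads can in principle be found by brute force: probe a range of numeric IDs and compare the Tgid field of each status file. This is only a rough sketch under that assumption; read_tgid and probe_threads are hypothetical names used for illustration, and the ID range to scan is the caller's guess.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: read the "Tgid:" field of /proc/<id>/status.
 * Returns the thread group ID, or -1 if the file cannot be opened
 * or contains no Tgid line. */
pid_t read_tgid(pid_t id)
{
    char path[64], line[256];
    long tgid = -1;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)id);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "Tgid: %ld", &tgid) == 1)
            break;
    }
    fclose(f);
    return (pid_t)tgid;
}

/* Hypothetical helper: probe every ID in [lo, hi] directly, without
 * relying on a directory listing, and print the IDs whose thread
 * group is tgid. Returns how many were found. */
int probe_threads(pid_t tgid, pid_t lo, pid_t hi)
{
    int n = 0;
    pid_t id;

    if (tgid <= 0)
        return 0;
    for (id = lo; id <= hi; id++) {
        if (read_tgid(id) == tgid) {
            printf("%ld\n", (long)id);
            n++;
        }
    }
    return n;
}
```

For the numbers used above, probe_threads(6024, 6024, 6100) would be one way to look for the subthreads of process 6024. The scan is slow and the upper bound is arbitrary, so it is a diagnostic aid rather than a replacement for a working /proc/PID/task.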



Here is my session with threadbug. Important lines are prefixed with "!".

pat@pat:/tmp$ cat > threadbug.c
#include <pthread.h>
#include <unistd.h>

void *run(void *arg)
{
    for(;;);
}

int main()
{
    pthread_t t;
    int i;
    for (i = 0; i < 10; ++i)
        pthread_create(&t, NULL, run, NULL);
    sleep(30);
    pthread_exit(NULL);
}

pat@pat:/tmp$ gcc -o threadbug -lpthread threadbug.c

pat@pat:/tmp$ ./threadbug &
[1] 6907

pat@pat:/tmp$ date
Wed Oct 6 11:37:41 CEST 2004

! pat@pat:/tmp$ ls /proc/6907/task # threads are detectable
6907 6908 6909 6910 6911 6912 6913 6914 6915 6916 6917

pat@pat:/tmp$ ps m
PID TTY STAT TIME COMMAND
1647 pts/0 - 0:00 /bin/bash
- - Ss 0:00 -
6019 pts/2 - 0:00 /bin/bash
- - Ss 0:00 -
6035 pts/3 - 0:00 /bin/bash
- - Ss+ 0:00 -
6907 pts/2 - 0:00 ./threadbug
- - S 0:00 -
- - R 0:06 -
- - R 0:06 -
- - R 0:06 -
- - R 0:05 -
- - R 0:06 -
- - R 0:06 -
- - R 0:06 -
- - R 0:05 -
- - R 0:07 -
- - R 0:07 -
6928 pts/2 - 0:00 ps m
- - R+ 0:00 -

! pat@pat:/tmp$ date # now leader thread is exited
! Wed Oct 6 11:40:15 CEST 2004

pat@pat:/tmp$ ls /proc/6907/task
! ls: /proc/6907/task: No such file or directory # no threads and other info

pat@pat:/tmp$ ps m
PID TTY STAT TIME COMMAND
1647 pts/0 - 0:00 /bin/bash
- - Ss 0:00 -
6019 pts/2 - 0:00 /bin/bash
- - Ss 0:00 -
6035 pts/3 - 0:00 /bin/bash
- - Ss+ 0:00 -
6907 pts/2 - 0:00 [threadbug] <defunct>
6942 pts/2 - 0:00 ps m
- - R+ 0:00 -

! pat@pat:/tmp$ cat /proc/69[01]?/cmdline # where are the threads?

! pat@pat:/tmp$ cat /proc/6907/cmdline # where is info?

pat@pat:/tmp$ cat /proc/6908/cmdline # some info is hidden here
./threadbug

pat@pat:/tmp$ cat /proc/6909/cmdline
./threadbug

pat@pat:/tmp$ cat /proc/6910/cmdline
./threadbug

! pat@pat:/tmp$ cat /proc/6910/status # and here
Name: threadbug
State: R (running)
SleepAVG: 0%
! Tgid: 6907
! Pid: 6910 # this is PID of the thread of process 6907
PPid: 6019
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 4 5 24 29 33 44 100 1001
VmSize: 83528 kB
VmLck: 0 kB
VmRSS: 456 kB
VmData: 82076 kB
VmStk: 12 kB
VmExe: 1 kB
VmLib: 83343 kB
! Threads: 11 # 10 subthreads and 1 leader thread
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000080000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
pat@pat:/tmp$ kill %1
pat@pat:/tmp$
[1]+ Terminated ./threadbug
pat@pat:/tmp$


I have also tried the threadbug program on Red Hat Enterprise Linux with
a 2.4.21 kernel and NPTL 0.60. There the threads were still visible after
the leader thread exited and the process became a zombie.

My configuration:
Kernel: 2.6.9-rc9, SMP/SMT, not preemptible
libc6-2.3.2

GNU_LIBPTHREAD_VERSION = NPTL 0.60
P4 Prescott processor, HT
gcc-3.4





2004-10-06 11:36:33

by Patryk Jakubowski

Subject: Re: Invisible threads in 2.6.9

Petr Vandrovec wrote:

>[email protected], 2004-08-27 10:34:04-07:00, [email protected]
> [PATCH] fix MT reparenting when thread group leader dies
>
>but it is possible that it worked before that patch and this one
>actually rebroke it.
>
>/proc/<tid> is visible because of:
>
>[email protected], 2004-03-01 23:03:02-08:00, [email protected]
> [PATCH] revert the /proc thread visibility fix
>
>which was needed to get gdb to work.
> Petr
>
>
>
>
I think this should be fixed in the stable kernel version, but it isn't.
I have discussed this problem on a forum. A few people there can
reproduce the bug; they have kernels 2.6.7 and 2.6.8. I am pretty sure I
have kernel 2.6.9-rc3 from kernel.org :) I downloaded it to check whether
the bug had been fixed.

Pat



2004-10-06 14:58:45

by Chris Friesen

Subject: Re: Invisible threads in 2.6.9

Patryk Jakubowski wrote:

> I think this should be fixed in stable kernel version, but it isn't. I
> have consulted this problem in a forum. Few people can reproduce the
> bug. They have kernels 2.6.7, 2.6.8. I am pretty sure I have kernel
> 2.6.9-rc3 from kernel.org :) I downloaded it to check if the bug is not
> fixed.

It works fine for me with 2.6.9-rc3, but I'm not using NPTL.

Chris

2004-10-07 00:26:56

by Albert Cahalan

Subject: Re: Invisible threads in 2.6.9

We do indeed have a kernel problem. I re-did the
example code using the raw clone() system call,
to avoid any pthreads troubles. I took out the
busy loop; add it back in if you care to verify
that it would indeed chew up CPU time.

(I started the threads stopped, so they wouldn't
need to have distinct stacks.)

-------------------------- begin example ---------------------------
$ ./zombie-leader
$ ps -mfL
UID PID PPID LWP C NLWP STIME TTY TIME CMD
albert 3224 3223 - 0 1 Sep29 pts/19 00:00:00 bash
albert - - 3224 0 1 Sep29 - 00:00:00 -
albert 7442 1 - 0 9 20:05 pts/19 00:00:00 [zombie-leader] <defunct>
albert 7457 3224 - 0 1 20:06 pts/19 00:00:00 xterm
albert - - 7457 0 1 20:06 - 00:00:00 -
albert 7475 3224 - 0 1 20:10 pts/19 00:00:00 ps -mfL
albert - - 7475 0 1 20:10 - 00:00:00 -
$ ls /proc/7442/task/
ls: /proc/7442/task/: No such file or directory
$ ls /proc/7442/
ls: cannot read symbolic link /proc/7442/cwd: Permission denied
ls: cannot read symbolic link /proc/7442/root: Permission denied
ls: cannot read symbolic link /proc/7442/exe: Permission denied
auxv cmdline cwd environ exe fd maps mem mounts root stat statm status task wchan
$ ps -mo stat,ppid,pid,tid,nlwp,args
STAT PPID PID TID NLWP COMMAND
- 3223 3224 - 1 bash
Ss - - 3224 1 -
- 1 7442 - 9 [zombie-leader] <defunct>
- 3224 7457 - 1 xterm
S - - 7457 1 -
- 3224 7477 - 1 ps -mo stat,ppid,pid,tid,nlwp,args
R+ - - 7477 1 -
---------------------------- end example -------------------------------

////////////////////// begin code ///////////////////////////
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // needed for clone() in <sched.h> on current glibc
#endif
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <sched.h>

#ifndef CLONE_THREAD
#define CLONE_THREAD 0x00010000
#endif
#ifndef CLONE_DETACHED
#define CLONE_DETACHED 0x00400000
#endif
#ifndef CLONE_STOPPED
#define CLONE_STOPPED 0x02000000
#endif

// similar to NPTL pthreads, AFAIK
#define FLAGS (CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_VM|CLONE_THREAD|CLONE_DETACHED)

static pid_t one;

static void die(int signo){
    (void)signo;
    _exit(0);
}

static void hang(void){
    for(;;) pause();
}

static int clone_fn(void *vp){
    (void)vp;
    hang();
    return 0; // keep gcc happy
}

static long clone_stack_data[2048];
#ifdef __hppa__
static long *clone_stack = &clone_stack_data[0];
#else
static long *clone_stack = &clone_stack_data[2048];
#endif

int main(int argc, char *argv[]){
    pid_t minime;
    int i = 8;
    (void)argc;
    (void)argv;

    one = getpid();
    signal(SIGHUP,die);
    if(fork()) hang(); // parent later killed as readiness signal

    while(i--){
        // better be stopped... they share a stack
        minime = clone(clone_fn, clone_stack, FLAGS | CLONE_STOPPED, NULL);
        if(minime==-1){
            perror("no clone");
            kill(one,SIGKILL);
            _exit(8);
        }
    }

    kill(one,SIGHUP); // let the shell know we're ready

    _exit(0); // make task group leader a zombie
    return 0; // keep gcc happy
}

/////////////////////// end code /////////////////////////////
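
The ps output above still reports NLWP 9 for the defunct leader, which suggests one possible way to flag this state from user space: look for a task whose status says zombie while its thread count is greater than one. The following is a minimal sketch only. It assumes that /proc/<pid>/status of the zombie leader remains readable and carries the "State:" and "Threads:" fields seen elsewhere in this thread; zombie_leader_with_threads is a hypothetical name.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical check: returns 1 if /proc/<pid>/status reports state Z
 * (zombie) together with a thread count above 1, 0 if it does not,
 * and -1 if the status file cannot be opened. */
int zombie_leader_with_threads(pid_t pid)
{
    char path[64], line[256];
    char state = '?';
    int threads = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        /* Each sscanf() assigns only when its own field name matches. */
        sscanf(line, "State: %c", &state);
        sscanf(line, "Threads: %d", &threads);
    }
    fclose(f);
    return state == 'Z' && threads > 1;
}
```

For PID 7442 in the example above, this would be expected to return 1 if the status file reports the same thread count that ps shows as NLWP. It only says that other threads exist; it does not reveal their TIDs.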