2004-11-18 12:35:08

by kladit

Subject: kernel reports kill too late?

With kernels newer than 2.6.10.rc1-bk18, the glibc nptl checks failed.

Now that this is fixed (see lkml "Futex queue_me/get_user ordering"),
another problem has shown up.

I get spurious "Failed to kill test process: No child processes"
errors during nptl checks now.

Appended is a simple test program derived from glibc's check. It
runs fine on 2.6.10.rc1-bk18 but fails most of the time with
newer kernels.

(Not being a guru at all) I interpret the results as a delay
or loss of the status of a killed process, which happens only
if this process runs a thread.
(When sleep_mostly() is not run as a thread, it works as expected.)
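
For readers without the attachment, here is a rough sketch of the kind of
test described above. It is not the attached 2waitpid-test.c. Only the name
sleep_mostly() comes from this post; run_kill_test(), the pipe handshake and
the use of SIGKILL are assumptions made for illustration. The child
optionally starts a thread, the parent kills it and polls waitpid() with
WNOHANG until the status arrives.

```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: blocks forever so the child has a second thread when killed. */
static void *sleep_mostly(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * Hypothetical helper: fork a child, optionally start a thread in it,
 * send SIGKILL and poll waitpid() until the child's status is reported.
 * Returns 0 if the child was reaped as killed by SIGKILL, -1 otherwise.
 */
int run_kill_test(int use_thread)
{
    int ready[2];
    if (pipe(ready) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        close(ready[0]);
        if (use_thread) {
            pthread_t th;
            if (pthread_create(&th, NULL, sleep_mostly, NULL) != 0)
                _exit(2);
        }
        if (write(ready[1], "x", 1) != 1) /* tell the parent we are set up */
            _exit(3);
        for (;;)
            pause();
    }

    /* parent: wait until the child (and its thread) exist */
    close(ready[1]);
    char c;
    ssize_t n = read(ready[0], &c, 1);
    close(ready[0]);
    if (n != 1) {                         /* child failed during setup */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }

    printf("try to kill pid %d\n", (int)pid);
    if (kill(pid, SIGKILL) != 0) {
        perror("kill");
        return -1;
    }

    /* Busy-poll; waitpid() returns 0 while no status is available yet. */
    int status = 0;
    long calls = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, WNOHANG);
        calls++;
    } while (r == 0 || (r < 0 && errno == EINTR));

    if (r != pid) {                       /* e.g. -1 with ECHILD */
        fprintf(stderr, "Kill failed! waitpid returned %d: %s\n",
                (int)r, strerror(errno));
        return -1;
    }
    printf("killed after %ld waitpid calls\n", calls);
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1;
}
```

Calling run_kill_test(1) in a loop from main() (built with -pthread) would
correspond to the threaded case, and run_kill_test(0) to the case without a
thread. Whether this sketch triggers the same failure as the attached
program has not been verified.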

Also noteworthy: it does not happen every time.



Output on 2.6.10.rc1-bk18
thread starts spinning
thread alive
try to kill pid 1436
killed after 4007 waitpid calls
killed by signal: status = 0

Output on 2.6.10.rc2-bk2
test 0 --------------------------------
thread starts spinning
thread alive
try to kill pid 32321
killed after 3389 waitpid calls
killed by signal: status = 0
test 1 --------------------------------
thread starts spinning
thread alive
try to kill pid 32323
killed after 17 waitpid calls
killed by signal: status = 138
Kill failed! waitpid returned -1

Both systems are SMP (2xP4 and 2xP3).

Can anyone else see this?

--
Regards Klaus



Attachments:
(No filename) (1.23 kB)
2waitpid-test.c (1.64 kB)