2004-09-02 14:57:09

by Paolo Molaro

Subject: incorrect time accounting

While benchmarking, a user pointed out that time(1) reported
incorrect user and system times when running mono.
A typical example (running on 2.6.8.1) is:

$ /usr/bin/time mono so-sieve.exe 5000
Count: 1028
0.02user 0.00system 0:01.97elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+1509minor)pagefaults 0swaps

Where so-sieve.exe is a cpu-bound benchmark.
On 2.6.8.1 very low user and system times are reported every time;
on both 2.4.19 and 2.2.20 the results are sometimes correct (or at
least sensible) and sometimes similarly low.
top reports high cpu usage and low idle percentages, but no cpu
time is accounted to the mono process.
This looks like a mild security issue, since it appears there is some way
to circumvent the kernel's idea of the cpu resource usage of a process,
so the limits set become useless and users could bog down the system.

http://primates.ximian.com/~lupus/time-strace has the result of
strace -tt -f time /usr/local/bin/mono so-sieve.exe 5000 2> time-strace
The highlights include:

[pid 9630] 19:22:28.609566 execve("/usr/local/bin/mono", ["/usr/local/bin/mono", "so-sieve.exe", "5000"], [/* 33 vars */]) = 0
Pid 9630 is the main process: it creates a thread that will execute the
bulk of the cpu-intensive code (pid 9633):

[pid 9630] 19:22:28.839532 clone(Process 9633 attached child_stack=0x40d17b48, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0x40d17bf8, {entry_number:6, base_addr:0x40d17bb0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0x40d17bf8) = 9633
The main loop starts here, after some memory allocation:

[pid 9633] 19:22:28.849138 brk(0x82ae000) = 0x82ae000
And it ends about two seconds later, with the next entry in the trace
for the 9633 pid:

[pid 9633] 19:22:30.780451 brk(0) = 0x82ae000
At the end, wait4 is called to collect the times, less than 0.5 sec
user+system:
[pid 9629] 19:22:30.821596 <... wait4 resumed> [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], 0, {ru_utime={0, 14997}, ru_stime={0, 1999}, ...}) = 9630

http://primates.ximian.com/~lupus/mono-1.1.1.tar.gz is the mono source,
if you don't have mono installed to reproduce (I tried to reproduce what
mono does with a simple pthread program that executes cpu-intensive code
in a subthread, but the times are reported correctly there).
http://primates.ximian.com/~lupus/so-sieve.exe is the sample program,
but other programs exhibited the same behaviour.
Let me know if more info is needed to track down the problem.

lupus

--
-----------------------------------------------------------------
[email protected] debian/rules
[email protected] Monkeys do it better


2004-09-03 02:18:18

by Peter Chubb

Subject: Re: incorrect time accounting

>>>>> "Paolo" == Paolo Molaro <[email protected]> writes:

Paolo> While benchmarking, a user pointed out that time(1) reported
Paolo> incorrect user and system times when running mono.
Paolo> A typical example (running on 2.6.8.1) is:



This is because mono is multithreaded. At present, time(1) records
only the time of the parent thread. This will change soon (I hope)
when Roland McGrath's getrusage() patches are merged.

--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*

2004-09-03 16:39:26

by Paolo Molaro

Subject: Re: incorrect time accounting

On 09/03/04 Peter Chubb wrote:
> Paolo> While benchmarking, a user pointed out that time(1) reported
> Paolo> incorrect user and system times when running mono.
> Paolo> A typical example (running on 2.6.8.1) is:
>
> This is because mono is multithreaded. At present, time(1) records
> only the time of the parent thread. This will change soon (I hope)
> when the Roland McGrath's getrusage() patches are merged.

Thanks for the feedback. This doesn't explain, though, why 2.4.x and
2.2.x sometimes report sensible results and sometimes don't, nor the
results I got with a simple pthread test case on 2.6.8.1.
[...doing more testing...]
See the attached test case: the results reported are:

$ gcc test-thread-rusage.c -lpthread
$ time ./a.out join
using a subthread
done

real 0m1.469s
user 0m1.461s
sys 0m0.003s
$ time ./a.out sleep
using a subthread

real 0m1.007s
user 0m0.000s
sys 0m0.001s

So it looks like times for subthreads are accounted for, but only when
the thread is joined (or given time to exit properly). I guess
Roland's patch also takes care of the case where a subthread is
destroyed by main() exiting?
A little tracing shows that the intermittent 2.4/2.2 results and the
consistently low timings on 2.6 are probably both due to the main
program exiting before the subthread cleaned up after itself (the
results are intermittent because it's a timing issue: we just wait for
the subthread to finish its work, and it is preempted away before it
has a chance to call pthread_exit()).
Thanks.

lupus

--
-----------------------------------------------------------------
[email protected] debian/rules
[email protected] Monkeys do it better


Attachments:
(No filename) (1.71 kB)
test-thread-rusage.c (748.00 B)