2006-08-01 14:43:36

by Balbir Singh

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Jay Lan wrote:
>
> -#define TASKSTATS_VERSION 1
> +#define TASKSTATS_VERSION 2
> +#define TASK_COMM_LEN 16
>

We should find a way to keep this in sync with with the definition
in linux/sched.h (won't we a warning if both this header and
linux/sched.h are included together?)



> + * fill in basic accounting fields
> + */
> +static void bacct_add_tsk(struct taskstats *stats, struct task_struct *tsk)
> +{
> + u64 run_time;
> + struct timespec uptime;
> +
> + /* calculate run_time in nsec */
> + do_posix_clock_monotonic_gettime(&uptime);
> + run_time = (u64)uptime.tv_sec*NSEC_PER_SEC + uptime.tv_nsec;
> + run_time -= (u64)current->group_leader->start_time.tv_sec * NSEC_PER_SEC
> + + current->group_leader->start_time.tv_nsec;
> + do_div(run_time, NSEC_PER_USEC); /* rebase run_time to usec */
> + stats->ac_etime = run_time;
> + do_div(run_time, USEC_PER_SEC); /* rebase run_time to sec */
> + stats->ac_btime = xtime.tv_sec - run_time;
> + if (thread_group_leader(tsk)) {
> + stats->ac_exitcode = tsk->exit_code;
> + if (tsk->flags & PF_FORKNOEXEC)
> + stats->ac_flag |= AFORK;
> + }
> + if (tsk->flags & PF_SUPERPRIV)
> + stats->ac_flag |= ASU;
> + if (tsk->flags & PF_DUMPCORE)
> + stats->ac_flag |= ACORE;
> + if (tsk->flags & PF_SIGNALED)
> + stats->ac_flag |= AXSIG;
> + stats->ac_nice = task_nice(tsk);
> + stats->ac_sched = tsk->policy;
> + stats->ac_uid = tsk->uid;
> + stats->ac_gid = tsk->gid;
> + stats->ac_pid = tsk->pid;
> + stats->ac_ppid = (tsk->parent) ? tsk->parent->pid : 0;
> + stats->ac_utime = tsk->utime * USEC_PER_TICK;
> + stats->ac_stime = tsk->stime * USEC_PER_TICK;

I think you should use the portable cputime_xxxx() API since
tsk->utime and tsk->stime are of type cputime_t


> + /* Each process gets a minimum of a half tick cpu time */
> + if ((stats->ac_utime == 0) && (stats->ac_stime == 0)) {
> + stats->ac_stime = USEC_PER_TICK/2;
> + }
> +

This is confusing. Half tick does not make any sense from the
scheduler view point (or am I missing something?), so why
return half a tick to the user.


--

Balbir Singh,
Linux Technology Center,
IBM Software Labs


2006-08-01 21:51:50

by Jay Lan

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Balbir Singh wrote:
> Jay Lan wrote:
>
>>
>> -#define TASKSTATS_VERSION 1
>> +#define TASKSTATS_VERSION 2
>> +#define TASK_COMM_LEN 16
>>
>
>
> We should find a way to keep this in sync with with the definition
> in linux/sched.h (won't we a warning if both this header and
> linux/sched.h are included together?)

I do not know how to sync it up. This header linux/taskstats.h is
meant to be included by userspace programs. If an application
happens to include linux/sched.h, which includes linux/time.h,
the application will very likely have compilation errors because
the "struct timespec" declaration in <linux/time.h> and <time.h>
are conflicting.

The <linux/acct.h> defines it to
#define ACCT_COMM 16

I can change our define to TS_COMM_LEN with remakes saying it
should be in sync with the TAKS_COMM_LEN defined in linux/sched.h.

If there is a better way, i am eager to know it.

>
>
>
>> + * fill in basic accounting fields
>> + */
>> +static void bacct_add_tsk(struct taskstats *stats, struct task_struct
>> *tsk)
>> +{
>> + u64 run_time;
>> + struct timespec uptime;
>> +
>> + /* calculate run_time in nsec */
>> + do_posix_clock_monotonic_gettime(&uptime);
>> + run_time = (u64)uptime.tv_sec*NSEC_PER_SEC + uptime.tv_nsec;
>> + run_time -= (u64)current->group_leader->start_time.tv_sec *
>> NSEC_PER_SEC
>> + + current->group_leader->start_time.tv_nsec;
>> + do_div(run_time, NSEC_PER_USEC); /* rebase run_time to usec */
>> + stats->ac_etime = run_time;
>> + do_div(run_time, USEC_PER_SEC); /* rebase run_time to sec */
>> + stats->ac_btime = xtime.tv_sec - run_time;
>> + if (thread_group_leader(tsk)) {
>> + stats->ac_exitcode = tsk->exit_code;
>> + if (tsk->flags & PF_FORKNOEXEC)
>> + stats->ac_flag |= AFORK;
>> + }
>> + if (tsk->flags & PF_SUPERPRIV)
>> + stats->ac_flag |= ASU;
>> + if (tsk->flags & PF_DUMPCORE)
>> + stats->ac_flag |= ACORE;
>> + if (tsk->flags & PF_SIGNALED)
>> + stats->ac_flag |= AXSIG;
>> + stats->ac_nice = task_nice(tsk);
>> + stats->ac_sched = tsk->policy;
>> + stats->ac_uid = tsk->uid;
>> + stats->ac_gid = tsk->gid;
>> + stats->ac_pid = tsk->pid;
>> + stats->ac_ppid = (tsk->parent) ? tsk->parent->pid : 0;
>> + stats->ac_utime = tsk->utime * USEC_PER_TICK;
>> + stats->ac_stime = tsk->stime * USEC_PER_TICK;
>
>
> I think you should use the portable cputime_xxxx() API since
> tsk->utime and tsk->stime are of type cputime_t

Will fix it.

>
>
>> + /* Each process gets a minimum of a half tick cpu time */
>> + if ((stats->ac_utime == 0) && (stats->ac_stime == 0)) {
>> + stats->ac_stime = USEC_PER_TICK/2;
>> + }
>> +
>
>
> This is confusing. Half tick does not make any sense from the
> scheduler view point (or am I missing something?), so why
> return half a tick to the user.

It must be inherited from old code dated back to Cray UNICOS.
I do not know if bad thing can happen if both utime and stime
are less than 1 usec... I guess not. But i agree that
half a tick does not make sense. To play safe, we can change
it to 1 usec if both utime and stime are sub microsecond.
What do you think?

Thanks,
- jay


>
>

2006-08-02 15:32:05

by Shailabh Nagar

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Jay Lan wrote:
> Balbir Singh wrote:
>
>> Jay Lan wrote:
>>
>>>
>>> -#define TASKSTATS_VERSION 1
>>> +#define TASKSTATS_VERSION 2
>>> +#define TASK_COMM_LEN 16
>>>
>>
>>
>>
>> We should find a way to keep this in sync with with the definition
>> in linux/sched.h (won't we a warning if both this header and
>> linux/sched.h are included together?)
>
>
> I do not know how to sync it up. This header linux/taskstats.h is
> meant to be included by userspace programs. If an application
> happens to include linux/sched.h, which includes linux/time.h,
> the application will very likely have compilation errors because
> the "struct timespec" declaration in <linux/time.h> and <time.h>
> are conflicting.
>
> The <linux/acct.h> defines it to
> #define ACCT_COMM 16
>
> I can change our define to TS_COMM_LEN with remakes saying it
> should be in sync with the TAKS_COMM_LEN defined in linux/sched.h.

This seems like a good enough way to do it. There's no real need for
the taskstats comm length to remain exactly in sync with the task struct's
comm length (by way of trying to include sched.h etc.) though avoiding the
compile error by renaming is desirable as Balbir pointed out.

Moreover, TASK_COMM_LEN in linux/sched.h isn't likely to change much -
if it increases and csa_acct users also really need the extra info provided,
taskstats can always be changed and version bumped up. If the size decreases
there's no harm done (strncpy should be sufficient protection).

--Shailabh

2006-08-02 17:17:20

by Balbir Singh

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Shailabh Nagar wrote:
> Jay Lan wrote:
>> Balbir Singh wrote:
>>
>>> Jay Lan wrote:
>>>
>>>>
>>>> -#define TASKSTATS_VERSION 1
>>>> +#define TASKSTATS_VERSION 2
>>>> +#define TASK_COMM_LEN 16
>>>>
>>>
>>>
>>>
>>> We should find a way to keep this in sync with with the definition
>>> in linux/sched.h (won't we a warning if both this header and
>>> linux/sched.h are included together?)
>>
>>
>> I do not know how to sync it up. This header linux/taskstats.h is
>> meant to be included by userspace programs. If an application
>> happens to include linux/sched.h, which includes linux/time.h,
>> the application will very likely have compilation errors because
>> the "struct timespec" declaration in <linux/time.h> and <time.h>
>> are conflicting.
>>
>> The <linux/acct.h> defines it to
>> #define ACCT_COMM 16
>>
>> I can change our define to TS_COMM_LEN with remakes saying it
>> should be in sync with the TAKS_COMM_LEN defined in linux/sched.h.
>
> This seems like a good enough way to do it. There's no real need for
> the taskstats comm length to remain exactly in sync with the task struct's
> comm length (by way of trying to include sched.h etc.) though avoiding the
> compile error by renaming is desirable as Balbir pointed out.
>
> Moreover, TASK_COMM_LEN in linux/sched.h isn't likely to change much -
> if it increases and csa_acct users also really need the extra info
> provided,
> taskstats can always be changed and version bumped up. If the size
> decreases
> there's no harm done (strncpy should be sufficient protection).
>
> --Shailabh
>

I am not sure if there is a version of BUG_ON() for compile time
asserts. Basically, if we have an infrastructure of the form

/*
* From C/C++ users journal November 2004
*/
#define STATIC_BUG_ON(e) \
switch (0) { \
case 0: \
case (e): \
; \
}

Then the STATIC_BUG_ON() can catch as shown below.

#define TASK_COMM_LEN 16
#define T_COMM_LEN 20

int
main(void)
{
STATIC_BUG_ON(TASK_COMM_LEN == T_COMM_LEN);
}

STATIC_BUG_ON gives the following warning

bug_on_c.c: In function `main':
bug_on_c.c:19: duplicate case value
bug_on_c.c:19: previously used here

but with T_COMM_LEN set to 16

It compiles without any errors, the code generated also
looks like it has no overhead

int
main(void)
{
8048310: 55 push %ebp
8048311: 89 e5 mov %esp,%ebp
8048313: 83 ec 08 sub $0x8,%esp
8048316: 83 e4 f0 and $0xfffffff0,%esp
STATIC_BUG_ON(TASK_COMM_LEN == T_COMM_LEN);
}
8048319: c9 leave
804831a: c3 ret
804831b: 90 nop


Assuming such infrastructure is available, you could then
do

#ifdef __KERNEL__
#include <linux/sched.h>
#define TS_COMM_LEN 16
STATIC_BUG_ON (TS_COMM_LEN == TASK_COMM_LEN);
#endif

Comments?

--

Balbir Singh,
Linux Technology Center,
IBM Software Labs

2006-08-02 17:38:14

by Shailabh Nagar

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Balbir Singh wrote:

>
> I am not sure if there is a version of BUG_ON() for compile time
> asserts. Basically, if we have an infrastructure of the form
>
> /*
> * From C/C++ users journal November 2004
> */
> #define STATIC_BUG_ON(e) \
> switch (0) { \
> case 0: \
> case (e): \
> ; \
> }
>
> Then the STATIC_BUG_ON() can catch as shown below.
>
> #define TASK_COMM_LEN 16
> #define T_COMM_LEN 20
>
> int
> main(void)
> {
> STATIC_BUG_ON(TASK_COMM_LEN == T_COMM_LEN);
> }
>
> STATIC_BUG_ON gives the following warning
>
> bug_on_c.c: In function `main':
> bug_on_c.c:19: duplicate case value
> bug_on_c.c:19: previously used here
>
> but with T_COMM_LEN set to 16
>
> It compiles without any errors, the code generated also
> looks like it has no overhead
>
> int
> main(void)
> {
> 8048310: 55 push %ebp
> 8048311: 89 e5 mov %esp,%ebp
> 8048313: 83 ec 08 sub $0x8,%esp
> 8048316: 83 e4 f0 and $0xfffffff0,%esp
> STATIC_BUG_ON(TASK_COMM_LEN == T_COMM_LEN);
> }
> 8048319: c9 leave
> 804831a: c3 ret
> 804831b: 90 nop
>
>
> Assuming such infrastructure is available, you could then
> do
>
> #ifdef __KERNEL__
> #include <linux/sched.h>
> #define TS_COMM_LEN 16
> STATIC_BUG_ON (TS_COMM_LEN == TASK_COMM_LEN);
> #endif
>
> Comments?
>

Neat trick !
Perhaps STATIC_WARNING is a more appropriate name but
something like this for general use may be good.


--Shailabh

2006-08-02 21:09:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

On Wed, 02 Aug 2006 22:47:07 +0530
Balbir Singh <[email protected]> wrote:

> I am not sure if there is a version of BUG_ON() for compile time
> asserts

We have BUILD_BUG_ON()

2006-08-02 23:47:03

by Jay Lan

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Andrew Morton wrote:
>On Wed, 02 Aug 2006 22:47:07 +0530
>Balbir Singh <[email protected]> wrote:
>
>
>>I am not sure if there is a version of BUG_ON() for compile time
>>asserts
>>
>
>We have BUILD_BUG_ON()
>

BUILD_BUG_ON() is a statement. I will do that inside bacct_add_tsk().

Thanks,
- jay

2006-08-03 03:03:08

by Balbir Singh

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Andrew Morton wrote:
> On Wed, 02 Aug 2006 22:47:07 +0530
> Balbir Singh <[email protected]> wrote:
>
>> I am not sure if there is a version of BUG_ON() for compile time
>> asserts
>
> We have BUILD_BUG_ON()

Thanks, I remember that we had something similar, but was lost
looking for it in asm-*/bug.h.

--

Balbir Singh,
Linux Technology Center,
IBM Software Labs

2006-08-07 21:23:11

by Jay Lan

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Jay Lan wrote:

[snip]

>>
>>
>>> + /* Each process gets a minimum of a half tick cpu time */
>>> + if ((stats->ac_utime == 0) && (stats->ac_stime == 0)) {
>>> + stats->ac_stime = USEC_PER_TICK/2;
>>> + }
>>> +
>>
>>
>>
>> This is confusing. Half tick does not make any sense from the
>> scheduler view point (or am I missing something?), so why
>> return half a tick to the user.
>
>
> It must be inherited from old code dated back to Cray UNICOS.
> I do not know if bad thing can happen if both utime and stime
> are less than 1 usec... I guess not. But i agree that
> half a tick does not make sense. To play safe, we can change
> it to 1 usec if both utime and stime are sub microsecond.
> What do you think?

Hi Balbir,

I figured this out. The tsk->stime (and utime as well) are
charged by 1 tick (or cputime) from the timer interrupt handler
through update_process_times->account_{user,system}_time.

The clock resolution is a tick. Any short process less than
1 tick will the counter being 0. It can be from 0 to 0.99999...
tick. A half tick is the average value.

I think it makes more sense to assign a half tick than assign
1 usec to the stime. What do you think? Certainly the code need
better explanation.

Regards,
- jay


[snip]

2006-08-08 05:22:46

by Balbir Singh

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Jay Lan wrote:
> Jay Lan wrote:
>
> [snip]
>
>>>
>>>
>>>> + /* Each process gets a minimum of a half tick cpu time */
>>>> + if ((stats->ac_utime == 0) && (stats->ac_stime == 0)) {
>>>> + stats->ac_stime = USEC_PER_TICK/2;
>>>> + }
>>>> +
>>>
>>>
>>>
>>> This is confusing. Half tick does not make any sense from the
>>> scheduler view point (or am I missing something?), so why
>>> return half a tick to the user.
>>
>>
>> It must be inherited from old code dated back to Cray UNICOS.
>> I do not know if bad thing can happen if both utime and stime
>> are less than 1 usec... I guess not. But i agree that
>> half a tick does not make sense. To play safe, we can change
>> it to 1 usec if both utime and stime are sub microsecond.
>> What do you think?
>
> Hi Balbir,
>
> I figured this out. The tsk->stime (and utime as well) are
> charged by 1 tick (or cputime) from the timer interrupt handler
> through update_process_times->account_{user,system}_time.
>
> The clock resolution is a tick. Any short process less than
> 1 tick will the counter being 0. It can be from 0 to 0.99999...
> tick. A half tick is the average value.
>

But the scheduling happens in the granularity of a tick, so the minimum each
task gets is a tick.

> I think it makes more sense to assign a half tick than assign
> 1 usec to the stime. What do you think? Certainly the code need
> better explanation.
>

Can't we leave these values as zero in case both stime and utime are zero.


> Regards,
> - jay
>
>
> [snip]


--
Warm Regards,
Balbir Singh,
Linux Technology Center,
IBM Software Labs

2006-08-08 16:40:01

by Jay Lan

[permalink] [raw]
Subject: Re: [patch 1/3] add basic accounting fields to taskstats

Balbir Singh wrote:
> Jay Lan wrote:
>> Jay Lan wrote:
>>
>> [snip]
>>
>>>>
>>>>
>>>>> + /* Each process gets a minimum of a half tick cpu time */
>>>>> + if ((stats->ac_utime == 0) && (stats->ac_stime == 0)) {
>>>>> + stats->ac_stime = USEC_PER_TICK/2;
>>>>> + }
>>>>> +
>>>>
>>>>
>>>>
>>>> This is confusing. Half tick does not make any sense from the
>>>> scheduler view point (or am I missing something?), so why
>>>> return half a tick to the user.
>>>
>>>
>>> It must be inherited from old code dated back to Cray UNICOS.
>>> I do not know if bad thing can happen if both utime and stime
>>> are less than 1 usec... I guess not. But i agree that
>>> half a tick does not make sense. To play safe, we can change
>>> it to 1 usec if both utime and stime are sub microsecond.
>>> What do you think?
>>
>> Hi Balbir,
>>
>> I figured this out. The tsk->stime (and utime as well) are
>> charged by 1 tick (or cputime) from the timer interrupt handler
>> through update_process_times->account_{user,system}_time.
>>
>> The clock resolution is a tick. Any short process less than
>> 1 tick will the counter being 0. It can be from 0 to 0.99999...
>> tick. A half tick is the average value.
>>
>
> But the scheduling happens in the granularity of a tick, so the
> minimum each task gets is a tick.
>
>> I think it makes more sense to assign a half tick than assign
>> 1 usec to the stime. What do you think? Certainly the code need
>> better explanation.
>>
>
> Can't we leave these values as zero in case both stime and utime are
> zero.

Yes, i will leave them as zero in this case.

Regards,
- jay

>
>
>> Regards,
>> - jay
>>
>>
>> [snip]
>
>