2006-01-17 18:42:41

by Oleg Nesterov

Subject: Re: [rfc][patch] Avoid taking global tasklist_lock for single threaded process at getrusage()

Ravikiran G Thirumalai wrote:
>
> Sorry for the delay..
>
> On Tue, Jan 10, 2006 at 10:03:35PM +0300, Oleg Nesterov wrote:
> >
> > > Sorry, I can't understand. Could you please be more verbose?
>
> Last thread (RUSAGE_SELF) Exiting thread
>
> [ ... ]
>
> utime = cputime_add(utime, p->signal->utime); /* use cached load above */
> stime = cputime_add(stime, p->signal->stime); /* load from memory */

Thanks for your explanation, now I see what you mean.

But didn't we already discuss this issue? I think the RUSAGE_SELF
case is never 100% accurate anyway, so it is OK to ignore this race.

What if that thread has not exited yet? We take the tasklist lock, but
this can't help, because this thread is possibly updating its ->xtime
right now on another cpu, and we have exactly the same problem.
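
To make the point concrete, here is a minimal single-threaded model of
that situation. It is only a sketch: struct thread_times,
account_user_tick() and sample_total_locked() are hypothetical names, the
"lock" is a plain flag, and the concurrent tick is simulated by calling
the tick function between the reader's two loads.

```c
#include <assert.h>

/* Hypothetical stand-in for a thread's counters; not the kernel's task_struct. */
struct thread_times {
	unsigned long utime;
	unsigned long stime;
};

/* Timer tick on the thread's own CPU: charges one tick of user time, takes no lock. */
static void account_user_tick(struct thread_times *t)
{
	t->utime += 1;
}

/*
 * Reader that holds a (modelled) lock while sampling. The lock only
 * excludes other lock holders; the tick path never takes it, so ticks can
 * still land between the two loads. ticks_between models that interleaving.
 */
static unsigned long sample_total_locked(struct thread_times *t, int *lock,
					 int ticks_between)
{
	unsigned long utime, stime;
	int i;

	assert(*lock == 0);		/* model only: the lock must be free */
	*lock = 1;			/* "take" the lock */
	utime = t->utime;
	for (i = 0; i < ticks_between; i++)
		account_user_tick(t);	/* the tick path ignores the lock */
	stime = t->stime;
	*lock = 0;			/* "release" the lock */
	return utime + stime;
}
```

With counters {100, 50} and one tick in between, the reader reports 150
while the thread has already accumulated 151, lock or no lock.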

> > However, do you have any numbers or thoughts why this optimization
> > can make any _visible_ effect?
>
> We know we don't need locks there, so I do not understand why we
> should keep them. Locks are always serializing and expensive operations. I
> believe on some arches disabling on-chip interrupts is also an expensive
> operation...some arches might use hypervisor calls to do that which I guess
> will have its own overhead...so why have it when we know we don't need it?

I think it is better not to complicate the code unless we can see
some difference in practice.

That said, I don't have a strong feeling that I am right (on both
issues), so please feel free to ignore me.

Oleg.


2006-01-17 19:53:00

by Ravikiran G Thirumalai

Subject: Re: [rfc][patch] Avoid taking global tasklist_lock for single threaded process at getrusage()

On Tue, Jan 17, 2006 at 10:59:02PM +0300, Oleg Nesterov wrote:
> Ravikiran G Thirumalai wrote:
> >
> > Sorry for the delay..
> >
> > On Tue, Jan 10, 2006 at 10:03:35PM +0300, Oleg Nesterov wrote:
> > >
> > > > Sorry, I can't understand. Could you please be more verbose?
> >
> > Last thread (RUSAGE_SELF) Exiting thread
> >
> > [ ... ]
> >
> > utime = cputime_add(utime, p->signal->utime); /* use cached load above */
> > stime = cputime_add(stime, p->signal->stime); /* load from memory */
>
> Thanks for your explanation, now I see what you mean.
>
> But didn't we already discuss this issue? I think the RUSAGE_SELF
> case is never 100% accurate anyway, so it is OK to ignore this race.

It is not 100% accurate in the sense that we lose the time accounting for
one clock tick in the task_struct->utime and stime counters. But
task_struct->signal->utime and stime collect the rusage times of an exiting
thread, so we would be introducing large inaccuracies if we don't use rmb here.
Take the case where an exiting thread has large utime and stime values, and
rusage reports utime from before the thread exit and stime from after the
thread exit... the result would look weird.
So IMHO, while inaccuracies in the task_struct->xxx times can be tolerated, it
might not be such a good idea to tolerate them for the
task_struct->signal->xxx counters.
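
A rough single-threaded model of that case, just to show the size of the
error. This is a sketch under assumptions: struct signal_times,
fold_exiting_thread() and torn_read() are made-up names, not the kernel
code, and the interleaving (utime loaded before the exit, stime after) is
forced by calling the fold between the two loads rather than by real
reordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared signal->utime / signal->stime pair. */
struct signal_times {
	unsigned long utime;
	unsigned long stime;
};

/* What the getrusage() reader ends up reporting in this model. */
struct snapshot {
	unsigned long utime;
	unsigned long stime;
};

/* Exiting thread folds its lifetime totals into the shared counters. */
static void fold_exiting_thread(struct signal_times *sig,
				unsigned long utime, unsigned long stime)
{
	sig->utime += utime;
	sig->stime += stime;
}

/*
 * Reader whose utime load is satisfied before the exit and whose stime
 * load is satisfied after it. The two halves of the snapshot then come
 * from different points in time.
 */
static struct snapshot torn_read(struct signal_times *sig,
				 unsigned long exit_utime,
				 unsigned long exit_stime)
{
	struct snapshot s;

	assert(sig != NULL);
	s.utime = sig->utime;		/* early load: pre-exit value */
	fold_exiting_thread(sig, exit_utime, exit_stime);
	s.stime = sig->stime;		/* late load: post-exit value */
	return s;
}
```

Starting from {10, 10} with an exiting thread that ran 5000 ticks of user
time and 3000 of system time, the snapshot comes out as utime = 10 and
stime = 3010: the reader misses all 5000 ticks of utime but sees all of
the stime, which is far more than a one-tick error.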

Thanks,
Kiran