Ravikiran G Thirumalai wrote:
>
> On Sun, Jan 08, 2006 at 02:49:31PM +0300, Oleg Nesterov wrote:
> > > + } else
> > > + /* See locking comments above */
> > > + smp_rmb();
> >
> > This patch doesn't try to optimize ->sighand.siglock locking,
> > and I think this is right. But this also means we don't need
> > rmb() here. It was needed to protect against "another thread
> > just exited, cpu can read ->c* values before thread_group_empty()
> > without taking siglock" case, now it is not possible.
>
> Don't we still need rmb for the RUSAGE_SELF case? we do not take the
> siglock for rusage self and the non c* signal fields are written to
> at __exit_signal...
I think it is unneeded because RUSAGE_SELF case is "racy" anyway even
if we held both locks, task_struct->xxx counters can change at any
moment.
But may be you are right.
> What is wrong with optimizing by not taking the siglock in RUSAGE_BOTH
> and RUSAGE_CHILDREN? I would like to add that in too unless I am
> missing something and the optimization is incorrect.
We can't have contention on ->siglock when need_lock == 0, so why should
we optimize this case?
Oleg.
On Mon, Jan 09, 2006 at 09:55:51PM +0300, Oleg Nesterov wrote:
> Ravikiran G Thirumalai wrote:
> >
> >
> > Don't we still need rmb for the RUSAGE_SELF case? we do not take the
> > siglock for rusage self and the non c* signal fields are written to
> > at __exit_signal...
>
> I think it is unneeded because RUSAGE_SELF case is "racy" anyway even
> if we held both locks, task_struct->xxx counters can change at any
> moment.
>
> But may be you are right.
Hmm...access to task_struct->xxx has been racy, but accessing the
signal->* counters were not. What if read of the signal->utime was a
hoisted read and signal->stime was a read after the counter is updated?
This was not a possibility earlier no?
>
> > What is wrong with optimizing by not taking the siglock in RUSAGE_BOTH
> > and RUSAGE_CHILDREN? I would like to add that in too unless I am
> > missing something and the optimization is incorrect.
>
> We can't have contention on ->siglock when need_lock == 0, so why should
> we optimize this case?
We would be saving 1 buslocked operation in that case (on some arches), a
cacheline fetch for exclusive (since signal and sighand are on different memory
locations), and disabling/enabling onchip interrupts. But yes, this would be a
smaller optimization....Unless you have strong objections this can also
go in?
Thanks,
Kiran
Signed-off-by: Ravikiran Thirumalai <[email protected]>
Index: linux-2.6.15-mm2/kernel/sys.c
===================================================================
--- linux-2.6.15-mm2.orig/kernel/sys.c 2006-01-09 12:44:30.000000000 -0800
+++ linux-2.6.15-mm2/kernel/sys.c 2006-01-09 12:45:07.000000000 -0800
@@ -1730,14 +1730,16 @@ static void k_getrusage(struct task_stru
switch (who) {
case RUSAGE_BOTH:
case RUSAGE_CHILDREN:
- spin_lock_irqsave(&p->sighand->siglock, flags);
+ if (need_lock)
+ spin_lock_irqsave(&p->sighand->siglock, flags);
utime = p->signal->cutime;
stime = p->signal->cstime;
r->ru_nvcsw = p->signal->cnvcsw;
r->ru_nivcsw = p->signal->cnivcsw;
r->ru_minflt = p->signal->cmin_flt;
r->ru_majflt = p->signal->cmaj_flt;
- spin_unlock_irqrestore(&p->sighand->siglock, flags);
+ if (need_lock)
+ spin_unlock_irqrestore(&p->sighand->siglock, flags);
if (who == RUSAGE_CHILDREN)
break;