Date: Fri, 24 Jul 2009 10:23:20 -0700
From: Matt Helsley
To: Paul Menage
Cc: Matt Helsley, Ben Blum, linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org, akpm@linux-foundation.org, serue@us.ibm.com, lizf@cn.fujitsu.com
Subject: Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid at once
Message-ID: <20090724172320.GH5878@count0.beaverton.ibm.com>
References: <20090724032033.2463.79256.stgit@hastromil.mtv.corp.google.com>
 <20090724032200.2463.82408.stgit@hastromil.mtv.corp.google.com>
 <20090724155041.GF5878@count0.beaverton.ibm.com>
 <6599ad830907240901w4fc02097k83d0c1012708e6ee@mail.gmail.com>
In-Reply-To: <6599ad830907240901w4fc02097k83d0c1012708e6ee@mail.gmail.com>

On Fri, Jul 24, 2009 at 09:01:46AM -0700, Paul Menage wrote:
> On Fri, Jul 24, 2009 at 8:50 AM, Matt Helsley wrote:
> >
> > There is much ado about not taking additional "global locks" in fork()
> > paths:
> >
> >  * The fork and exit callbacks cgroup_fork() and cgroup_exit(), don't
> >  * (usually) take cgroup_mutex.  These are the two most performance
> >  * critical pieces of code here.
> >
> > and as I recall cgroup_fork() doesn't ever take cgroup_mutex because it is
> > so performance critical.
> cgroup_mutex is a much bigger and heavier mutex than the new rwsem
> being introduced in this patch. It's sort of the BKL of cgroups,
> although where possible I'm encouraging use of finer-grained
> alternatives (such as subsystem-specific locks, the per-hierarchy
> lock, etc).
>
> > Assuming the above comments in kernel/cgroup.c
> > are correct then this patch adds a performance regression by introducing a
> > global mutex in the fork path, doesn't it?
>
> Yes, although to what magnitude isn't clear.

OK.

> Alternatives that we looked at were:
>
> - add a clone_rwsem to task_struct, and require that a clone operation
> that's adding to the same thread group take a read lock on the
> leader's clone_rwsem; then the effect would be localised to a single
> process; but for a system that has one big multi-threaded server on
> it, the effect would still be similar to a global lock

Except most processes aren't big multi-threaded servers, and not everyone
runs such processes. Those systems would still pay the overhead of a
global lock when they don't have to. Again, I think it's a question of
magnitudes we don't know.

> - move the housekeeping done by cgroup_fork() inside the tasklist_lock
> critical section in do_fork(); then cgroup_attach_proc() can rely on
> the existing global tasklist_lock to provide the necessary
> synchronization, rather than introducing a second global lock; the
> downside is that it slightly increases the size of the section where
> tasklist_lock is held for write.

Well, I imagine holding tasklist_lock is worse than cgroup_mutex in some
ways since it's used even more widely. Makes sense not to use it here.

Cheers,
	-Matt Helsley