Date: Tue, 29 Aug 2006 11:05:11 -0700
From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: Paul Jackson <pj@sgi.com>
Cc: ego@in.ibm.com, mingo@elte.hu, nickpiggin@yahoo.com.au,
       arjan@infradead.org, rusty@rustcorp.com.au, torvalds@osdl.org,
       akpm@osdl.org, linux-kernel@vger.kernel.org, arjan@intel.linux.com,
       davej@redhat.com, dipankar@in.ibm.com, vatsa@in.ibm.com,
       ashok.raj@intel.com
Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug
Message-ID: <20060829180511.GA1495@us.ibm.com>
Reply-To: paulmck@us.ibm.com
References: <20060824103417.GE2395@in.ibm.com> <1156417200.3014.54.camel@laptopd505.fenrus.org> <20060824140342.GI2395@in.ibm.com> <1156429015.3014.68.camel@laptopd505.fenrus.org> <44EDBDDE.7070203@yahoo.com.au> <20060824150026.GA14853@elte.hu> <20060825035328.GA6322@in.ibm.com> <20060827005944.67f51e92.pj@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20060827005944.67f51e92.pj@sgi.com>
User-Agent: Mutt/1.4.1i
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3803
Lines: 81

On Sun, Aug 27, 2006 at 12:59:44AM -0700, Paul Jackson wrote:
> Gautham wrote:
> > Which is why I did not expose the locking (at least the write side of it)
> > outside. We don't want too many lazy programmers anyway! 
> 
> Good thing you did that ;).  This lazy programmer was already
> having fantasies of changing the cpuset locks (in kernel/cpuset.c)
> manage_mutex and callback_mutex to these writer and reader "unfair
> rwsem" locks, respectively.
> 
> Essentially these cpuset locks need to guard the cpuset hierarchy, just
> as your locks need to guard cpu_online_map.
> 
> The change agents (such as a system admin changing something in
> the /dev/cpuset hierarchy) are big slow mammoths that appear rarely,
> and need to single thread their entire operation, preventing anyone
> else from changing the cpuset hierarchy for an extended period of time,
> while they validate the request and set up to make the requested change
> or changes.
> 
> The inhibitors are a swarm of locusts, that change nothing, and need
> quick, safe access, free of change during a brief critical section.
> 
> Finally the mammoths must not trample the locusts (change what the
> locusts see during their critical sections.)
> 
> The cpuset change agents (mammoths) take manage_mutex for their entire
> operation, locking each other out.  They also take callback_mutex when
> making the actual changes that might momentarily make the cpuset
> structures inconsistent.

Can the locusts reasonably take a return value from the acquisition
primitive and feed it to the release primitive?
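
A minimal userspace sketch of one calling convention of this kind:
acquisition hands back a token and release consumes it.  Everything
here is an assumption made for illustration.  The names inhibit_lock,
inhibit_init(), inhibit_acquire(), inhibit_release(), inhibit_flip()
and inhibit_pending() are hypothetical and do not come from the patch.
The token is assumed to be a counter index, the same shape that
srcu_read_lock() and srcu_read_unlock() use.  The sketch also ignores
the window between sampling the epoch and incrementing the counter,
which a real implementation would have to close.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reader-side lock; none of these names come from the patch. */
struct inhibit_lock {
	atomic_int readers[2];	/* in-flight readers, one counter per epoch */
	atomic_int epoch;	/* low bit selects the counter new readers use */
};

static void inhibit_init(struct inhibit_lock *l)
{
	atomic_init(&l->readers[0], 0);
	atomic_init(&l->readers[1], 0);
	atomic_init(&l->epoch, 0);
}

/* Acquire: count the reader and return the counter index as a token. */
static int inhibit_acquire(struct inhibit_lock *l)
{
	int idx = atomic_load(&l->epoch) & 1;

	atomic_fetch_add(&l->readers[idx], 1);
	return idx;
}

/* Release: the token names the counter to drop, even if epoch has moved. */
static void inhibit_release(struct inhibit_lock *l, int idx)
{
	assert(idx == 0 || idx == 1);
	atomic_fetch_sub(&l->readers[idx], 1);
}

/* Writer side: flip the epoch so new readers use the other counter.
 * Returns the index that pre-existing readers are still counted under. */
static int inhibit_flip(struct inhibit_lock *l)
{
	return atomic_fetch_add(&l->epoch, 1) & 1;
}

/* Number of readers still holding the given index. */
static int inhibit_pending(struct inhibit_lock *l, int idx)
{
	return atomic_load(&l->readers[idx]);
}
```

A locust would then carry one int across its critical section:
"int tok = inhibit_acquire(&l); ... inhibit_release(&l, tok);".  The
cost to callers is that the token has to live somewhere between the
two calls, which is the practical side of the question.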

							Thanx, Paul

> The cpuset readers (locusts) take callback_mutex for the brief critical
> section during which they read out a value or two from the cpuset
> structures.
> 
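
The mammoth and locust roles described above can be sketched in
userspace as follows.  This is an illustration under stated
assumptions, not the code in kernel/cpuset.c: pthread mutexes stand in
for kernel mutexes, and struct toy_cpuset, toy_update() and toy_read()
are hypothetical names.  Only the lock names manage_mutex and
callback_mutex and the nesting order are taken from the description.

```c
#include <pthread.h>

/* Userspace stand-ins for the two cpuset locks. */
static pthread_mutex_t manage_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t callback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the guarded cpuset state. */
struct toy_cpuset {
	unsigned long cpus_allowed;
	unsigned long mems_allowed;
};

static struct toy_cpuset top = { 0xfUL, 0x1UL };

/* Mammoth: holds manage_mutex for the whole operation, and takes
 * callback_mutex only around the actual modification. */
static int toy_update(unsigned long cpus, unsigned long mems)
{
	int ret = -1;

	pthread_mutex_lock(&manage_mutex);
	if (cpus != 0 && mems != 0) {		/* slow validation step */
		pthread_mutex_lock(&callback_mutex);
		top.cpus_allowed = cpus;	/* state briefly inconsistent */
		top.mems_allowed = mems;
		pthread_mutex_unlock(&callback_mutex);
		ret = 0;
	}
	pthread_mutex_unlock(&manage_mutex);
	return ret;
}

/* Locust: takes only callback_mutex, for a brief read-out. */
static struct toy_cpuset toy_read(void)
{
	struct toy_cpuset snap;

	pthread_mutex_lock(&callback_mutex);
	snap = top;
	pthread_mutex_unlock(&callback_mutex);
	return snap;
}
```

Validation runs with only manage_mutex held, so locusts are blocked
just for the two assignments, never for the slow part of the change.
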
> If I understand your unfair rwsem proposal, the cpuset locks differ
> from your proposal in these ways:
>  1) The cpuset locks are crafted from existing mutex mechanisms.
>  2) The 'reader' (change inhibitor) side, aka the callback_mutex,
>     is single threaded.  No nesting or sharing of that lock allowed.
>     This is ok for inhibiting changes to the cpuset structures, as
>     that is not done on any kernel hot code paths (the cpuset
>     'locusts' are actually few in number, with tricks such as
>     the cpuset generation number used to suppress the locust
>     population.)  It would obviously not be ok to single thread
>     reading cpu_online_map.
>  3) The 'writer' side (the mammoths), after taking the big
>     manage_mutex for its entire operation, *also* has to take the
>     small callback_mutex briefly around any actual changes, to lock
>     out readers.
>  4) Frequently accessed cpuset data, such as the cpus and memory
>     nodes allowed to a task, are cached in the task struct, to
>     keep from accessing the task's cpuset frequently (more locust
>     population suppression.)
> 
> Differences (2) and (4) are compromises, imposed by difference (1).
> 
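
The generation-number trick in (2) and the per-task cache in (4) can
be sketched together.  This is a userspace illustration under
assumptions, not the kernel code: struct toy_cpuset, struct toy_task,
toy_set_mems() and toy_refresh() are hypothetical names, and a C11
atomic stands in for whatever the kernel uses to publish the
generation.  The idea shown is that a task consults its cached copy
and takes callback_mutex only when the generation has moved.

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t callback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical cpuset: data plus a generation bumped on each change. */
struct toy_cpuset {
	unsigned long mems_allowed;
	atomic_int mems_generation;
};

/* Hypothetical task: cached copy and the generation it was taken at. */
struct toy_task {
	struct toy_cpuset *cpuset;
	unsigned long mems_allowed;
	int cpuset_mems_generation;
};

/* Writer: change the data and advance the generation under the lock. */
static void toy_set_mems(struct toy_cpuset *cs, unsigned long mems)
{
	pthread_mutex_lock(&callback_mutex);
	cs->mems_allowed = mems;
	atomic_fetch_add(&cs->mems_generation, 1);
	pthread_mutex_unlock(&callback_mutex);
}

/* Reader: returns 0 if the cache was current (no lock taken),
 * 1 if it had to take callback_mutex to refresh the cache. */
static int toy_refresh(struct toy_task *t)
{
	struct toy_cpuset *cs = t->cpuset;

	if (atomic_load(&cs->mems_generation) == t->cpuset_mems_generation)
		return 0;

	pthread_mutex_lock(&callback_mutex);
	t->mems_allowed = cs->mems_allowed;
	t->cpuset_mems_generation = atomic_load(&cs->mems_generation);
	pthread_mutex_unlock(&callback_mutex);
	return 1;
}
```

In the common case nothing has changed, so the task pays for one
atomic load and a compare, and the single-threaded callback_mutex sees
traffic only after a mammoth has been through.
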
> The day might come when there are too many cpuset locusts -- too many
> tasks taking the cpuset callback_mutex, and then something like your
> unfair rwsems could become enticing.
> 
> -- 
>                   I won't rest till it's the best ...
>                   Programmer, Linux Scalability
>                   Paul Jackson <pj@sgi.com> 1.925.600.0401
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 