Subject: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug


--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"



2006-08-24 11:00:41

by Arjan van de Ven

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Thu, 2006-08-24 at 16:04 +0530, Gautham R Shenoy wrote:
>
>
> This patch renames lock_cpu_hotplug to cpu_hotplug_disable and
> unlock_cpu_hotplug to cpu_hotplug_enable throughout the kernel.

Hi,

to be honest I dislike the new names too. You turned it into a refcount,
which is good, but the normal linux name for such refcount functions is
_get and _put..... and in addition the refcount technically isn't
hotplug specific, all you want is to keep the kernel data for the
processor as being "used", so cpu_get() and cpu_put() would sound
reasonable names to me, or cpu_data_get() cpu_data_put().


Greetings,
Arjan van de Ven

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Thu, Aug 24, 2006 at 01:00:00PM +0200, Arjan van de Ven wrote:
> On Thu, 2006-08-24 at 16:04 +0530, Gautham R Shenoy wrote:
> >
> >
> > This patch renames lock_cpu_hotplug to cpu_hotplug_disable and
> > unlock_cpu_hotplug to cpu_hotplug_enable throughout the kernel.
>
> Hi,
>
> to be honest I dislike the new names too. You turned it into a refcount,
> which is good, but the normal linux name for such refcount functions is
> _get and _put..... and in addition the refcount technically isn't
> hotplug specific, all you want is to keep the kernel data for the
> processor as being "used", so cpu_get() and cpu_put() would sound
> reasonable names to me, or cpu_data_get() cpu_data_put().

Thus, the choice of 'cpu_hotplug_disable' and 'cpu_hotplug_enable'
was determined by its purpose, i.e. *what* it does
as opposed to *how* it does it. :)

The name is not much of a concern. I just didn't feel all that
comfortable with lock_cpu_hotplug because locks are usually
used to safeguard data, while here we are trying to
maintain the *state* of the system.

Regards
ego
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

2006-08-24 14:17:15

by Arjan van de Ven

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Thu, 2006-08-24 at 19:33 +0530, Gautham R Shenoy wrote:
> On Thu, Aug 24, 2006 at 01:00:00PM +0200, Arjan van de Ven wrote:
> > On Thu, 2006-08-24 at 16:04 +0530, Gautham R Shenoy wrote:
> > >
> > >
> > > This patch renames lock_cpu_hotplug to cpu_hotplug_disable and
> > > unlock_cpu_hotplug to cpu_hotplug_enable throughout the kernel.
> >
> > Hi,
> >
> > to be honest I dislike the new names too. You turned it into a refcount,
> > which is good, but the normal linux name for such refcount functions is
> > _get and _put..... and in addition the refcount technically isn't
> > hotplug specific, all you want is to keep the kernel data for the
> > processor as being "used", so cpu_get() and cpu_put() would sound
> > reasonable names to me, or cpu_data_get() cpu_data_put().
>
> Thus, the choice of 'cpu_hotplug_disable' and 'cpu_hotplug_enable'
> was determined by its purpose, i.e. *what* it does
> as opposed to *how* it does it. :)

well.. it comes down to the difference between locking to protect data
and locking to protect against a specific piece of code. Almost always
the latter turns out to be a mistake...


2006-08-24 14:56:06

by Nick Piggin

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Arjan van de Ven wrote:
> On Thu, 2006-08-24 at 19:33 +0530, Gautham R Shenoy wrote:
>
>>On Thu, Aug 24, 2006 at 01:00:00PM +0200, Arjan van de Ven wrote:
>>
>>>On Thu, 2006-08-24 at 16:04 +0530, Gautham R Shenoy wrote:
>>>
>>>>
>>>>This patch renames lock_cpu_hotplug to cpu_hotplug_disable and
>>>>unlock_cpu_hotplug to cpu_hotplug_enable throughout the kernel.
>>>
>>>Hi,
>>>
>>>to be honest I dislike the new names too. You turned it into a refcount,
>>>which is good, but the normal linux name for such refcount functions is
>>>_get and _put..... and in addition the refcount technically isn't
>>>hotplug specific, all you want is to keep the kernel data for the
>>>processor as being "used", so cpu_get() and cpu_put() would sound
>>>reasonable names to me, or cpu_data_get() cpu_data_put().
>>
>>Thus, the choice of 'cpu_hotplug_disable' and 'cpu_hotplug_enable'
>>was determined by its purpose, i.e. *what* it does
>>as opposed to *how* it does it. :)
>
>
> well.. it comes down to the difference between locking to protect data
> and locking to protect against a specific piece of code. Almost always
> the latter turns out to be a mistake...

But it is not protecting a cpu from going away; it is protecting ALL
cpus from coming or going. In that respect it is much more like a
cpu_online_map lock than a data structure refcount.

It really is just like a reentrant rw semaphore... I don't see the
point of the name change, but I guess we don't like reentrant locks so
calling it something else might go down better with Linus ;)

--
SUSE Labs, Novell Inc.

2006-08-24 15:08:39

by Ingo Molnar

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug


* Nick Piggin <[email protected]> wrote:

> It really is just like a reentrant rw semaphore... I don't see the
> point of the name change, but I guess we don't like reentrant locks so
> calling it something else might go down better with Linus ;)

what would fit best is a per-cpu scalable (on the read-side)
self-reentrant rw mutex. We are doing cpu hotplug locking in things like
fork or the slab code, while most boxes will do a CPU hotplug event only
once in the kernel's lifetime (during bootup), so a classic global
read-write lock is unjustified.
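
Very roughly, the read side could look like the sketch below -- purely
illustrative, not a proposal: it omits the "writer pending" check that
stops new readers, writer-vs-writer exclusion and all memory ordering,
and the names are made up.

#include <linux/percpu.h>
#include <linux/delay.h>

/* one counter per cpu: readers never touch a shared cacheline */
static DEFINE_PER_CPU(int, hp_reader_count);

static void hp_read_lock(void)
{
        get_cpu_var(hp_reader_count)++;  /* recursion-safe: it is only a count */
        put_cpu_var(hp_reader_count);
}

static void hp_read_unlock(void)
{
        /* may run on another cpu after a migration; only the sum matters */
        get_cpu_var(hp_reader_count)--;
        put_cpu_var(hp_reader_count);
}

/* the rare writer pays the cost: sum all the counters and wait for zero */
static void hp_write_lock(void)
{
        int cpu, readers;

        do {
                readers = 0;
                for_each_possible_cpu(cpu)
                        readers += per_cpu(hp_reader_count, cpu);
                if (readers)
                        msleep(1);
        } while (readers);
}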

Ingo

2006-08-24 15:54:16

by Nick Piggin

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Ingo Molnar wrote:
> * Nick Piggin <[email protected]> wrote:
>
>
>>It really is just like a reentrant rw semaphore... I don't see the
>>point of the name change, but I guess we don't like reentrant locks so
>>calling it something else might go down better with Linus ;)
>
>
> what would fit best is a per-cpu scalable (on the read-side)
> self-reentrant rw mutex. We are doing cpu hotplug locking in things like
> fork or the slab code, while most boxes will do a CPU hotplug event only
> once in the kernel's lifetime (during bootup), so a classic global
> read-write lock is unjustified.

I agree with you completely.

--
SUSE Labs, Novell Inc.

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Thu, Aug 24, 2006 at 05:00:26PM +0200, Ingo Molnar wrote:
>
> * Nick Piggin <[email protected]> wrote:
>
> > It really is just like a reentrant rw semaphore... I don't see the
> > point of the name change, but I guess we don't like reentrant locks so
> > calling it something else might go down better with Linus ;)
>
> what would fit best is a per-cpu scalable (on the read-side)
> self-reentrant rw mutex. We are doing cpu hotplug locking in things like
> fork or the slab code, while most boxes will do a CPU hotplug event only
> once in the kernel's lifetime (during bootup), so a classic global
> read-write lock is unjustified.

I agree. However, I was not sure whether anything other than cpu_hotplug
needs a self-reentrant rwmutex in the kernel, which is why I did not
expose the locking (at least the write side of it) outside. We don't
want too many lazy programmers anyway!

However, even in the case of cpu_hotplug, if we want to prevent
a hotplug event in some critical region where we are not going to sleep,
we may as well use preempt_disable[/enable], because __stop_machine_run
waits for all the tasks in the fast path to complete before it changes
the online cpus map, if I am not mistaken.
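
For instance, if that assumption about __stop_machine_run holds, a short
non-sleeping walk of the online map should be safe with nothing more than
preemption disabled (illustrative sketch only):

#include <linux/cpumask.h>
#include <linux/preempt.h>

/*
 * Sketch: with preemption disabled we stay in the "fast path" that
 * __stop_machine_run waits for, so the online map cannot change under
 * us here.  Must not sleep between preempt_disable()/preempt_enable().
 */
static int count_online_cpus_nosleep(void)
{
        int cpu, n = 0;

        preempt_disable();
        for_each_online_cpu(cpu)
                n++;
        preempt_enable();

        return n;
}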

Only when you want your view of the online cpu map to remain intact
across a sleep should you use the cpu_hotplug *lock*.

Ingo?

--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

2006-08-27 08:00:20

by Paul Jackson

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Gautham wrote:
> Which is why I did not expose the locking(at least the write side of it)
> outside. We don't want too many lazy programmers anyway!

Good thing you did that ;). This lazy programmer was already
having fantasies of changing the cpuset locks (in kernel/cpuset.c)
manage_mutex and callback_mutex to these writer and reader "unfair
rwsem" locks, respectively.

Essentially these cpuset locks need to guard the cpuset hierarchy, just
as your locks need to guard cpu_online_map.

The change agents (such as a system admin changing something in
the /dev/cpuset hierarchy) are big slow mammoths that appear rarely,
and need to single thread their entire operation, preventing anyone
else from changing the cpuset hierarchy for an extended period of time,
while they validate the request and setup to make the requested change
or changes.

The inhibitors are a swarm of locusts, that change nothing, and need
quick, safe access, free of change during a brief critical section.

Finally the mammoths must not trample the locusts (change what the
locusts see during their critical sections.)

The cpuset change agents (mammoths) take manage_mutex for their entire
operation, locking each other out. They also take callback_mutex when
making the actual changes that might momentarily make the cpuset
structures inconsistent.

The cpuset readers (locusts) take callback_mutex for the brief critical
section during which they read out a value or two from the cpuset
structures.
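
In outline the two paths look like this (a sketch of the pattern, not a
verbatim excerpt from kernel/cpuset.c; the helper names are made up):

#include <linux/mutex.h>

static DEFINE_MUTEX(manage_mutex);              /* as in kernel/cpuset.c */
static DEFINE_MUTEX(callback_mutex);

/* mammoth: manage_mutex for the whole operation, callback_mutex only
 * around the brief window where the structures may look inconsistent */
static void mammoth_update(void)
{
        mutex_lock(&manage_mutex);
        /* validate the request, allocate, set things up -- may sleep */
        mutex_lock(&callback_mutex);
        /* flip the cpuset fields; keep this window tiny */
        mutex_unlock(&callback_mutex);
        mutex_unlock(&manage_mutex);
}

/* locust: a brief critical section to read out a value or two */
static void locust_read(void)
{
        mutex_lock(&callback_mutex);
        /* read a value or two from the task's cpuset */
        mutex_unlock(&callback_mutex);
}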

If I understand your unfair rwsem proposal, the cpuset locks differ
from your proposal in these ways:
1) The cpuset locks are crafted from existing mutex mechanisms.
2) The 'reader' (change inhibitor) side, aka the callback_mutex,
is single threaded. No nesting or sharing of that lock allowed.
This is ok for inhibiting changes to the cpuset structures, as
that is not done on any kernel hot code paths (the cpuset
'locusts' are actually few in number, with tricks such as
the cpuset generation number used to suppress the locust
population.) It would obviously not be ok to single thread
reading cpu_online_map.
3) The 'writer' side (the mammoths), after taking the big
manage_mutex for its entire operation, *also* has to take the
small callback_mutex briefly around any actual changes, to lock
out readers.
4) Frequently accessed cpuset data, such as the cpus and memory
nodes allowed to a task, are cached in the task struct, to
keep from accessing the task's cpuset frequently (more locust
population suppression.)

Differences (2) and (4) are compromises, imposed by difference (1).

The day might come when there are too many cpuset locusts -- too many
tasks taking the cpuset callback_mutex, and then something like your
unfair rwsems could become enticing.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-08-27 08:42:12

by Keith Owens

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Paul Jackson (on Sun, 27 Aug 2006 00:59:44 -0700) wrote:
>The change agents (such as a system admin changing something in
>the /dev/cpuset hierarchy) are big slow mammoths that appear rarely,
>and need to single thread their entire operation, preventing anyone
>else from changing the cpuset hierarchy for an extended period of time,
>while they validate the request and setup to make the requested change
>or changes.
>
>The inhibitors are a swarm of locusts, that change nothing, and need
>quick, safe access, free of change during a brief critical section.
>
>Finally the mammoths must not trample the locusts (change what the
>locusts see during their critical sections.)

This requirement, and the similar requirement that cpu_online_mask
cannot change while anybody is reading it, both appear to cry out for
stop_machine(). Readers must be able to access the cpu-related
structures at all times, without any extra locking. Updaters (which by
definition are extremely rare) stop all other cpus, do their work, then
restart the machine. That way only kernel/stop_machine.c has to care
that the cpu masks might change underneath it; the rest of the kernel
is protected with zero overhead.
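
That is, updates would funnel through something like this (a sketch of
the idea only; the callback and wrapper are hypothetical):

#include <linux/stop_machine.h>
#include <linux/cpumask.h>

/*
 * While this callback runs, every other cpu is spinning with interrupts
 * disabled, so the map can be edited without any reader-side locking.
 */
static int __clear_cpu_online(void *data)
{
        long cpu = (long)data;

        cpu_clear(cpu, cpu_online_map);
        return 0;
}

static int clear_cpu_online(long cpu)
{
        /* NR_CPUS means "run the callback on any cpu" */
        return stop_machine_run(__clear_cpu_online, (void *)cpu, NR_CPUS);
}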

2006-08-27 09:11:27

by Paul Jackson

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Keith wrote:
> Updaters (which by
> definition are extremely rare) stop all other cpus, do their work then
> restart_machine().

Cpuset updaters are rare, but they are not -that- rare.

And the cpuset file system allows one to configure some
cpusets to be modifiable by any end user.

One would not want such an end user to be able to stop
the machine at will by manipulating the cpusets they
are allowed to modify.

Nor would one want the cpuset actions done by, say, a
batch scheduler during one job's setup to bring all the
other presently active jobs to a grinding halt.

So perhaps taking CPUs offline merits a stop_machine().

But I've little desire to stop_machine() when modifying
cpusets.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-08-29 18:04:30

by Paul E. McKenney

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Sun, Aug 27, 2006 at 12:59:44AM -0700, Paul Jackson wrote:
> Gautham wrote:
> > Which is why I did not expose the locking(at least the write side of it)
> > outside. We don't want too many lazy programmers anyway!
>
> Good thing you did that ;). This lazy programmer was already
> having fantasies of changing the cpuset locks (in kernel/cpuset.c)
> manage_mutex and callback_mutex to these writer and reader "unfair
> rwsem" locks, respectively.
>
> Essentially these cpuset locks need to guard the cpuset hierarchy, just
> as your locks need to guard cpu_online_map.
>
> The change agents (such as a system admin changing something in
> the /dev/cpuset hierarchy) are big slow mammoths that appear rarely,
> and need to single thread their entire operation, preventing anyone
> else from changing the cpuset hierarchy for an extended period of time,
> while they validate the request and setup to make the requested change
> or changes.
>
> The inhibitors are a swarm of locusts, that change nothing, and need
> quick, safe access, free of change during a brief critical section.
>
> Finally the mammoths must not trample the locusts (change what the
> locusts see during their critical sections.)
>
> The cpuset change agents (mammoths) take manage_mutex for their entire
> operation, locking each other out. They also take callback_mutex when
> making the actual changes that might momentarilly make the cpuset
> structures inconsistent.

Can the locusts reasonably take a return value from the acquisition
primitive and feed it to the release primitive?

Thanx, Paul

> The cpuset readers (locusts) take callback_mutex for the brief critical
> section during which they read out a value or two from the cpuset
> structures.
>
> If I understand your unfair rwsem proposal, the cpuset locks differ
> from your proposal in these ways:
> 1) The cpuset locks are crafted from existing mutex mechanisms.
> 2) The 'reader' (change inhibitor) side, aka the callback_mutex,
> is single threaded. No nesting or sharing of that lock allowed.
> This is ok for inhibiting changes to the cpuset structures, as
> that is not done on any kernel hot code paths (the cpuset
> 'locusts' are actually few in number, with tricks such as
> the cpuset generation number used to suppress the locust
> population.) It would obviously not be ok to single thread
> reading cpu_online_map.
> 3) The 'writer' side (the mammoths), after taking the big
> manage_mutex for its entire operation, *also* has to take the
> small callback_mutex briefly around any actual changes, to lock
> out readers.
> 4) Frequently accessed cpuset data, such as the cpus and memory
> nodes allowed to a task, are cached in the task struct, to
> keep from accessing the tasks cpuset frequently (more locust
> population suppression.)
>
> Differences (2) and (4) are compromises, imposed by difference (1).
>
> The day might come when there are too many cpuset locusts -- too many
> tasks taking the cpuset callback_mutex, and then something like your
> unfair rwsem's could become enticing.
>
> --
> I won't rest till it's the best ...
> Programmer, Linux Scalability
> Paul Jackson <[email protected]> 1.925.600.0401

2006-08-29 19:31:51

by Paul Jackson

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Paul E. McKenney wrote:
> Can the locusts reasonably take a return value from the acquisition
> primitive and feed it to the release primitive?

Yes - the locusts typically do:

mutex_lock(&callback_mutex);
... a line or two to read or write cpusets ...
mutex_unlock(&callback_mutex);

The lock and unlock are just a few lines apart. I could easily pass
a value from the lock (acquisition) to the unlock (release).

Why do you ask?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-08-29 20:02:22

by Paul E. McKenney

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Tue, Aug 29, 2006 at 12:31:02PM -0700, Paul Jackson wrote:
> Paul E. McKenney wrote:
> > Can the locusts reasonably take a return value from the acquisition
> > primitive and feed it to the release primitive?
>
> Yes - the locusts typically do:
>
> mutex_lock(&callback_mutex);
> ... a line or two to read or write cpusets ...
> mutex_unlock(&callback_mutex);
>
> The lock and unlock are just a few lines apart. I could easily pass
> a value from the lock (acquisition) to the unlock (release).
>
> Why do you ask?

Because passing the value from the acquire to the release could remove
the need to store anything in the task structure, while still allowing
the freedom of implementation that storing things in the task structure
would provide.
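
That is, an interface shaped roughly like this (names made up, purely to
show the idea):

struct locust_lock;     /* implementation hidden from the callers */

/* the token encodes whatever the implementation needs at release time:
 * a per-cpu slot, a grace-period index, ... */
int  locust_read_acquire(struct locust_lock *l);
void locust_read_release(struct locust_lock *l, int token);

/* typical locust usage */
static void example_reader(struct locust_lock *l)
{
        int token = locust_read_acquire(l);

        /* ... a line or two to read or write cpusets ... */

        locust_read_release(l, token);
}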

Let me throw something together...

Thanx, Paul

2006-08-30 02:38:57

by Paul Jackson

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Paul E. McKenney wrote:
> Let me throw something together...

I think it's the hotplug folks that were most interested
in this "unfair rwsem" lock, for lock_cpu_hotplug().

I wouldn't spend any effort on throwing this together
just for cpusets. I'm not looking to change cpuset
locking anytime soon.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-08-30 15:13:24

by Paul E. McKenney

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Tue, Aug 29, 2006 at 07:38:28PM -0700, Paul Jackson wrote:
> Paul E. McKenney wrote:
> > Let me throw something together...
>
> I think it's the hotplug folks that were most interested
> in this "unfair rwsem" lock, for lock_cpu_hotplug().

Yep, been chatting with them separately.

> I wouldn't spend any effort on throwing this together
> just for cpusets. I'm not looking to change cpuset
> locking anytime soon.

Well, my next question was going to be whether cpuset readers really
need to exclude the writers, or whether there can be a transition
period while the mastodon makes the change as long as it avoids stomping
the locusts. ;-)

Thanx, Paul

2006-08-30 17:55:21

by Paul Jackson

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

Paul E. McKenney wrote:
> Well, my next question was going to be whether cpuset readers really
> need to exclude the writers, or whether there can be a transition
> period while the mastodon makes the change as long as it avoids stomping
> the locusts. ;-)

The mastodons (aka mammoths ;) may make a batch of several related
changes to the cpuset configuration. What's important is that the
locusts see either none or all of the changes in a given batch, not
some intermediate inconsistent state, and that the locusts see the
change batches in the same order they were applied.

Off the top of my head, I doubt I care when the locusts see the
changes. Some delay is ok, if that's your question.

But don't try too hard to fit any work you do to cpusets. For now,
I don't plan to mess with cpuset locking anytime soon. And when I
do next, it might be that all I need to do is to change the quick
lock held by the locusts from a mutex to an ordinary rwsem, so that
multiple readers (locusts) can access the cpuset configuration in
parallel.
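
That minimal change would look roughly like this (illustrative sketch;
callback_rwsem is a hypothetical stand-in for today's callback_mutex):

#include <linux/rwsem.h>

static DECLARE_RWSEM(callback_rwsem);

/* locusts: any number of them in parallel */
static void locust_read(void)
{
        down_read(&callback_rwsem);
        /* ... read a value or two ... */
        up_read(&callback_rwsem);
}

/* mammoth, already holding manage_mutex for the whole operation */
static void mammoth_apply_change(void)
{
        down_write(&callback_rwsem);
        /* ... apply the change; excludes all readers briefly ... */
        up_write(&callback_rwsem);
}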

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-08-30 18:13:05

by Paul E. McKenney

Subject: Re: [RFC][PATCH 4/4] Rename lock_cpu_hotplug/unlock_cpu_hotplug

On Wed, Aug 30, 2006 at 10:54:34AM -0700, Paul Jackson wrote:
> Paul E. McKenney wrote:
> > Well, my next question was going to be whether cpuset readers really
> > need to exclude the writers, or whether there can be a transition
> > period while the mastodon makes the change as long as it avoids stomping
> > the locusts. ;-)
>
> The mastodon's (aka mammoths ;) may make a batch of several related
> changes to the cpuset configuration. What's important is that the
> locusts see either none or all of the changes in a given batch, not
> some intermediate inconsistent state, and that the locusts see the
> change batches in the same order they were applied.
>
> Off the top of my head, I doubt I care when the locusts see the
> changes. Some delay is ok, if that's your question.
>
> But don't try too hard to fit any work you do to cpusets. For now,
> I don't plan to mess with cpuset locking anytime soon. And when I
> do next, it might be that all I need to do is to change the quick
> lock held by the locusts from a mutex to an ordinary rwsem, so that
> multiple readers (locusts) can access the cpuset configuration in
> parallel.

Sounds like a job for RCU (if locusts never sleep in critical sections)
or SRCU (if locusts do block). ;-) Seriously, this would get you
deterministic read-side acquisition overhead. And very low overhead
at that -- no memory barriers, cache misses, or atomic instructions
in the common case of no mammoths/mastodons.

That said, I am not familiar enough with cpusets to know if there are
any gotchas in making the change appear atomic. The usual approach
is to link updates in, so that readers either see the new state or
do not, but don't see any intermediate states. I would guess that
updates sometimes involve moving resources from one set to another,
and that such moves must be atomic in the "mv" command sense. There
are some reasonably straightforward ways of making this happen.
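
For instance, the usual publish/subscribe idiom looks something like the
sketch below (names made up; the mammoths' existing mutex would still
serialize updaters against each other):

#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/cpumask.h>
#include <linux/nodemask.h>

/* made-up structure standing in for "one batch of cpuset state" */
struct cs_config {
        cpumask_t  cpus_allowed;
        nodemask_t mems_allowed;
};

static struct cs_config *cur_config;            /* assume set up at boot */

/* mammoth: build the whole batch off to the side, publish it with one
 * pointer assignment, free the old copy once the old readers are done */
static int mammoth_update(const struct cs_config *newval)
{
        struct cs_config *new, *old;

        new = kmalloc(sizeof(*new), GFP_KERNEL);
        if (!new)
                return -ENOMEM;
        *new = *newval;

        old = cur_config;
        rcu_assign_pointer(cur_config, new);
        synchronize_rcu();
        kfree(old);
        return 0;
}

/* locust: sees either the old batch or the new one, never a mix */
static void locust_read(cpumask_t *cpus)
{
        struct cs_config *cfg;

        rcu_read_lock();
        cfg = rcu_dereference(cur_config);
        *cpus = cfg->cpus_allowed;
        rcu_read_unlock();
}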

Thanx, Paul