2009-09-02 05:33:46

by Peter Zijlstra

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Fri, 2009-08-28 at 15:30 +0530, Gautham R Shenoy wrote:
> Hi,
>
> This is version 2 of the patch series to provide a cpu-offline framework
> that enables administrators to choose the state the offline CPU must be put
> into when multiple such states are exposed by the underlying architecture.
>
> Version 1 of the Patch can be found here:
> http://lkml.org/lkml/2009/8/6/236
>
> The patch series exposes the following sysfs tunables to
> allow the system administrator to choose the state of a CPU:
>
> To query the available hotplug states, one needs to read the sysfs tunable:
> /sys/devices/system/cpu/cpu<number>/available_hotplug_states
> To query or set the current state, one needs to read/write the sysfs tunable:
> /sys/devices/system/cpu/cpu<number>/current_state
>
> The patchset ensures that the writes to the "current_state" sysfs file are
> serialized against the writes to the "online" file.
>
> This patchset also contains the offline state driver implemented for
> pSeries. For pSeries, we define three available_hotplug_states. They are:
>
> online: The processor is online.
>
> deallocate: This is the default behaviour when the cpu is offlined
> even in the absence of this driver. The CPU would make an
> rtas_stop_self() call and hand the CPU back to the resource pool,
> thereby effectively deallocating that vCPU from the LPAR.
> NOTE: This would result in a configuration change to the LPAR
> which is visible to the outside world.
>
> deactivate: This cedes the vCPU to the hypervisor which
> in turn can put the vCPU time to the best use.
> NOTE: This option DOES NOT result in a configuration change
> and the vCPU would still be entitled to the LPAR to which it earlier
> belonged.
>
> Awaiting your feedback.

I'm still thinking this is a bad idea.

The OS should only know about online/offline.

Use the hypervisor interface to deal with the cpu once it's offline.

That is, I think this interface you propose is a layering violation.
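
For reference, the sysfs interface described in the quoted cover letter could be exercised roughly as follows. This is only a minimal sketch, not code from the patch series: the attribute names `available_hotplug_states` and `current_state` are taken from the cover letter, while the helper names, the `root` parameter, and the assumption that the available states are listed as whitespace-separated names are hypothetical.

```python
from pathlib import Path
from typing import List

# Path taken from the cover letter; only present on a kernel carrying the
# proposed patches.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def available_states(cpu: int, root: Path = SYSFS_CPU_ROOT) -> List[str]:
    """Return the states advertised for a CPU.

    Assumption: the file holds whitespace-separated state names.
    """
    text = (root / f"cpu{cpu}" / "available_hotplug_states").read_text()
    return text.split()


def current_state(cpu: int, root: Path = SYSFS_CPU_ROOT) -> str:
    """Return the state the CPU is currently in."""
    return (root / f"cpu{cpu}" / "current_state").read_text().strip()


def set_state(cpu: int, state: str, root: Path = SYSFS_CPU_ROOT) -> None:
    """Request a new state, rejecting names the CPU does not advertise."""
    if state not in available_states(cpu, root):
        raise ValueError(f"cpu{cpu} does not advertise state {state!r}")
    (root / f"cpu{cpu}" / "current_state").write_text(state)
```

On pSeries with the proposed driver, `set_state(3, "deactivate")` would then correspond to ceding vCPU 3 to the hypervisor, and `set_state(3, "deallocate")` to the existing rtas_stop_self() path. The `root` parameter exists only so the helpers can be pointed at a scratch directory for testing.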


2009-09-02 20:02:22

by Pavel Machek

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Wed 2009-09-02 07:33:31, Peter Zijlstra wrote:
> On Fri, 2009-08-28 at 15:30 +0530, Gautham R Shenoy wrote:
> > Hi,
> >
> > This is version 2 of the patch series to provide a cpu-offline framework
> > that enables administrators to choose the state the offline CPU must be put
> > into when multiple such states are exposed by the underlying architecture.
> >
> > Version 1 of the Patch can be found here:
> > http://lkml.org/lkml/2009/8/6/236
> >
> > The patch series exposes the following sysfs tunables to
> > allow the system administrator to choose the state of a CPU:
> >
> > To query the available hotplug states, one needs to read the sysfs tunable:
> > /sys/devices/system/cpu/cpu<number>/available_hotplug_states
> > To query or set the current state, one needs to read/write the sysfs tunable:
> > /sys/devices/system/cpu/cpu<number>/current_state
> >
> > The patchset ensures that the writes to the "current_state" sysfs file are
> > serialized against the writes to the "online" file.
> >
> > This patchset also contains the offline state driver implemented for
> > pSeries. For pSeries, we define three available_hotplug_states. They are:
> >
> > online: The processor is online.
> >
> > deallocate: This is the default behaviour when the cpu is offlined
> > even in the absence of this driver. The CPU would make an
> > rtas_stop_self() call and hand the CPU back to the resource pool,
> > thereby effectively deallocating that vCPU from the LPAR.
> > NOTE: This would result in a configuration change to the LPAR
> > which is visible to the outside world.
> >
> > deactivate: This cedes the vCPU to the hypervisor which
> > in turn can put the vCPU time to the best use.
> > NOTE: This option DOES NOT result in a configuration change
> > and the vCPU would still be entitled to the LPAR to which it earlier
> > belonged.
> >
> > Awaiting your feedback.
>
> I'm still thinking this is a bad idea.
>
> The OS should only know about online/offline.
>
> Use the hypervisor interface to deal with the cpu once it's offline.
>
> That is, I think this interface you propose is a layering violation.

Agreed. Plus having an interface like 'go to this state during offline'
then 'go offline' is strange/stupid. For the hypervisor case, you might
want to change the 'state' of a cpu that is already offline.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-09-24 00:50:52

by Benjamin Herrenschmidt

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Wed, 2009-09-02 at 07:33 +0200, Peter Zijlstra wrote:
>
> I'm still thinking this is a bad idea.
>
> The OS should only know about online/offline.
>
> Use the hypervisor interface to deal with the cpu once it's offline.
>
> That is, I think this interface you propose is a layering violation.
>
I don't quite follow your logic here. This is useful for more than just
hypervisors. For example, take the HV out of the picture for a moment
and imagine that the HW has the ability to offline CPU in various power
levels, with varying latencies to bring them back.

For example, the HW can put them in some low power state where they can
be re-plugged quickly, or can shutdown entire power planes completely,
possibly allowing physical hotplug, but that takes much longer to bring
them back into the pool.

In any case, regarding the pseries case, this is how our hypervisor
works and I don't think we can change it, other than by always going
into the "cede" function and having some weird separate interface in the
arch to then whack them into some different state.

Ben.

2009-09-24 07:50:48

by Peter Zijlstra

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Thu, 2009-09-24 at 10:48 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2009-09-02 at 07:33 +0200, Peter Zijlstra wrote:
> >
> > I'm still thinking this is a bad idea.
> >
> > The OS should only know about online/offline.
> >
> > Use the hypervisor interface to deal with the cpu once it's offline.
> >
> > That is, I think this interface you propose is a layering violation.
> >
> I don't quite follow your logic here. This is useful for more than just
> hypervisors. For example, take the HV out of the picture for a moment
> and imagine that the HW has the ability to offline CPU in various power
> levels, with varying latencies to bring them back.

cpu-hotplug is an utter slow path, anybody saying latency and hotplug in
the same sentence doesn't seem to grasp either or both concepts.


2009-09-24 08:40:49

by Benjamin Herrenschmidt

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Thu, 2009-09-24 at 09:51 +0200, Peter Zijlstra wrote:
> > I don't quite follow your logic here. This is useful for more than just
> > hypervisors. For example, take the HV out of the picture for a moment
> > and imagine that the HW has the ability to offline CPU in various power
> > levels, with varying latencies to bring them back.
>
> cpu-hotplug is an utter slow path, anybody saying latency and hotplug in
> the same sentence doesn't seem to grasp either or both concepts.

Let's forget about latency then. Let's imagine I want to set a CPU
offline to save power, vs. setting it offline -and- opening the back
door of the machine to actually physically replace it :-)

In any case, I don't see the added feature as being problematic, and
not such a "layering violation" as you seem to imply it is. It's a
convenient way to atomically take the CPU out -and- convey some
information about the "intent" to the hypervisor, and I really fail
to see why you have so strong objections about it.

Ben.

2009-09-24 11:32:42

by Peter Zijlstra

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Thu, 2009-09-24 at 18:38 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2009-09-24 at 09:51 +0200, Peter Zijlstra wrote:
> > > I don't quite follow your logic here. This is useful for more than just
> > > hypervisors. For example, take the HV out of the picture for a moment
> > > and imagine that the HW has the ability to offline CPU in various power
> > > levels, with varying latencies to bring them back.
> >
> > cpu-hotplug is an utter slow path, anybody saying latency and hotplug in
> > the same sentence doesn't seem to grasp either or both concepts.
>
> Let's forget about latency then. Let's imagine I want to set a CPU
> offline to save power, vs. setting it offline -and- opening the back
> door of the machine to actually physically replace it :-)

If the hardware is capable of physical hotplug, then surely powering the
socket down saves most power and is the preferred mode?

> In any case, I don't see the added feature as being problematic, and
> not such a "layering violation" as you seem to imply it is. It's a
> convenient way to atomically take the CPU out -and- convey some
> information about the "intent" to the hypervisor, and I really fail
> to see why you have so strong objections about it.

Ignorance on my part probably :-)

I'm simply not seeing a use case for it, except for the virt case, which
I think we should bug the virt interface with and not the cpu-hotplug
interface.


2009-09-24 11:41:30

by Arjan van de Ven

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Thu, 24 Sep 2009 13:33:07 +0200
Peter Zijlstra <[email protected]> wrote:

> On Thu, 2009-09-24 at 18:38 +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2009-09-24 at 09:51 +0200, Peter Zijlstra wrote:
> > > > I don't quite follow your logic here. This is useful for more
> > > > than just hypervisors. For example, take the HV out of the
> > > > picture for a moment and imagine that the HW has the ability to
> > > > offline CPU in various power levels, with varying latencies to
> > > > bring them back.
> > >
> > > cpu-hotplug is an utter slow path, anybody saying latency and
> > > hotplug in the same sentence doesn't seem to grasp either or both
> > > concepts.
> >
> > Let's forget about latency then. Let's imagine I want to set a CPU
> > offline to save power, vs. setting it offline -and- opening the back
> > door of the machine to actually physically replace it :-)
>
> If the hardware is capable of physical hotplug, then surely powering
> the socket down saves most power and is the preferred mode?

btw just to take away a perception that generally powering down sockets
helps; it does not help for all cpus. Some cpus are so efficient in idle
that the incremental gain one would get by "offlining" a core is just
not worth it
(in fact, in x86, it's the same thing)

I obviously can't speak for p-series cpus, just wanted to point out
that there is no universal truth about "offlining saves power".

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2009-09-25 07:27:44

by Vaidyanathan Srinivasan

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

* Arjan van de Ven <[email protected]> [2009-09-24 13:41:23]:

> On Thu, 24 Sep 2009 13:33:07 +0200
> Peter Zijlstra <[email protected]> wrote:
>
> > On Thu, 2009-09-24 at 18:38 +1000, Benjamin Herrenschmidt wrote:
> > > On Thu, 2009-09-24 at 09:51 +0200, Peter Zijlstra wrote:
> > > > > I don't quite follow your logic here. This is useful for more
> > > > > than just hypervisors. For example, take the HV out of the
> > > > > picture for a moment and imagine that the HW has the ability to
> > > > > offline CPU in various power levels, with varying latencies to
> > > > > bring them back.
> > > >
> > > > cpu-hotplug is an utter slow path, anybody saying latency and
> > > > hotplug in the same sentence doesn't seem to grasp either or both
> > > > concepts.
> > >
> > > Let's forget about latency then. Let's imagine I want to set a CPU
> > > offline to save power, vs. setting it offline -and- opening the back
> > > door of the machine to actually physically replace it :-)
> >
> > If the hardware is capable of physical hotplug, then surely powering
> > the socket down saves most power and is the preferred mode?
>
> btw just to take away a perception that generally powering down sockets
> helps; it does not help for all cpus. Some cpus are so efficient in idle
> that the incremental gain one would get by "offlining" a core is just
> not worth it
> (in fact, in x86, it's the same thing)
>
> I obviously can't speak for p-series cpus, just wanted to point out
> that there is no universal truth about "offlining saves power".

Hi Arjan,

As you have said, on some cpus the extra effort of offlining does not
save us any extra power, and the state will be the same as idle. The
assertion that offlining saves power is still valid; it could be the same
as idle or better depending on the architecture and implementation.

On x86 we still need the code (Venki posted) to take cpus to C6 on
offline to save power or else offlining consumes more power than idle
due to C1/hlt state. This framework can help here as well if we have
any apprehension about making the lowest sleep state the default on x86 and
want the administrator to decide.

--Vaidy

2009-09-25 07:42:26

by Arjan van de Ven

Subject: Re: [PATCH v2 0/2] cpu: pseries: Offline state framework.

On Fri, 25 Sep 2009 12:55:49 +0530
Vaidyanathan Srinivasan <[email protected]> wrote:

> > I obviously can't speak for p-series cpus, just wanted to point out
> > that there is no universal truth about "offlining saves power".
>
> Hi Arjan,
>
> As you have said, on some cpus the extra effort of offlining does not
> save us any extra power, and the state will be the same as idle. The
> assertion that offlining saves power is still valid; it could be the same
> as idle or better depending on the architecture and implementation.
>
> On x86 we still need the code (Venki posted) to take cpus to C6 on
> offline to save power or else offlining consumes more power than idle
> due to C1/hlt state. This framework can help here as well if we have
> any apprehension about making the lowest sleep state the default on x86 and
> want the administrator to decide.

even with Venki's patch, all our measurements indicate that taking
cores away is damage on x86. Don't let that stop what you do for
powerpc, but for x86 it's not a win. Linux is good at keeping cores in
C6 long enough that the downside of offlining is bigger...



--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org