Hi,
This series of patches enables Aggressive Link Power Management for AHCI
devices, as documented in the AHCI spec. On my laptop (a Lenovo X60), this
saves me a full watt of power. On other systems, reported power savings
range from .5-1.5 Watts. It has been tested by the kind folks at #powertop
with similar results. Please give it a try and let me know what you think.
thanks,
Kristen
--
Kristen Carlson Accardi wrote:
> Hi,
> This series of patches enables Aggressive Link Power Management for AHCI
> devices, as documented in the AHCI spec. On my laptop (a Lenovo X60), this
> saves me a full watt of power. On other systems, reported power savings
> range from .5-1.5 Watts. It has been tested by the kind folks at #powertop
> with similar results. Please give it a try and let me know what you think.
I'm not sure about this. We need a better PM framework to support
powersaving in other controllers, and some ahcis don't save much when
only link power management is used; they need to be turned off
completely. I also don't think scsi sysfs is the right place to export
this interface.
--
tejun
Tejun Heo wrote:
> Kristen Carlson Accardi wrote:
>> Hi,
>> This series of patches enables Aggressive Link Power Management for AHCI
>> devices, as documented in the AHCI spec. On my laptop (a Lenovo X60), this
>> saves me a full watt of power. On other systems, reported power savings
>> range from .5-1.5 Watts. It has been tested by the kind folks at #powertop
>> with similar results. Please give it a try and let me know what you think.
>
> I'm not sure about this. We need better PM framework to support
> powersaving in other controllers and some ahcis don't save much when
> only link power management is used, they need to be turned off
A better PM framework would definitely be nice :)
> completely && I don't think scsi sysfs is the right place to export this
> interface.
That's about the only place we have right now.
Jeff
Tejun Heo wrote:
> Kristen Carlson Accardi wrote:
>> Hi,
>> This series of patches enables Aggressive Link Power Management for AHCI
>> devices, as documented in the AHCI spec. On my laptop (a Lenovo X60), this
>> saves me a full watt of power. On other systems, reported power savings
>> range from .5-1.5 Watts. It has been tested by the kind folks at #powertop
>> with similar results. Please give it a try and let me know what you think.
>
> I'm not sure about this. We need better PM framework to support
> powersaving in other controllers and some ahcis don't save much when
> only link power management is used,
Do you have data to support this? The data we have from this patch is
that it typically saves a Watt of power (depends on the machine of
course, but the range is 0.5W to 1.5W). If you want to also have an
even more aggressive thing where you want to start disabling the entire
controller... I don't see how this is in conflict with saving power on
the link level by "just" enabling a hardware feature ....
Arjan van de Ven wrote:
> Tejun Heo wrote:
>> Kristen Carlson Accardi wrote:
>>> Hi,
>>> This series of patches enables Aggressive Link Power Management for
>>> AHCI devices, as documented in the AHCI spec. On my laptop (a Lenovo
>>> X60), this
>>> saves me a full watt of power. On other systems, reported power savings
>>> range from .5-1.5 Watts. It has been tested by the kind folks at
>>> #powertop
>>> with similar results. Please give it a try and let me know what you
>>> think.
>>
>> I'm not sure about this. We need better PM framework to support
>> powersaving in other controllers and some ahcis don't save much when
>> only link power management is used,
>
> do you have data to support this? The data we have from this patch is
> that it saves typically a Watt of power (depends on the machine of
> course, but the range is 0.5W to 1.5W). If you want to also have an even
> more aggressive thing where you want to start disabling the entire
> controller... I don't see how this is in conflict with saving power on
> the link level by "just" enabling a hardware feature ....
SATA standard defines lower power phy states. So the same argument
you're using for AHCI applies there too -- "just" enabling an existing
hardware feature.
Jeff
Jeff Garzik wrote:
> Arjan van de Ven wrote:
>> Tejun Heo wrote:
>>> Kristen Carlson Accardi wrote:
>>>> Hi,
>>>> This series of patches enables Aggressive Link Power Management for
>>>> AHCI devices, as documented in the AHCI spec. On my laptop (a
>>>> Lenovo X60), this
>>>> saves me a full watt of power. On other systems, reported power
>>>> savings
>>>> range from .5-1.5 Watts. It has been tested by the kind folks at
>>>> #powertop
>>>> with similar results. Please give it a try and let me know what you
>>>> think.
>>>
>>> I'm not sure about this. We need better PM framework to support
>>> powersaving in other controllers and some ahcis don't save much when
>>> only link power management is used,
>>
>> do you have data to support this? The data we have from this patch is
>> that it saves typically a Watt of power (depends on the machine of
>> course, but the range is 0.5W to 1.5W). If you want to also have an
>> even more aggressive thing where you want to start disabling the entire
>> controller... I don't see how this is in conflict with saving power on
>> the link level by "just" enabling a hardware feature ....
>
> SATA standard defines lower power phy states. So the same argument
> you're using for AHCI applies there too -- "just" enabling an existing
> hardware feature.
>
Yes, I'm not arguing against that. I was trying to find out whether
(and suggest, unless proven otherwise, that) the 2 are not exclusive or
conflicting... in fact I assume both are wanted concurrently.
Arjan van de Ven wrote:
> Jeff Garzik wrote:
>> SATA standard defines lower power phy states. So the same argument
>> you're using for AHCI applies there too -- "just" enabling an existing
>> hardware feature.
> yes I'm not arguing against that. I was trying to find out (and
> suggest-unless-proven-otherwise) that the 2 are not exclusive or
> conflicting... in fact I assume both are wanted concurrently.
Yes and no. As I understand it, AHCI's capability is an automatic
version of what standard SATA phys provide manually. In AHCI's case,
the hardware automatically manages the link power, possibly cycling it
hundreds of times per second. In the standard case, software must
determine when a different power state is appropriate based on current
conditions, and update the phy appropriately.
Jeff
Arjan van de Ven wrote:
>> I'm not sure about this. We need better PM framework to support
>> powersaving in other controllers and some ahcis don't save much
>> when only link power management is used,
>
> do you have data to support this?
Yeah, it was some Lenovo notebook. Pavel is more familiar with the
hardware. Pavel, what was the notebook which didn't save much power
with standard SATA power save but needed port to be completely turned off?
> The data we have from this patch is that it saves typically a Watt of
> power (depends on the machine of course, but the range is 0.5W to
> 1.5W). If you want to also have an even more aggressive thing where
> you want to start disabling the entire controller... I don't see how
> this is in conflict with saving power on the link level by "just"
> enabling a hardware feature ....
Well, both implement about the same thing. I prefer software
implementation because it's more generic and ALPE/ASP seems too
aggressive to me. Here are reasons why sw implementation wasn't merged.
1. It didn't have proper interface with userland. This was mainly
because of missing ATA sysfs nodes. I'm not sure whether adding this to
scsi node is a good idea.
2. It was focused on SATA link PS and couldn't cover the Lenovo case.
I think we need something at the block layer.
Thanks.
--
tejun
Tejun Heo wrote:
>> do you have data to support this?
>
> Yeah, it was some Lenovo notebook. Pavel is more familiar with the
> hardware. Pavel, what was the notebook which didn't save much power
> with standard SATA power save but needed port to be completely turned off?
Pavel, if you have time, could you measure this with Kristen's patch?
>
>> The data we have from this patch is that it saves typically a Watt of
>> power (depends on the machine of course, but the range is 0.5W to
>> 1.5W). If you want to also have an even more aggressive thing where
>> you want to start disabling the entire controller... I don't see how
>> this is in conflict with saving power on the link level by "just"
>> enabling a hardware feature ....
>
> Well, both implement about the same thing. I prefer software
> implementation because it's more generic and ALPE/ASP seems too
> aggressive to me.
Too aggressive in what way?
There are tradeoffs on either side. Doing things in software is more
work for the cpu, and depending on the implementation, will consume
more power on the CPU side (for example, if you need regular timers,
that just consumes back the power you are saving). The hardware can
obviously switch very fast (because it's independent of any software),
yet of course the software has higher level knowledge about how idle
the link really is (like it knows if any files are open etc etc).
To be honest, I would be surprised if software could do significantly
better than hardware though; it seems a simple problem: Idle -> go to
low power, and estimating idle isn't all that hard on a link level...
there's not all THAT much the kernel can estimate better I suspect.
This debate is very similar to the cpufreq debate from 4 years ago,
where there were 3 levels: do it in the CPU, do it in the kernel or do
it in userspace. All three are valid; whichever is best depends on the
exact hardware that you have...
(and you can argue that first everyone started in userspace, then the
hardware improved in a way that made a kernelspace implementation better
(ondemand), and now Turbo Mode is more or less moving this to the
hardware... I wouldn't be surprised if the sata side will show a
similar trend)
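
To make the software side of this tradeoff concrete, here is a toy model
of an idle-timeout policy of the kind being discussed: a periodic timer
checks how long the link has been quiet and drops it to low power after a
threshold. This is only a sketch; the class, the state names, and the
timeout values are all made up for illustration and are not taken from
libata or from any patch in this thread.

```python
from dataclasses import dataclass

ACTIVE = "active"
LOW_POWER = "low_power"


@dataclass
class SoftwareLinkPolicy:
    """Toy idle-timeout policy: drop the link to low power after a quiet period.

    idle_timeout_ms is a hypothetical tunable, not a value from any spec.
    """
    idle_timeout_ms: int
    state: str = ACTIVE
    last_io_ms: int = 0
    timer_wakeups: int = 0   # CPU cost: how often the polling timer fired
    link_wakeups: int = 0    # latency cost: how often I/O found the link asleep

    def on_io(self, now_ms: int) -> None:
        """Called when a command is issued; wakes the link if needed."""
        if self.state == LOW_POWER:
            self.link_wakeups += 1
            self.state = ACTIVE
        self.last_io_ms = now_ms

    def on_timer(self, now_ms: int) -> None:
        """Called from a periodic timer; this is the part that costs CPU."""
        self.timer_wakeups += 1
        idle_ms = now_ms - self.last_io_ms
        if self.state == ACTIVE and idle_ms >= self.idle_timeout_ms:
            self.state = LOW_POWER


def simulate(io_times_ms, end_ms, timer_period_ms, idle_timeout_ms):
    """Replay I/O timestamps against a periodic timer and return the policy."""
    policy = SoftwareLinkPolicy(idle_timeout_ms=idle_timeout_ms)
    io_set = set(io_times_ms)
    for now in range(end_ms + 1):
        if now in io_set:
            policy.on_io(now)
        if now > 0 and now % timer_period_ms == 0:
            policy.on_timer(now)
    return policy
```

In this model the timer keeps firing whether or not there is any I/O,
which is the CPU-side cost mentioned above; a shorter timer period reacts
faster but wakes the CPU more often.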
Arjan van de Ven wrote:
>>> The data we have from this patch is that it saves typically a Watt of
>>> power (depends on the machine of course, but the range is 0.5W to
>>> 1.5W). If you want to also have an even more aggressive thing where
>>> you want to start disabling the entire controller... I don't see how
>>> this is in conflict with saving power on the link level by "just"
>>> enabling a hardware feature ....
>>
>> Well, both implement about the same thing. I prefer software
>> implementation because it's more generic and ALPE/ASP seems too
>> aggressive to me.
>
> Too aggressive in what way?
There are devices which lock up hard if PHY enters PS mode (only
physical power removal can reset it) and I wouldn't be surprised if some
devices aren't happy with PS being too aggressive. Well, I actually
expect to see such devices. It's ATA after all. This is unknown
territory and that's why I was using 'seems ... to me'.
> There are tradeoffs on either side. Doing things in software is more
> work for the cpu, and depending on the implementation, will consume more
> power on the CPU side. (for example if you need regular timers that just
> consumes the power you are saving back up). The hardware can obviously
> switch very fast (because it's independent of any software), yet of
> course the software has higher level knowledge about how idle the link
> really is (like it knows if any files are open etc etc).
>
> To be honest, I would be surprised if software could do significantly
> better than hardware though; it seems a simple problem: Idle -> go to
> low power, and estimating idle isn't all that hard on a link level...
> there's not all THAT much the kernel can estimate better I suspect.
I don't think the end result will vary in any significant way. My
biggest argument for sw implementation is it can be used for other
controllers.
> This debate is very similar to the cpufreq debate from 4 years ago,
> where there were 3 levels: do it in the CPU, do it in the kernel or do
> it in userspace. All three are valid; whichever is best depends on the
> exact hardware that you have...
> (and you can argue that first everyone started in userspace, then the
> hardware improved that made a kernelspace implementation better
> (ondemand) and now Turbo Mode is more or less moving this to the
> hardware... I wouldn't be surprised if the sata side will show a similar
> trend)
Currently, ahci is the only one which has controller-side automatic PS,
but some ATA devices (hdds) implement device initiated PS (DIPS). The
sw implementation supports SW host initiated PS (HIPS) and DIPS. We can
add HW HIPS support and hook ALPE/ASP support there, but I don't think it
would have benefits over the SW implementation.
I think it's a bit different from cpufreq. ATA is cheaper and more
broken and much more diverse.
Thanks.
--
tejun
We will do AHCI link PM -- presuming that I can be convinced that it
does not repeatedly park the hard drive heads, or something similarly
annoying on PATA<->SATA bridges and similar setups.
IF it works as advertised -- a big if considering all the AHCI silicon
implementations out there -- we definitely want to use it.
Jeff
On Tue, 12 Jun 2007 00:43:12 -0400
Jeff Garzik wrote:
> We will do AHCI link PM -- presuming that I can be convinced that it
> does not repeatedly park the hard drive heads, or something similarly
> annoying on PATA<->SATA bridges and similar setups.
>
> IF it works as advertised -- a big if considering all the AHCI silicon
> implementations out there -- we definitely want to use it.
>
> Jeff
>
I understand that this is a concern of yours based on some experience you
had with earlier controllers. In general, this behavior would be considered
incorrect - link power management should not translate to disk parking, even
on PATA->SATA bridges, and if it does, then that's completely broken. That
said, I would believe you if you said broken hardware exists, and when you
get specific examples of it, you can add it to the blacklist for this feature.
On Tue, 12 Jun 2007 13:40:15 +0900
Tejun Heo wrote:
> I don't think the end result will vary in any significant way. My
> biggest argument for sw implementation is it can be used for other
> controllers.
What I had in mind when I created the new port operation "enable_pm"
was that other controllers (besides the ahci controller) could define their
own method of enabling power management. Maybe for non-ahci controllers this
is a software based solution which uses generic SATA dipm/hipm stuff and
polling.
See patch 2/3 of this series for the implementation of this.
Kristen
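
The idea of a per-controller "enable_pm" port operation can be sketched
roughly as follows. Only the name enable_pm comes from the message above;
the Port class, the ops tables, the helper names, and the return codes
are hypothetical stand-ins and do not reflect the actual code in patch
2/3.

```python
class Port:
    """Hypothetical stand-in for a controller port; not a libata structure."""

    def __init__(self, name, ops):
        self.name = name
        self.ops = ops      # per-driver table of optional operations
        self.log = []       # records what the driver hook decided to do


def ahci_enable_pm(port, policy):
    # Assumption: an AHCI-style driver just turns the hardware feature on.
    port.log.append(f"hw: enable aggressive link PM ({policy})")
    return 0


def generic_sw_enable_pm(port, policy):
    # Assumption: a non-AHCI driver falls back to software-driven
    # hipm/dipm plus polling.
    port.log.append(f"sw: enable hipm/dipm with polling ({policy})")
    return 0


AHCI_OPS = {"enable_pm": ahci_enable_pm}
OTHER_SATA_OPS = {"enable_pm": generic_sw_enable_pm}
LEGACY_OPS = {}  # a controller that provides no PM hook at all


def port_enable_pm(port, policy):
    """Core-layer dispatch: call the driver's hook if it provides one."""
    hook = port.ops.get("enable_pm")
    if hook is None:
        return -1  # placeholder for a "not supported" error code
    return hook(port, policy)
```

The point of the indirection is that the core layer stays the same
while each controller driver decides whether power management means a
hardware feature, a software scheme, or nothing at all.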
Hi!
> >> I'm not sure about this. We need better PM framework to support
> >> powersaving in other controllers and some ahcis don't save much
> >> when only link power management is used,
> >
> > do you have data to support this?
>
> Yeah, it was some Lenovo notebook. Pavel is more familiar with the
> hardware. Pavel, what was the notebook which didn't save much power
> with standard SATA power save but needed port to be completely turned off?
Thinkpad X60. Same one Kristen probably used while developing the
patch :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Wed, 13 Jun 2007 11:04:30 +0200
Pavel Machek wrote:
> Hi!
>
> > >> I'm not sure about this. We need better PM framework to support
> > >> powersaving in other controllers and some ahcis don't save much
> > >> when only link power management is used,
> > >
> > > do you have data to support this?
> >
> > Yeah, it was some Lenovo notebook. Pavel is more familiar with the
> > hardware. Pavel, what was the notebook which didn't save much power
> > with standard SATA power save but needed port to be completely turned off?
>
> Thinkpad x60. Some one Kristen probably used while developing the
> patch :-).
Yes - that confirms my conclusion that the first patch just wasn't
done correctly, because when I measure the power savings with a power
meter on the X60 with my patches I see ~1W.
Hi!
> >Yeah, it was some Lenovo notebook. Pavel is more
> >familiar with the
> >hardware. Pavel, what was the notebook which didn't
> >save much power
> >with standard SATA power save but needed port to be
> >completely turned off?
>
> Pavel, if you have time, could you measure this with
> Kristen's patch?
Kristen has the same machine as me, and I have seen a similar '1W' saving
with the previous version of the patch. I'd trust her results.
Pavel
Hi!
> >> I'm not sure about this. We need better PM framework to support
> >> powersaving in other controllers and some ahcis don't save much
> >> when only link power management is used,
> >
> > do you have data to support this?
>
> Yeah, it was some Lenovo notebook. Pavel is more familiar with the
> hardware. Pavel, what was the notebook which didn't save much power
> with standard SATA power save but needed port to be completely turned off?
Uhuh, now I understand why Arjan wanted me to test.
But I have the same hw as Kristen, so I assume there must have been
something wrong with the old tests.
Sorry for the confusion.
Pavel
Kristen Carlson Accardi wrote:
>>>>> I'm not sure about this. We need better PM framework to support
>>>>> powersaving in other controllers and some ahcis don't save much
>>>>> when only link power management is used,
>>>> do you have data to support this?
>>> Yeah, it was some Lenovo notebook. Pavel is more familiar with the
>>> hardware. Pavel, what was the notebook which didn't save much power
>>> with standard SATA power save but needed port to be completely turned off?
>> Thinkpad x60. Some one Kristen probably used while developing the
>> patch :-).
>
> Yes - that confirms my conclusion that the first patch just wasn't
> done correctly - cause when I measure the power savings with a power
> meter on the X60 with my patches I see ~ 1W.
Hmmm... Could it be that the controller doesn't enter powersave state
when SControl is written to?
--
tejun
On Tue, Jun 12 2007, Tejun Heo wrote:
> Arjan van de Ven wrote:
> >> I'm not sure about this. We need better PM framework to support
> >> powersaving in other controllers and some ahcis don't save much
> >> when only link power management is used,
> >
> > do you have data to support this?
>
> Yeah, it was some Lenovo notebook. Pavel is more familiar with the
> hardware. Pavel, what was the notebook which didn't save much power
> with standard SATA power save but needed port to be completely turned off?
>
> > The data we have from this patch is that it saves typically a Watt of
> > power (depends on the machine of course, but the range is 0.5W to
> > 1.5W). If you want to also have an even more aggressive thing where
> > you want to start disabling the entire controller... I don't see how
> > this is in conflict with saving power on the link level by "just"
> > enabling a hardware feature ....
>
> Well, both implement about the same thing. I prefer software
> implementation because it's more generic and ALPE/ASP seems too
> aggressive to me. Here are reasons why sw implementation wasn't merged.
>
> 1. It didn't have proper interface with userland. This was mainly
> because of missing ATA sysfs nodes. I'm not sure whether adding this to
> scsi node is a good idea.
>
> 2. It was focused on SATA link PS and couldn't cover the Lenovo case.
>
> I think we need something at the block layer.
I think the hardware method is preferable, actually. Doing this in the
block layer would mean keeping track of idle time, and that quickly
turns into a lot of timer management. Not exactly free, in terms of CPU
usage.
I've yet to do some power measurements with this ahci patch. I just
noticed that with min_power, performance drops from ~55MB/sec to
~15MB/sec sequential on my drive. That's pretty drastic :-)
--
Jens Axboe
Jens Axboe wrote:
>> 1. It didn't have proper interface with userland. This was mainly
>> because of missing ATA sysfs nodes. I'm not sure whether adding this to
>> scsi node is a good idea.
>>
>> 2. It was focused on SATA link PS and couldn't cover the Lenovo case.
>>
>> I think we need something at the block layer.
>
> I think the hardware method is preferable, actually. Doing this in the
> block layer would mean keeping track of idle time, and that quickly
> turns into a lot of timer management. Not exactly free, in terms of CPU
> usage.
Yeah, software implementation certainly has complexity overhead.
> I've yet to do some power measurements with this ahci patch, I just
> noticed that with min_power performance drops from ~55mb/sec to
> ~15mb/sec sequential on my drive. That's pretty drastic :-)
That's another thing I don't like about ALPE/ASP. According to the
spec, there is no idle timer whatsoever. The controller is supposed to
drive the link into PS mode whenever no FIS is in flight, so the link
goes in and out of PS state repeatedly even when commands are issued
back-to-back. Getting out of PS state takes a bit of time and slows
things down.
--
tejun
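
A back-of-the-envelope model of the slowdown described above: if every
request pays a fixed wake-up delay because the link dropped into PS state
in between, effective throughput falls even though the link speed itself
is unchanged. The 55 and 15 MB/sec figures are the ones reported earlier
in the thread; the request size is purely an assumption, and the model
ignores queueing, readahead, and which PS state the link enters, so treat
the result as illustrative only.

```python
def throughput_mb_s(request_mb, link_mb_s, wake_ms):
    """Effective throughput when every request pays a fixed wake-up delay."""
    transfer_s = request_mb / link_mb_s
    return request_mb / (transfer_s + wake_ms / 1000.0)


def implied_wake_ms(request_mb, fast_mb_s, slow_mb_s):
    """Per-request delay that would turn fast_mb_s into slow_mb_s."""
    return (request_mb / slow_mb_s - request_mb / fast_mb_s) * 1000.0


# Assumed request size of 0.125 MB; the thread does not say what was used.
REQUEST_MB = 0.125
delay_ms = implied_wake_ms(REQUEST_MB, 55.0, 15.0)
print(f"implied per-request delay: {delay_ms:.2f} ms")
print(f"check: {throughput_mb_s(REQUEST_MB, 55.0, delay_ms):.1f} MB/sec")
```

Under these assumptions the observed drop would correspond to roughly
6 ms of added delay per request; a different request size gives a
proportionally different figure.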