2006-11-19 08:21:45

by Chuck Ebbert

[permalink] [raw]
Subject: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

When doing 'make oldconfig' we should ask about suspend/resume
debug features when SOFTWARE_SUSPEND is not enabled.

Signed-off-by: Chuck Ebbert <[email protected]>

--- 2.6.19-rc6-32.orig/kernel/power/Kconfig
+++ 2.6.19-rc6-32/kernel/power/Kconfig
@@ -38,7 +38,7 @@ config PM_DEBUG

config DISABLE_CONSOLE_SUSPEND
bool "Keep console(s) enabled during suspend/resume (DANGEROUS)"
- depends on PM && PM_DEBUG
+ depends on PM_DEBUG && SOFTWARE_SUSPEND
default n
---help---
This option turns off the console suspend mechanism that prevents
@@ -49,7 +49,7 @@ config DISABLE_CONSOLE_SUSPEND

config PM_TRACE
bool "Suspend/resume event tracing"
- depends on PM && PM_DEBUG && X86_32 && EXPERIMENTAL
+ depends on PM_DEBUG && SOFTWARE_SUSPEND && X86_32 && EXPERIMENTAL
default n
---help---
This enables some cheesy code to save the last PM event point in the
--
Chuck


2006-11-19 08:29:42

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> When doing 'make oldconfig' we should ask about suspend/resume
> debug features when SOFTWARE_SUSPEND is not enabled.

These are suspend-to-ram debugging features, mostly, so no, they
should not depend on software suspend.

NAK.

> Signed-off-by: Chuck Ebbert <[email protected]>
>
> --- 2.6.19-rc6-32.orig/kernel/power/Kconfig
> +++ 2.6.19-rc6-32/kernel/power/Kconfig
> @@ -38,7 +38,7 @@ config PM_DEBUG
>
> config DISABLE_CONSOLE_SUSPEND
> bool "Keep console(s) enabled during suspend/resume (DANGEROUS)"
> - depends on PM && PM_DEBUG
> + depends on PM_DEBUG && SOFTWARE_SUSPEND
> default n
> ---help---
> This option turns off the console suspend mechanism that prevents
> @@ -49,7 +49,7 @@ config DISABLE_CONSOLE_SUSPEND
>
> config PM_TRACE
> bool "Suspend/resume event tracing"
> - depends on PM && PM_DEBUG && X86_32 && EXPERIMENTAL
> + depends on PM_DEBUG && SOFTWARE_SUSPEND && X86_32 && EXPERIMENTAL
> default n
> ---help---
> This enables some cheesy code to save the last PM event point in the

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-11-19 17:36:26

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Chuck Ebbert wrote:
>
> When doing 'make oldconfig' we should ask about suspend/resume
> debug features when SOFTWARE_SUSPEND is not enabled.

That's wrong.

I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
broken.

Sane people use suspend-to-ram, and that's when you need the suspend and
resume debugging.

Software-suspend is silly. I want my machine back in three seconds, not
waiting for minutes..

Linus

2006-11-19 17:47:55

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 19 November 2006 18:33, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Chuck Ebbert wrote:
> >
> > When doing 'make oldconfig' we should ask about suspend/resume
> > debug features when SOFTWARE_SUSPEND is not enabled.
>
> That's wrong.
>
> I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> broken.
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.
>
> Software-suspend is silly. I want my machine back in three seconds, not
> waiting for minutes..

In fact that's up to 30 seconds on a modern box, usually less than that.

And suspend-to-ram doesn't work on quilte a lot of boxes right now. Also, you
can use the software suspend on boxes that don't support the suspend-to-ram
at all.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-19 17:50:26

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, 2006-11-19 at 09:33 -0800, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Chuck Ebbert wrote:
> >
> > When doing 'make oldconfig' we should ask about suspend/resume
> > debug features when SOFTWARE_SUSPEND is not enabled.
>
> That's wrong.
>
> I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> broken.
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.

Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
AGP card, I have no choices except swsusp or reboot.

-Mike

2006-11-19 18:22:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
>
> In fact that's up to 30 seconds on a modern box, usually less than that.

Right. If the machine boots quickly, it's fast. Of course, if the machine
boots quickly, you might as well often just shut down and reboot.

> And suspend-to-ram doesn't work on quilte a lot of boxes right now. Also, you
> can use the software suspend on boxes that don't support the suspend-to-ram
> at all.

One large reason STR often doesn't work is that people don't even test it,
because people point to the suspend-to-disk instead. suspend-to-disk is
the problem, not the solution.

I've been working at making the machines I have able to STR, and almost
always it's a driver that is buggy. Thank God for the suspend/resume
debugging - the thing that Chuck tried to disable. That's often the _only_
way to debug these things, and it's actually pretty powerful (but
time-consuming - having to insert TRACE_RESUME() markers into the device
driver that doesn't resume and recompile and reboot).

Anyway, the way to debug this for people who are interested (have a
machine that doesn't boot) is:

- enable PM_DEBUG, and PM_TRACE

- use a script like this:

#!/bin/sh
sync
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

to suspend

- if it doesn't come back up (which is usually the problem), reboot by
holding the power button down, and look at the dmesg output for things
like

Magic number: 4:156:725
hash matches drivers/base/power/resume.c:28
hash matches device 0000:01:00.0

which means that the last trace event was just before trying to resume
device 0000:01:00.0. Then figure out what driver is controlling that
device (lspci and /sys/devices/pci* is your friend), and see if you can
fix it, disable it, or trace into its resume function.

For example, the above happens to be the VGA device on my EVO, which I
used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out
that "radeonfb" simply cannot resume that device - it tries to set the
PLL's, and it just _hangs_. Using the regular VGA console and letting X
resume it instead works fine.

The point being that PM_TRACE is wonderful, and it's wonderful exactly for
NOT using suspend-to-disk. The other point being that people have gotten
lazy, and accept half a minute (minimum - usually it's longer) boot times,
when STR is a lot more pleasant, but it does require some detective work
when it doesn't boot.

I wish more people tried STR, instead of having the STD people tell them
not to!

Linus

2006-11-19 18:25:53

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Mike Galbraith wrote:
>
> Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> AGP card, I have no choices except swsusp or reboot.

Try using regular VGA console, and letting X re-initialize the card. It's
worked for me on several machines (Mac Mini and Compaq EVO) where the
kernel cannot initialize the graphics itself.

It does mean that you have to run X when suspending, but most people do
anyway..

Linus

2006-11-19 19:01:35

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 19 November 2006 18:52, Mike Galbraith wrote:
> On Sun, 2006-11-19 at 09:33 -0800, Linus Torvalds wrote:
> >
> > On Sun, 19 Nov 2006, Chuck Ebbert wrote:
> > >
> > > When doing 'make oldconfig' we should ask about suspend/resume
> > > debug features when SOFTWARE_SUSPEND is not enabled.
> >
> > That's wrong.
> >
> > I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> > broken.
> >
> > Sane people use suspend-to-ram, and that's when you need the suspend and
> > resume debugging.
>
> Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> AGP card, I have no choices except swsusp or reboot.

Have you tried s2ram (http://en.opensuse.org/S2ram)?

Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-19 19:00:56

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, 2006-11-19 at 10:25 -0800, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Mike Galbraith wrote:
> >
> > Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> > AGP card, I have no choices except swsusp or reboot.
>
> Try using regular VGA console, and letting X re-initialize the card. It's
> worked for me on several machines (Mac Mini and Compaq EVO) where the
> kernel cannot initialize the graphics itself.
>
> It does mean that you have to run X when suspending, but most people do
> anyway..

Thanks for the tip, but it didn't work. It suspended instantly, and got
my hopes up (manually, SuSE says "not supported, go away"), but resume
still left me with an utterly dead box (minus flashing crud on display).

-Mike

2006-11-19 18:59:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 19 November 2006 19:21, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
> >
> > In fact that's up to 30 seconds on a modern box, usually less than that.
>
> Right. If the machine boots quickly, it's fast. Of course, if the machine
> boots quickly, you might as well often just shut down and reboot.

Yes, if the only thing you want to run is the kernel. The applications aren't
going to start so quickly, you know. ;-)

> > And suspend-to-ram doesn't work on quilte a lot of boxes right now. Also, you
> > can use the software suspend on boxes that don't support the suspend-to-ram
> > at all.
>
> One large reason STR often doesn't work is that people don't even test it,

Well, the majority of bugzilla reports I'm tracking are STR-related, so people
_do_ test it, really.

> because people point to the suspend-to-disk instead.

Who they?

> suspend-to-disk is the problem, not the solution.

If given machine doesn't support the STR or has a broken BIOS (which means
pretty much the same), the suspend-to-disk is the _only_ way in which you can
suspend it.

I guess you're referring to using the suspend-to-disk instead of the STR which
wouldn't be very reasonable indeed, but still I don't know why you consider
it as a problem.

> I've been working at making the machines I have able to STR, and almost
> always it's a driver that is buggy. Thank God for the suspend/resume
> debugging - the thing that Chuck tried to disable. That's often the _only_
> way to debug these things, and it's actually pretty powerful (but
> time-consuming - having to insert TRACE_RESUME() markers into the device
> driver that doesn't resume and recompile and reboot).
>
> Anyway, the way to debug this for people who are interested (have a
> machine that doesn't boot) is:
>
> - enable PM_DEBUG, and PM_TRACE

This only works on i386, no?

> - use a script like this:
>
> #!/bin/sh
> sync
> echo 1 > /sys/power/pm_trace
> echo mem > /sys/power/state
>
> to suspend
>
> - if it doesn't come back up (which is usually the problem), reboot by
> holding the power button down, and look at the dmesg output for things
> like
>
> Magic number: 4:156:725
> hash matches drivers/base/power/resume.c:28
> hash matches device 0000:01:00.0
>
> which means that the last trace event was just before trying to resume
> device 0000:01:00.0. Then figure out what driver is controlling that
> device (lspci and /sys/devices/pci* is your friend), and see if you can
> fix it, disable it, or trace into its resume function.
>
> For example, the above happens to be the VGA device on my EVO, which I
> used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out
> that "radeonfb" simply cannot resume that device - it tries to set the
> PLL's, and it just _hangs_. Using the regular VGA console and letting X
> resume it instead works fine.
>
> The point being that PM_TRACE is wonderful, and it's wonderful exactly for
> NOT using suspend-to-disk. The other point being that people have gotten
> lazy, and accept half a minute (minimum - usually it's longer) boot times,
> when STR is a lot more pleasant, but it does require some detective work
> when it doesn't boot.
>
> I wish more people tried STR, instead of having the STD people tell them
> not to!

I don't know of anyone who's doing that.

Yes, we often ask people to try the STD when they report a problem with the
STR to get one more data point, but that doesn't mean we tell them not to try
the STR any more. Bug reports regarding the STR are accepted and welcome.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-19 19:51:53

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, 2006-11-19 at 19:58 +0100, Rafael J. Wysocki wrote:
> On Sunday, 19 November 2006 18:52, Mike Galbraith wrote:
> > On Sun, 2006-11-19 at 09:33 -0800, Linus Torvalds wrote:
> > >
> > > On Sun, 19 Nov 2006, Chuck Ebbert wrote:
> > > >
> > > > When doing 'make oldconfig' we should ask about suspend/resume
> > > > debug features when SOFTWARE_SUSPEND is not enabled.
> > >
> > > That's wrong.
> > >
> > > I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> > > broken.
> > >
> > > Sane people use suspend-to-ram, and that's when you need the suspend and
> > > resume debugging.
> >
> > Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> > AGP card, I have no choices except swsusp or reboot.
>
> Have you tried s2ram (http://en.opensuse.org/S2ram)?

Cool. That shows potential. On an 2.6.19-rc6-rt4 kernel, it looked
like it _might_ have eventually gotten past boot. At one line of kernel
output every ~10 seconds though, I gave up. Virgin 2.6.19-rc6 went
panic with a black screen... have options, will tinker.

(i _was_ quite content with swsusp, but now i want it all;)

Thanks,

-Mike

2006-11-19 19:55:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
>
> > because people point to the suspend-to-disk instead.
>
> Who they?

Like you _just_ did.

> > - enable PM_DEBUG, and PM_TRACE
>
> This only works on i386, no?

Right now the trivial functions are only available on i386, yes. The
concept works anywhere that has a CMOS chip, so if somebody were to spend
a few minutes testing it on x86-64 and others, it would work elsewhere
too..

> I don't know of anyone who's doing that.

I know. I'm probably the only one. Frustrating.

Linus

2006-11-19 19:56:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Mike Galbraith wrote:
>
> Thanks for the tip, but it didn't work. It suspended instantly, and got
> my hopes up (manually, SuSE says "not supported, go away"), but resume
> still left me with an utterly dead box (minus flashing crud on display).

Right. That's why we have PM_DEBUG and PM_TRACE, and why I sent out the
small email about how to use them.

The utterly dead box is the common case when some driver doesn't actually
resume properly ;(

Linus

2006-11-19 21:29:36

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 19 November 2006 20:54, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
> >
> > > because people point to the suspend-to-disk instead.
> >
> > Who they?
>
> Like you _just_ did.

No, I didn't. It was referred to in the message that started this thread.

I don't remeber telling _anyone_ not to use the STR and use the STD instead.

> > > - enable PM_DEBUG, and PM_TRACE
> >
> > This only works on i386, no?
>
> Right now the trivial functions are only available on i386, yes. The
> concept works anywhere that has a CMOS chip, so if somebody were to spend
> a few minutes testing it on x86-64 and others, it would work elsewhere
> too..

I can do that if someone gives me the code.

> > I don't know of anyone who's doing that.
>
> I know. I'm probably the only one. Frustrating.

Well, it isn't that well documented ...

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-19 21:41:42

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi Linus.

On Sun, 2006-11-19 at 09:33 -0800, Linus Torvalds wrote:
>
> On Sun, 19 Nov 2006, Chuck Ebbert wrote:
> >
> > When doing 'make oldconfig' we should ask about suspend/resume
> > debug features when SOFTWARE_SUSPEND is not enabled.
>
> That's wrong.
>
> I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> broken.
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.
>
> Software-suspend is silly. I want my machine back in three seconds, not
> waiting for minutes..

If it's taking minutes, something is wrong. You should be looking at
more like 10-30 seconds, depending on which implementation you're using,
the speed of your cpu and hard disk and how much ram was saved in the
image. For suspend2, for example, the rule of thumb is

ram_in_use_in_MB() / hard_disk_speed() / 2 seconds + bios time + time to
get to starting the resume

Where hard disk speed is the result of hdparm -t. Assumes LZF
compression (that's the /2).

Regards,

Nigel

2006-11-19 22:04:26

by Christer Weinigel

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Linus Torvalds <[email protected]> writes:

> I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> broken.
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.
>
> Software-suspend is silly. I want my machine back in three seconds, not
> waiting for minutes..

Suspend-to-disk has a few advantages over suspend-to-ram:

With suspend-to-disk I can remove the battery (to put in a fresh
battery when traveling), try doing that with suspend-to-ram. It's
also really useful for my moving my old Thinkpad which has a battery
which is so bad it can't even hold power for suspend-for-ram for more
than a few seconds.

With suspend-to-disk I can suspend Linux, resume Windows to be able to
run that one program I have to run under Windows, suspend Windows and
resume Linux again without having to restart alla applications.

And yes, this have been very useful for me in real life. So it's
really nice to have both suspend-to-disk and suspend-to-ram.

/Christer


--
"Just how much can I get away with and still go to heaven?"

Christer Weinigel <[email protected]> http://www.weinigel.se

2006-11-19 22:59:44

by Romano Giannetti

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, 2006-11-19 at 19:55 +0100, Rafael J. Wysocki wrote:
> On Sunday, 19 November 2006 19:21, Linus Torvalds wrote:
> >
> > On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
> > >
> > > In fact that's up to 30 seconds on a modern box, usually less than that.
> >
> > Right. If the machine boots quickly, it's fast. Of course, if the machine
> > boots quickly, you might as well often just shut down and reboot.
>
> Yes, if the only thing you want to run is the kernel. The applications aren't
> going to start so quickly, you know. ;-)

Can I say a word in defence of STD? It's true that now that I have STR
working(*) for the first time on my old vaio I use it much less, but it
has been my salvation in the last four years, and for this I have to
thanks Nigel, Pavel, Rafael and all the people involved. And still now
it is very useful. I can STD a session with tens of application opened,
and come back after changing batteries in less than a minute _doing
other things_, and not opening applications and files all over the
place. So yes, I think it's useful.

And when suspend-to-both will work radiply and safely, that will be
great (and a point over The Other SO...)

Romano

(*) if only for this, all the troubles I had upgrading to Ubuntu Edgy
has been worthwhile.


2006-11-19 23:25:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Sun, 19 Nov 2006, Rafael J. Wysocki wrote:
> > concept works anywhere that has a CMOS chip, so if somebody were to spend
> > a few minutes testing it on x86-64 and others, it would work elsewhere
> > too..
>
> I can do that if someone gives me the code.

Well, I actually _think_ it works almost as-is on x86-64 too, but the
magic is all in that small inline asm thing in <linux/resume-trace.h>.

The ".long" in that inline asm probably needs to be a ".quad" - see gow
"generate_resume_trace()" uses it.

Also the "show_file_hash()" right now has a "tracedata += 6", and it
should probably be "tracedata += 2 + sizeof(void *)" instead.

IOW, it really should be very easy to get to work on x86-64, I just
haven't had the energy or inclination, since the only devices I've
personally used it on have been regular 32-bit ones..

Linus

2006-11-20 22:18:39

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

One reason why I've generally avoided using suspend-to-ram is that
after my laptop (IBM Thinkpad T60p) comes out of suspend-to-ram, it is
consuming 30W of power --- as opposed to before I hibernated, when my
laptop was consuming only 17W or 18W of power. (And even without
doing things like unloading all of my USB modules, normally my laptop
will consume about 24W after a fresh reboot --- which makes the 30W
power consumption after a suspend-to-ram especially troubling,)

If I unload all of my USB modules, and shutdown the parallel port, the
wired ethernet port, etc., I get power savinges down to 17W --- and
once I was able to push it all the way down to 15W, although in
practice it's much more common that I can get the power consumption
down to 17W or 18W. Unfortunately, after the laptop wakss up from a
suspend-to-ram, the laptop is apparently powering up all of the
devices in high power mode while I can get some of the power savings
back by manually loading and then unlodaing a whole bunch of device
drivers, in practice that only gets me from 30W to 24W or so. This is
probably much more of a hardware bug than an OS bug, but because of
this fact, if I'm going to be running the laptop for any amount of
time after resume, I'm better off using suspend-to-disk.

If someone has a suggestion for how I can save the power state of all
of the various components in my laptop so that the laptop cna be
brought back to the 18W state after a suspend-to-ram, I'm all ears....

- Ted

2006-11-21 02:03:17

by Matthew Garrett

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Mon, Nov 20, 2006 at 05:17:56PM -0500, Theodore Tso wrote:

> If someone has a suggestion for how I can save the power state of all
> of the various components in my laptop so that the laptop cna be
> brought back to the 18W state after a suspend-to-ram, I'm all ears....

A good start might be to compare the PCI configuration registers before
and after suspend. However, I suspect it's more complicated than that.
Are you using the closed ATI drivers? If so, it's possible that they do
something at X startup that they're not doing on resume. A good
comparison might be to see if the power consumption is dramatically
different over suspend/resume if you only boot to text mode - that is,
never start X at all.

--
Matthew Garrett | [email protected]

2006-11-22 15:23:30

by Adrian Bunk

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, Nov 19, 2006 at 09:33:37AM -0800, Linus Torvalds wrote:
>...
> I never use SOFTWARE_SUSPEND, and I think the whole concept is totally
> broken.
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.
>
> Software-suspend is silly. I want my machine back in three seconds, not
> waiting for minutes..

It depends on what you are using STD for. ;-)

It might be a bit off-topic for this thread, but I'm using STD for
turning my PC off in the evening and getting it back in exactly the same
state in the morning.

This might not have been the original purpose of suspend, but it works
quite well, STR is obviously not an alternative, and it doesn't matter
much whether it takes a minute.

> Linus

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-11-22 15:40:43

by Alan

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

> This might not have been the original purpose of suspend, but it works
> quite well, STR is obviously not an alternative, and it doesn't matter
> much whether it takes a minute.

Suspend to ram at the moment is for the very very brave. Having added
resume quirks to the PCI resume code I've got basic resume behaving on at
two boxes where resume ate the disks. I've now found a third case in
testing where str/resume resumes when using Jmicron 365/366 your IDE disks
swapped over which makes a really nasty mess, and that controllers like
the SIL680 come back from resume misclocked and so on.

Patches for some of these will follow, to go with the pci resume quirk
patch already posted.

Alan

2006-11-22 23:05:19

by Mark Lord

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Linus Torvalds wrote:
>
> Sane people use suspend-to-ram, and that's when you need the suspend and
> resume debugging.
>
> Software-suspend is silly. I want my machine back in three seconds, not
> waiting for minutes..

ALL of my notebooks here have always been capable of Suspend-to-RAM with Linux,
and that's how I like it. We've had six different models (IBM, Dell, Toshiba),
and they have all worked fine with S2R.

The older APM ones had zero issues, and just always worked.

The newer ACPI ones have required some tweaking to the sleep/wakeup scripts,
but now work perfectly with suspend-to-ram.

Suspend-to-disk (and resume from disk) are way too slow for regular use,
except of course for the swsusp2 version, which is nearly as quick as S2R is.

I do use S2D when travelling, as it's much easier on the batteries while
the notebook is packed away on the red-eye flights.

Cheers

2006-11-23 16:04:08

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> > > > Sane people use suspend-to-ram, and that's when you need the suspend and
> > > > resume debugging.
> > >
> > > Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> > > AGP card, I have no choices except swsusp or reboot.
> >
> > Have you tried s2ram (http://en.opensuse.org/S2ram)?
>
> Cool. That shows potential. On an 2.6.19-rc6-rt4 kernel, it looked
> like it _might_ have eventually gotten past boot. At one line of kernel
> output every ~10 seconds though, I gave up. Virgin 2.6.19-rc6 went
> panic with a black screen... have options, will tinker.

Try that -rt kernel, but turn off dyntick/hires timers. Also try
hitting keyboard while resuming.

--
Thanks for all the (sleeping) penguins.

2006-11-23 16:04:39

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> > And suspend-to-ram doesn't work on quilte a lot of boxes right now. Also, you
> > can use the software suspend on boxes that don't support the suspend-to-ram
> > at all.
>
> One large reason STR often doesn't work is that people don't even test it,
> because people point to the suspend-to-disk instead. suspend-to-disk is
> the problem, not the solution.

Actually, I do not think we do that any more. But if s2ram is broken,
trying s2disk is indeed useful, because many driver problems are
common and s2disk is slightly easier to debug.

Pavel
--
Thanks for all the (sleeping) penguins.

2006-11-23 21:34:37

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Thu, 2006-11-23 at 13:39 +0000, Pavel Machek wrote:
> Hi!
>
> > > > > Sane people use suspend-to-ram, and that's when you need the suspend and
> > > > > resume debugging.
> > > >
> > > > Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> > > > AGP card, I have no choices except swsusp or reboot.
> > >
> > > Have you tried s2ram (http://en.opensuse.org/S2ram)?
> >
> > Cool. That shows potential. On an 2.6.19-rc6-rt4 kernel, it looked
> > like it _might_ have eventually gotten past boot. At one line of kernel
> > output every ~10 seconds though, I gave up. Virgin 2.6.19-rc6 went
> > panic with a black screen... have options, will tinker.
>
> Try that -rt kernel, but turn off dyntick/hires timers. Also try
> hitting keyboard while resuming.

Hmm, I'll try that. Interesting (read odd from my perspective) that the
rt kernel gets much further. This is pretty hard to look at.

2006-11-23 21:41:09

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Thursday, 23 November 2006 22:36, Mike Galbraith wrote:
> On Thu, 2006-11-23 at 13:39 +0000, Pavel Machek wrote:
> > Hi!
> >
> > > > > > Sane people use suspend-to-ram, and that's when you need the suspend and
> > > > > > resume debugging.
> > > > >
> > > > > Here I am wishing I had the _opportunity_ to be sane. With my ATI X850
> > > > > AGP card, I have no choices except swsusp or reboot.
> > > >
> > > > Have you tried s2ram (http://en.opensuse.org/S2ram)?
> > >
> > > Cool. That shows potential. On an 2.6.19-rc6-rt4 kernel, it looked
> > > like it _might_ have eventually gotten past boot. At one line of kernel
> > > output every ~10 seconds though, I gave up. Virgin 2.6.19-rc6 went
> > > panic with a black screen... have options, will tinker.
> >
> > Try that -rt kernel, but turn off dyntick/hires timers. Also try
> > hitting keyboard while resuming.
>
> Hmm, I'll try that. Interesting (read odd from my perspective) that the
> rt kernel gets much further. This is pretty hard to look at.

Is your system an i386?

2006-11-24 06:37:05

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Thu, 2006-11-23 at 22:36 +0100, Rafael J. Wysocki wrote:
> On Thursday, 23 November 2006 22:36, Mike Galbraith wrote:
> > On Thu, 2006-11-23 at 13:39 +0000, Pavel Machek wrote:
> > >
> > > Try that -rt kernel, but turn off dyntick/hires timers. Also try
> > > hitting keyboard while resuming.
> >
> > Hmm, I'll try that. Interesting (read odd from my perspective) that the
> > rt kernel gets much further. This is pretty hard to look at.
>
> Is your system an i386?

Yeah, single P4/HT supermarket box.

I tried the dynticks/hires-timers/kbd suggestion, no difference. It
still boots in medicated snail mode, and emits a stream of IRQ9: nobody
cared messages (fasteoi acpi, irqpoll = nogo) while doing so.

-Mike

2006-11-24 18:09:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND



On Fri, 24 Nov 2006, Mike Galbraith wrote:
>
> I tried the dynticks/hires-timers/kbd suggestion, no difference. It
> still boots in medicated snail mode, and emits a stream of IRQ9: nobody
> cared messages (fasteoi acpi, irqpoll = nogo) while doing so.

"medicated snail mode". Lol.

The "IRQ9: nobody cared" things are not unheard of. The Mac Mini had the
same issue, and in that case it was because the firmware on that machine
was just broken, and didn't re-enable ACPI mode correctly on resume. We
worked around it by forcing a re-enable by hand, so that _exact_ issue
isn't going to be your problem (or it would work since 2.6.18), but it's
not unlikely that there is some other SCI setup that could have been
missed.

One thing that is often worth testing is to try with APIC support if you
don't have it already, or if you do have it, try _without_ APIC support.
The firmware generally is only tested against MS operating systems, so
it's only ever been tested for the particular irq setup that Windows tends
to use, and as a result sometimes things work better in one more or
another.

Based on the "fasteoi", you're obviously right now using the APIC, and
that's _usually_ the mode that works better. But just in case, try booting
with "noapic".

Also, can you send out the boot log with APIC information ("apic=debug").
Of course, don't disable the apic for that case, or it won't be very
useful ;)

Linus

2006-11-24 21:42:45

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Fri, 2006-11-24 at 10:08 -0800, Linus Torvalds wrote:

> One thing that is often worth testing is to try with APIC support if you
> don't have it already, or if you do have it, try _without_ APIC support.
> The firmware generally is only tested against MS operating systems, so
> it's only ever been tested for the particular irq setup that Windows tends
> to use, and as a result sometimes things work better in one more or
> another.

I'll try noapic. (though I value my apic more than s2ram;)

> Based on the "fasteoi", you're obviously right now using the APIC, and
> that's _usually_ the mode that works better. But just in case, try booting
> with "noapic".
>
> Also, can you send out the boot log with APIC information ("apic=debug").
> Of course, don't disable the apic for that case, or it won't be very
> useful ;)

Will do. (not tomorrow though, helping friend with home construction)

I found the driver that was causing my panic problem with vanilla, it
was my Sat TV card. I haven't ever used it with Linux, and haven't used
it with that other OS but a couple of times to try it out(stupid stupid
purchase), so I tossed it.

Box still paniced though, apparently after restoring all drivers (last
trace mark was end of function marker). I disabled trace, and enabled
the "leave console active, danger danger" option in hopes of capturing
the panic, and what do you know, for the first time ever, box restored
all the way into X... only to panic about two seconds later. Nothing
whatsoever got to my serial console. Switching to a text console before
suspend didn't help either, the panic is invisible. Drat.

I'll toss rocks at it.

-Mike

2006-11-25 12:34:00

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> > This might not have been the original purpose of suspend, but it works
> > quite well, STR is obviously not an alternative, and it doesn't matter
> > much whether it takes a minute.
>
> Suspend to ram at the moment is for the very very brave. Having added
> resume quirks to the PCI resume code I've got basic resume behaving on at
> two boxes where resume ate the disks. I've now found a third case in
> testing where str/resume resumes when using Jmicron 365/366 your IDE disks
> swapped over which makes a really nasty mess, and that controllers like
> the SIL680 come back from resume misclocked and so on.

Hmm... how common are these machines? We are using unpatched kernel
for suse10.2... OTOH we only support machines from the whitelist, all
notebooks...
Pavel

--
Thanks for all the (sleeping) penguins.

2006-11-25 14:12:29

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Fri, 2006-11-24 at 10:08 -0800, Linus Torvalds wrote:
>
> On Fri, 24 Nov 2006, Mike Galbraith wrote:
> >
> > I tried the dynticks/hires-timers/kbd suggestion, no difference. It
> > still boots in medicated snail mode, and emits a stream of IRQ9: nobody
> > cared messages (fasteoi acpi, irqpoll = nogo) while doing so.
>
> "medicated snail mode". Lol.

It turns out, that this is caused by console=ttyS0,115200n8. Without
the problematic driver found by pm_trace for the old Pinacle SAT card...

02:02.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
Subsystem: Pinnacle Systems Inc. PCTV Sat (DBC receiver)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (4000ns min, 10000ns max)
Interrupt: pin A routed to IRQ 17
Region 0: Memory at f2100000 (32-bit, prefetchable) [size=4K]
Capabilities: [44] Vital Product Data
Capabilities: [4c] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

went boom here -->02:02.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
Subsystem: Pinnacle Systems Inc. PCTV Sat (DBC receiver)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (1000ns min, 63750ns max)
Interrupt: pin A routed to IRQ 17
Region 0: Memory at f2101000 (32-bit, prefetchable) [size=4K]
Capabilities: [44] Vital Product Data
Capabilities: [4c] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

02:03.0 Multimedia controller: Philips Semiconductors SAA7134 Video Broadcast Decoder (rev 01)
Subsystem: Creatix Polymedia GmbH Medion 7134
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 8000ns max)
Interrupt: pin A routed to IRQ 20
Region 0: Memory at f2002000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [40] Power Management version 1
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-

...and the serial console, I can now suspend to ram just fine with stock
2.6.19-rc6. As a serial port test, I set up a getty on ttyS0, and
logged in via minicom prior to suspend. After resume, the serial port
isn't talking. Kill -9 the shell though, and all is fine again.

> Based on the "fasteoi", you're obviously right now using the APIC, and
> that's _usually_ the mode that works better. But just in case, try booting
> with "noapic".

Makes no difference for vanilla, works just fine either way, I don't
even have to remove vga=0x314. The RT kernel has some issues in this
area with s2ram, but that's a different story.

Hope you're having/had a nice weekend, and thanks for the tips!

-Mike

2006-11-25 16:08:16

by Alan

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

> Hmm... how common are these machines? We are using unpatched kernel
> for suse10.2... OTOH we only support machines from the whitelist, all

I've always said IDE and software suspend are unsafe. The more work I do
the more clearly this is/was the case.

The really nasty "resume eats your disk" cases I know about are
thankfully for older systems - VIA KT133 and similar era chipsets.
There is a recent nasty - Jmicron goes totally to **** on resume because
of resume quirks not being run but it goes so spectacularly wrong it
doesn't seem to get far enough to corrupt.

Lots of other controllers don't work correctly on resume but thats much
less of a problem and with UDMA misclocking generally turns into a CRC
error storm and stop.

Andrew has about 2/3rds of the bits I've done now, will push the rest
when I've done a little more testing/checking. At that point libata ought
to be resume safe. Someone who cares about drivers/ide legacy support can
then copy the work over.

Alan

2006-11-25 17:34:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Saturday, 25 November 2006 17:08, Alan wrote:
> > Hmm... how common are these machines? We are using unpatched kernel
> > for suse10.2... OTOH we only support machines from the whitelist, all
>
> I've always said IDE and software suspend are unsafe. The more work I do
> the more clearly this is/was the case.
>
> The really nasty "resume eats your disk" cases I know about are
> thankfully for older systems - VIA KT133 and similar era chipsets.
> There is a recent nasty - Jmicron goes totally to **** on resume because
> of resume quirks not being run but it goes so spectacularly wrong it
> doesn't seem to get far enough to corrupt.
>
> Lots of other controllers don't work correctly on resume but thats much
> less of a problem and with UDMA misclocking generally turns into a CRC
> error storm and stop.
>
> Andrew has about 2/3rds of the bits I've done now, will push the rest
> when I've done a little more testing/checking. At that point libata ought
> to be resume safe. Someone who cares about drivers/ide legacy support can
> then copy the work over.

So, it seems we should discourage people from suspending with the old IDE
until someone fixes it.

Perhaps we should update the documentation with this information (?)

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-25 17:34:18

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi,

On Saturday, 25 November 2006 15:11, Mike Galbraith wrote:
> On Fri, 2006-11-24 at 10:08 -0800, Linus Torvalds wrote:
> >
> > On Fri, 24 Nov 2006, Mike Galbraith wrote:
> > >
> > > I tried the dynticks/hires-timers/kbd suggestion, no difference. It
> > > still boots in medicated snail mode, and emits a stream of IRQ9: nobody
> > > cared messages (fasteoi acpi, irqpoll = nogo) while doing so.
> >
> > "medicated snail mode". Lol.
>
> It turns out, that this is caused by console=ttyS0,115200n8. Without
> the problematic driver found by pm_trace for the old Pinacle SAT card...
>
> 02:02.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
> Subsystem: Pinnacle Systems Inc. PCTV Sat (DBC receiver)
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 32 (4000ns min, 10000ns max)
> Interrupt: pin A routed to IRQ 17
> Region 0: Memory at f2100000 (32-bit, prefetchable) [size=4K]
> Capabilities: [44] Vital Product Data
> Capabilities: [4c] Power Management version 2
> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>
> went boom here -->02:02.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
> Subsystem: Pinnacle Systems Inc. PCTV Sat (DBC receiver)
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 32 (1000ns min, 63750ns max)
> Interrupt: pin A routed to IRQ 17
> Region 0: Memory at f2101000 (32-bit, prefetchable) [size=4K]
> Capabilities: [44] Vital Product Data
> Capabilities: [4c] Power Management version 2
> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>
> 02:03.0 Multimedia controller: Philips Semiconductors SAA7134 Video Broadcast Decoder (rev 01)
> Subsystem: Creatix Polymedia GmbH Medion 7134
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 32 (2000ns min, 8000ns max)
> Interrupt: pin A routed to IRQ 20
> Region 0: Memory at f2002000 (32-bit, non-prefetchable) [size=1K]
> Capabilities: [40] Power Management version 1
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>
> ...and the serial console, I can now suspend to ram just fine with stock
> 2.6.19-rc6. As a serial port test, I set up a getty on ttyS0, and
> logged in via minicom prior to suspend. After resume, the serial port
> isn't talking. Kill -9 the shell though, and all is fine again.

Hm, could you please file a bugzilla report regarding the serial console for
the information of its maintainer(s)?

Greetings,
Rafael

2006-11-26 04:54:39

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sat, 2006-11-25 at 18:12 +0100, Rafael J. Wysocki wrote:

> Hm, could you please file a bugzilla report regarding the serial console for
> the information of its maintainer(s)?

Yeah, I'll rummage around a bit first though.

-Mike

2006-11-26 07:11:37

by Robert Hancock

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Alan wrote:
> Lots of other controllers don't work correctly on resume but thats much
> less of a problem and with UDMA misclocking generally turns into a CRC
> error storm and stop.
>
> Andrew has about 2/3rds of the bits I've done now, will push the rest
> when I've done a little more testing/checking. At that point libata ought
> to be resume safe. Someone who cares about drivers/ide legacy support can
> then copy the work over.

btw, I have some code almost ready for sata_nv to add proper
suspend/resume support. Unfortunately I have trouble testing it, since
STR doesn't work on my machine since, guess what - the video doesn't
come back! It doesn't even take the monitor out of standby mode. None of
the acpi_sleep options seem to work, and vbetool appears to helpfully
segfault on any operation so that's out. This is an NVIDIA SLI setup so
that probably makes things a bit more complicated.

In any case, it should be better than what we have right now for
suspend/resume support in sata_nv, namely the "do nothing, won't work
(at least not for CK804 and later)" implementation..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2006-11-26 10:23:13

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 26 November 2006 08:11, Robert Hancock wrote:
> Alan wrote:
> > Lots of other controllers don't work correctly on resume but thats much
> > less of a problem and with UDMA misclocking generally turns into a CRC
> > error storm and stop.
> >
> > Andrew has about 2/3rds of the bits I've done now, will push the rest
> > when I've done a little more testing/checking. At that point libata ought
> > to be resume safe. Someone who cares about drivers/ide legacy support can
> > then copy the work over.
>
> btw, I have some code almost ready for sata_nv to add proper
> suspend/resume support. Unfortunately I have trouble testing it, since
> STR doesn't work on my machine since, guess what - the video doesn't
> come back! It doesn't even take the monitor out of standby mode. None of
> the acpi_sleep options seem to work, and vbetool appears to helpfully
> segfault on any operation so that's out.

I guess it's x86-64? Which version of vbetool?

> This is an NVIDIA SLI setup so that probably makes things a bit more
> complicated.

Ouch.

> In any case, it should be better than what we have right now for
> suspend/resume support in sata_nv, namely the "do nothing, won't work
> (at least not for CK804 and later)" implementation..

I think I can test it if you send me the patch.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-11-26 20:36:51

by Robert Hancock

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

--- linux-2.6.19-rc6-mm1-admafixnoresume/drivers/ata/sata_nv.c 2006-11-26 00:53:44.000000000 -0600
+++ linux-2.6.19-rc6-mm1-admafix/drivers/ata/sata_nv.c 2006-11-26 00:54:30.000000000 -0600
@@ -49,7 +49,7 @@
#include <linux/libata.h>

#define DRV_NAME "sata_nv"
-#define DRV_VERSION "3.2"
+#define DRV_VERSION "3.3"

#define NV_ADMA_DMA_BOUNDARY 0xffffffffUL

@@ -213,12 +213,21 @@ struct nv_adma_port_priv {
dma_addr_t cpb_dma;
struct nv_adma_prd *aprd;
dma_addr_t aprd_dma;
+ void __iomem * ctl_block;
+ void __iomem * gen_block;
+ void __iomem * notifier_clear_block;
u8 flags;
};

+struct nv_host_priv {
+ unsigned long type;
+};
+
#define NV_ADMA_CHECK_INTR(GCTL, PORT) ((GCTL) & ( 1 << (19 + (12 * (PORT)))))

static int nv_init_one (struct pci_dev *pdev, const struct pci_device_id *ent);
+static void nv_remove_one (struct pci_dev *pdev);
+static int nv_pci_device_resume(struct pci_dev *pdev);
static void nv_ck804_host_stop(struct ata_host *host);
static irqreturn_t nv_generic_interrupt(int irq, void *dev_instance);
static irqreturn_t nv_nf2_interrupt(int irq, void *dev_instance);
@@ -239,6 +248,8 @@ static irqreturn_t nv_adma_interrupt(int
static void nv_adma_irq_clear(struct ata_port *ap);
static int nv_adma_port_start(struct ata_port *ap);
static void nv_adma_port_stop(struct ata_port *ap);
+static int nv_adma_port_suspend(struct ata_port *ap, pm_message_t mesg);
+static int nv_adma_port_resume(struct ata_port *ap);
static void nv_adma_error_handler(struct ata_port *ap);
static void nv_adma_host_stop(struct ata_host *host);
static void nv_adma_bmdma_setup(struct ata_queued_cmd *qc);
@@ -292,7 +303,9 @@ static struct pci_driver nv_pci_driver =
.name = DRV_NAME,
.id_table = nv_pci_tbl,
.probe = nv_init_one,
- .remove = ata_pci_remove_one,
+ .suspend = ata_pci_device_suspend,
+ .resume = nv_pci_device_resume,
+ .remove = nv_remove_one,
};

static struct scsi_host_template nv_sht = {
@@ -311,6 +324,8 @@ static struct scsi_host_template nv_sht
.slave_configure = ata_scsi_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
+ .suspend = ata_scsi_device_suspend,
+ .resume = ata_scsi_device_resume,
};

static struct scsi_host_template nv_adma_sht = {
@@ -330,6 +345,8 @@ static struct scsi_host_template nv_adma
.slave_configure = nv_adma_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
+ .suspend = ata_scsi_device_suspend,
+ .resume = ata_scsi_device_resume,
};

static const struct ata_port_operations nv_generic_ops = {
@@ -438,6 +455,8 @@ static const struct ata_port_operations
.scr_write = nv_scr_write,
.port_start = nv_adma_port_start,
.port_stop = nv_adma_port_stop,
+ .port_suspend = nv_adma_port_suspend,
+ .port_resume = nv_adma_port_resume,
.host_stop = nv_adma_host_stop,
};

@@ -492,32 +511,10 @@ MODULE_VERSION(DRV_VERSION);

static int adma_enabled = 1;

-static inline void __iomem *__nv_adma_ctl_block(void __iomem *mmio,
- unsigned int port_no)
-{
- mmio += NV_ADMA_PORT + port_no * NV_ADMA_PORT_SIZE;
- return mmio;
-}
-
-static inline void __iomem *nv_adma_ctl_block(struct ata_port *ap)
-{
- return __nv_adma_ctl_block(ap->host->mmio_base, ap->port_no);
-}
-
-static inline void __iomem *nv_adma_gen_block(struct ata_port *ap)
-{
- return (ap->host->mmio_base + NV_ADMA_GEN);
-}
-
-static inline void __iomem *nv_adma_notifier_clear_block(struct ata_port *ap)
-{
- return (nv_adma_gen_block(ap) + NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no));
-}
-
static void nv_adma_register_mode(struct ata_port *ap)
{
- void __iomem *mmio = nv_adma_ctl_block(ap);
struct nv_adma_port_priv *pp = ap->private_data;
+ void __iomem *mmio = pp->ctl_block;
u16 tmp;

if (pp->flags & NV_ADMA_PORT_REGISTER_MODE)
@@ -531,8 +528,8 @@ static void nv_adma_register_mode(struct

static void nv_adma_mode(struct ata_port *ap)
{
- void __iomem *mmio = nv_adma_ctl_block(ap);
struct nv_adma_port_priv *pp = ap->private_data;
+ void __iomem *mmio = pp->ctl_block;
u16 tmp;

if (!(pp->flags & NV_ADMA_PORT_REGISTER_MODE))
@@ -693,7 +690,7 @@ static void nv_adma_check_cpb(struct ata
For NCQ commands the current status may have nothing to do with
the command just completed. */
if(qc->tf.protocol != ATA_PROT_NCQ)
- ata_status = readb(nv_adma_ctl_block(ap) + (ATA_REG_STATUS * 4));
+ ata_status = readb(pp->ctl_block + (ATA_REG_STATUS * 4));

if(have_err || force_err)
ata_status |= ATA_ERR;
@@ -751,7 +748,7 @@ static irqreturn_t nv_adma_interrupt(int

if (ap && !(ap->flags & ATA_FLAG_DISABLED)) {
struct nv_adma_port_priv *pp = ap->private_data;
- void __iomem *mmio = nv_adma_ctl_block(ap);
+ void __iomem *mmio = pp->ctl_block;
u16 status;
u32 gen_ctl;
int have_global_err = 0;
@@ -769,7 +766,7 @@ static irqreturn_t nv_adma_interrupt(int
notifier_error = readl(mmio + NV_ADMA_NOTIFIER_ERROR);
notifier_clears[i] = notifier | notifier_error;

- gen_ctl = readl(nv_adma_gen_block(ap) + NV_ADMA_GEN_CTL);
+ gen_ctl = readl(pp->gen_block + NV_ADMA_GEN_CTL);

if( !NV_ADMA_CHECK_INTR(gen_ctl, ap->port_no) && !notifier &&
!notifier_error)
@@ -827,10 +824,10 @@ static irqreturn_t nv_adma_interrupt(int
if(notifier_clears[0] || notifier_clears[1]) {
/* Note: Both notifier clear registers must be written
if either is set, even if one is zero, according to NVIDIA. */
- writel(notifier_clears[0],
- nv_adma_notifier_clear_block(host->ports[0]));
- writel(notifier_clears[1],
- nv_adma_notifier_clear_block(host->ports[1]));
+ struct nv_adma_port_priv *pp = host->ports[0]->private_data;
+ writel(notifier_clears[0], pp->notifier_clear_block);
+ pp = host->ports[1]->private_data;
+ writel(notifier_clears[1], pp->notifier_clear_block);
}

spin_unlock(&host->lock);
@@ -840,7 +837,8 @@ static irqreturn_t nv_adma_interrupt(int

static void nv_adma_irq_clear(struct ata_port *ap)
{
- void __iomem *mmio = nv_adma_ctl_block(ap);
+ struct nv_adma_port_priv *pp = ap->private_data;
+ void __iomem *mmio = pp->ctl_block;
u16 status = readw(mmio + NV_ADMA_STAT);
u32 notifier = readl(mmio + NV_ADMA_NOTIFIER);
u32 notifier_error = readl(mmio + NV_ADMA_NOTIFIER_ERROR);
@@ -849,7 +847,7 @@ static void nv_adma_irq_clear(struct ata
/* clear ADMA status */
writew(status, mmio + NV_ADMA_STAT);
writel(notifier | notifier_error,
- nv_adma_notifier_clear_block(ap));
+ pp->notifier_clear_block);

/** clear legacy status */
outb(inb(dma_stat_addr), dma_stat_addr);
@@ -931,7 +929,7 @@ static int nv_adma_port_start(struct ata
int rc;
void *mem;
dma_addr_t mem_dma;
- void __iomem *mmio = nv_adma_ctl_block(ap);
+ void __iomem *mmio;
u16 tmp;

VPRINTK("ENTER\n");
@@ -946,6 +944,13 @@ static int nv_adma_port_start(struct ata
goto err_out;
}

+ mmio = ap->host->mmio_base + NV_ADMA_PORT +
+ ap->port_no * NV_ADMA_PORT_SIZE;
+ pp->ctl_block = mmio;
+ pp->gen_block = ap->host->mmio_base + NV_ADMA_GEN;
+ pp->notifier_clear_block = pp->gen_block +
+ NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no);
+
mem = dma_alloc_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ,
&mem_dma, GFP_KERNEL);

@@ -986,9 +991,9 @@ static int nv_adma_port_start(struct ata
/* clear CPB fetch count */
writew(0, mmio + NV_ADMA_CPB_COUNT);

- /* clear GO for register mode */
+ /* clear GO for register mode, enable interrupt */
tmp = readw(mmio + NV_ADMA_CTL);
- writew(tmp & ~NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);
+ writew( (tmp & ~NV_ADMA_CTL_GO) | NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);

tmp = readw(mmio + NV_ADMA_CTL);
writew(tmp | NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
@@ -1010,7 +1015,7 @@ static void nv_adma_port_stop(struct ata
{
struct device *dev = ap->host->dev;
struct nv_adma_port_priv *pp = ap->private_data;
- void __iomem *mmio = nv_adma_ctl_block(ap);
+ void __iomem *mmio = pp->ctl_block;

VPRINTK("ENTER\n");

@@ -1022,6 +1027,55 @@ static void nv_adma_port_stop(struct ata
ata_port_stop(ap);
}

+static int nv_adma_port_suspend(struct ata_port *ap, pm_message_t mesg)
+{
+ struct nv_adma_port_priv *pp = ap->private_data;
+ void __iomem *mmio = pp->ctl_block;
+
+ /* Go to register mode - clears GO */
+ nv_adma_register_mode(ap);
+
+ /* clear CPB fetch count */
+ writew(0, mmio + NV_ADMA_CPB_COUNT);
+
+ /* disable interrupt, shut down port */
+ writew(0, mmio + NV_ADMA_CTL);
+
+ return 0;
+}
+
+static int nv_adma_port_resume(struct ata_port *ap)
+{
+ struct nv_adma_port_priv *pp = ap->private_data;
+ void __iomem *mmio = pp->ctl_block;
+ u16 tmp;
+
+ /* set CPB block location */
+ writel(pp->cpb_dma & 0xFFFFFFFF, mmio + NV_ADMA_CPB_BASE_LOW);
+ writel((pp->cpb_dma >> 16 ) >> 16, mmio + NV_ADMA_CPB_BASE_HIGH);
+
+ /* clear any outstanding interrupt conditions */
+ writew(0xffff, mmio + NV_ADMA_STAT);
+
+ /* initialize port variables */
+ pp->flags |= NV_ADMA_PORT_REGISTER_MODE;
+
+ /* clear CPB fetch count */
+ writew(0, mmio + NV_ADMA_CPB_COUNT);
+
+ /* clear GO for register mode, enable interrupt */
+ tmp = readw(mmio + NV_ADMA_CTL);
+ writew((tmp & ~NV_ADMA_CTL_GO) | NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);
+
+ tmp = readw(mmio + NV_ADMA_CTL);
+ writew(tmp | NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
+ readl( mmio + NV_ADMA_CTL ); /* flush posted write */
+ udelay(1);
+ writew(tmp & ~NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
+ readl( mmio + NV_ADMA_CTL ); /* flush posted write */
+
+ return 0;
+}

static void nv_adma_setup_port(struct ata_probe_ent *probe_ent, unsigned int port)
{
@@ -1067,15 +1121,6 @@ static int nv_adma_host_init(struct ata_
for (i = 0; i < probe_ent->n_ports; i++)
nv_adma_setup_port(probe_ent, i);

- for (i = 0; i < probe_ent->n_ports; i++) {
- void __iomem *mmio = __nv_adma_ctl_block(probe_ent->mmio_base, i);
- u16 tmp;
-
- /* enable interrupt, clear reset if not already clear */
- tmp = readw(mmio + NV_ADMA_CTL);
- writew(tmp | NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);
- }
-
return 0;
}

@@ -1129,8 +1174,6 @@ static void nv_adma_qc_prep(struct ata_q
NV_CPB_CTL_APRD_VALID |
NV_CPB_CTL_IEN;

- VPRINTK("qc->flags = 0x%lx\n", qc->flags);
-
if (!(qc->flags & ATA_QCFLAG_DMAMAP) ||
(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE)) {
nv_adma_register_mode(qc->ap);
@@ -1148,6 +1191,8 @@ static void nv_adma_qc_prep(struct ata_q
if (qc->tf.protocol == ATA_PROT_NCQ)
ctl_flags |= NV_CPB_CTL_QUEUE | NV_CPB_CTL_FPDMA;

+ VPRINTK("qc->flags = 0x%lx\n", qc->flags);
+
nv_adma_tf_to_cpb(&qc->tf, cpb->tf);

nv_adma_fill_sg(qc, cpb);
@@ -1161,7 +1206,7 @@ static void nv_adma_qc_prep(struct ata_q
static unsigned int nv_adma_qc_issue(struct ata_queued_cmd *qc)
{
struct nv_adma_port_priv *pp = qc->ap->private_data;
- void __iomem *mmio = nv_adma_ctl_block(qc->ap);
+ void __iomem *mmio = pp->ctl_block;

VPRINTK("ENTER\n");

@@ -1346,13 +1391,13 @@ static void nv_adma_error_handler(struct
{
struct nv_adma_port_priv *pp = ap->private_data;
if(!(pp->flags & NV_ADMA_PORT_REGISTER_MODE)) {
- void __iomem *mmio = nv_adma_ctl_block(ap);
+ void __iomem *mmio = pp->ctl_block;
int i;
u16 tmp;

u32 notifier = readl(mmio + NV_ADMA_NOTIFIER);
u32 notifier_error = readl(mmio + NV_ADMA_NOTIFIER_ERROR);
- u32 gen_ctl = readl(nv_adma_gen_block(ap) + NV_ADMA_GEN_CTL);
+ u32 gen_ctl = readl(pp->gen_block + NV_ADMA_GEN_CTL);
u32 status = readw(mmio + NV_ADMA_STAT);

ata_port_printk(ap, KERN_ERR, "EH in ADMA mode, notifier 0x%X "
@@ -1397,6 +1442,7 @@ static int nv_init_one (struct pci_dev *
static int printed_version = 0;
struct ata_port_info *ppi[2];
struct ata_probe_ent *probe_ent;
+ struct nv_host_priv *hpriv;
int pci_dev_busy = 0;
int rc;
u32 bar;
@@ -1411,7 +1457,7 @@ static int nv_init_one (struct pci_dev *
if (pci_resource_start(pdev, bar) == 0)
return -ENODEV;

- if ( !printed_version++)
+ if (!printed_version++)
dev_printk(KERN_DEBUG, &pdev->dev, "version " DRV_VERSION "\n");

rc = pci_enable_device(pdev);
@@ -1442,6 +1488,10 @@ static int nv_init_one (struct pci_dev *
}

rc = -ENOMEM;
+
+ hpriv = kmalloc(sizeof(*hpriv), GFP_KERNEL);
+ if (!hpriv)
+ goto err_out_regions;

ppi[0] = ppi[1] = &nv_port_info[type];
probe_ent = ata_pci_init_native_mode(pdev, ppi, ATA_PORT_PRIMARY | ATA_PORT_SECONDARY);
@@ -1453,6 +1503,8 @@ static int nv_init_one (struct pci_dev *
rc = -EIO;
goto err_out_free_ent;
}
+ probe_ent->private_data = hpriv;
+ hpriv->type = type;

base = (unsigned long)probe_ent->mmio_base;

@@ -1497,6 +1549,60 @@ err_out:
return rc;
}

+static void nv_remove_one (struct pci_dev *pdev)
+{
+ struct ata_host *host = dev_get_drvdata(&pdev->dev);
+ struct nv_host_priv *hpriv = host->private_data;
+
+ kfree(hpriv);
+ ata_pci_remove_one(pdev);
+}
+
+static int nv_pci_device_resume(struct pci_dev *pdev)
+{
+ struct ata_host *host = dev_get_drvdata(&pdev->dev);
+ struct nv_host_priv *hpriv = host->private_data;
+
+ ata_pci_device_do_resume(pdev);
+
+ if (pdev->dev.power.power_state.event == PM_EVENT_SUSPEND) {
+ if(hpriv->type >= CK804) {
+ u8 regval;
+
+ pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
+ regval |= NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
+ pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
+ }
+ if(hpriv->type == ADMA) {
+ u32 tmp32;
+ struct nv_adma_port_priv *pp;
+ /* enable/disable ADMA on the ports appropriately */
+ pci_read_config_dword(pdev, NV_MCP_SATA_CFG_20, &tmp32);
+
+ pp = host->ports[0]->private_data;
+ if(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE)
+ tmp32 &= ~(NV_MCP_SATA_CFG_20_PORT0_EN |
+ NV_MCP_SATA_CFG_20_PORT0_PWB_EN);
+ else
+ tmp32 |= (NV_MCP_SATA_CFG_20_PORT0_EN |
+ NV_MCP_SATA_CFG_20_PORT0_PWB_EN);
+ pp = host->ports[1]->private_data;
+ if(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE)
+ tmp32 &= ~(NV_MCP_SATA_CFG_20_PORT1_EN |
+ NV_MCP_SATA_CFG_20_PORT1_PWB_EN);
+ else
+ tmp32 |= (NV_MCP_SATA_CFG_20_PORT1_EN |
+ NV_MCP_SATA_CFG_20_PORT1_PWB_EN);
+
+ pci_write_config_dword(pdev, NV_MCP_SATA_CFG_20, tmp32);
+ }
+ }
+
+ ata_host_resume(host);
+
+ return 0;
+}
+
static void nv_ck804_host_stop(struct ata_host *host)
{
struct pci_dev *pdev = to_pci_dev(host->dev);
@@ -1513,18 +1619,8 @@ static void nv_ck804_host_stop(struct at
static void nv_adma_host_stop(struct ata_host *host)
{
struct pci_dev *pdev = to_pci_dev(host->dev);
- int i;
u32 tmp32;

- for (i = 0; i < host->n_ports; i++) {
- void __iomem *mmio = __nv_adma_ctl_block(host->mmio_base, i);
- u16 tmp;
-
- /* disable interrupt */
- tmp = readw(mmio + NV_ADMA_CTL);
- writew(tmp & ~NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);
- }
-
/* disable ADMA on the ports */
pci_read_config_dword(pdev, NV_MCP_SATA_CFG_20, &tmp32);
tmp32 &= ~(NV_MCP_SATA_CFG_20_PORT0_EN |


Attachments:
sata_nv-add-suspend-resume-support.patch (14.60 kB)

2006-11-26 20:52:56

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> > Hmm... how common are these machines? We are using unpatched kernel
> > for suse10.2... OTOH we only support machines from the whitelist, all
>
> I've always said IDE and software suspend are unsafe. The more work I do
> the more clearly this is/was the case.

Well, there's unsafe as in "crashes", and that's unsafe as in "eats
disks".

> The really nasty "resume eats your disk" cases I know about are
> thankfully for older systems - VIA KT133 and similar era chipsets.

Aha, good. Hopefully noone has notebook with those.

> There is a recent nasty - Jmicron goes totally to **** on resume because
> of resume quirks not being run but it goes so spectacularly wrong it
> doesn't seem to get far enough to corrupt.

Good :-). Crashing is nasty, but we probably won't add that machine to
whitelist.

> Andrew has about 2/3rds of the bits I've done now, will push the rest
> when I've done a little more testing/checking. At that point libata ought
> to be resume safe. Someone who cares about drivers/ide legacy support can
> then copy the work over.

Thanks. I do not think we care about old mainboards enough to do
2.6.16-stable backport.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-11-27 13:51:58

by Matthew Garrett

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, Nov 26, 2006 at 02:30:15PM -0600, Robert Hancock wrote:

> Yes, it's x86-64, with whatever version of vbetool comes with Fedora Core 5.

Ought to be fixed in the next release of vbetool. The x86emu code in the
current version is a touch broken, and nvidia bioses seem to trip it
quite well.

2006-11-27 20:01:24

by Stefan Seyfried

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, Nov 26, 2006 at 01:11:33AM -0600, Robert Hancock wrote:
> options seem to work, and vbetool appears to helpfully segfault on any operation so that's out.

Try this one:

From: Matthew Garrett
Subject: Fix failures on AMD64

This patch fixes at least some of the cases where vbetool segfaulted on x86_64
while the x86 emulator was executing BIOS code.

--- x86-common.c
+++ x86-common.c
@@ -33,8 +33,8 @@

#include "include/lrmi.h"

-#define REAL_MEM_BASE ((void *)0x10000)
-#define REAL_MEM_SIZE 0x90000
+#define REAL_MEM_BASE ((void *)0x1000)
+#define REAL_MEM_SIZE 0xa0000
#define REAL_MEM_BLOCKS 0x100

struct mem_block {


I have this in our vbetool-0.7 packages and have no reports about
segfaults since then.
--
Stefan Seyfried
QA / R&D Team Mobile Devices | "Any ideas, John?"
SUSE LINUX Products GmbH, N?rnberg | "Well, surrounding them's out."

2006-11-28 10:02:55

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun, 2006-11-26 at 05:53 +0100, Mike Galbraith wrote:
> On Sat, 2006-11-25 at 18:12 +0100, Rafael J. Wysocki wrote:
>
> > Hm, could you please file a bugzilla report regarding the serial console for
> > the information of its maintainer(s)?
>
> Yeah, I'll rummage around a bit first though.

The serial console appears to be innocent. The suspend/resume methods
for my 16550A serial port aren't even being _called_, apparently because
pnp swiped ttyS0. Disabling pnp support, the port suspends and resumes
just peachy. /sys has a different layout without pnp enabled. With it,
there are platform/ttyS[1-3] entries, but ttyS0 is in pnp. Without pnp
support, the platform/ttyS* entries are gone, as is of course pnp/ttyS0.

I guess I'll have to switch "rummage" directories.

-Mike

2006-11-29 10:56:18

by Mike Galbraith

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Tue, 2006-11-28 at 11:01 +0100, Mike Galbraith wrote:
> On Sun, 2006-11-26 at 05:53 +0100, Mike Galbraith wrote:
> > On Sat, 2006-11-25 at 18:12 +0100, Rafael J. Wysocki wrote:
> >
> > > Hm, could you please file a bugzilla report regarding the serial console for
> > > the information of its maintainer(s)?
> >
> > Yeah, I'll rummage around a bit first though.
>
> The serial console appears to be innocent. The suspend/resume methods
> for my 16550A serial port aren't even being _called_, apparently because
> pnp swiped ttyS0.

Well, after further rummaging, the suspend/reCONFIG_PNPsume methods
aren't being called because we're using 8250_pnp.c when doesn't have
any :)

2006-11-29 11:05:14

by Mike Galbraith

[permalink] [raw]
Subject: [rfc patch] Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Wed, 2006-11-29 at 11:21 +0100, Mike Galbraith wrote:
> > The serial console appears to be innocent. The suspend/resume methods
> > for my 16550A serial port aren't even being _called_, apparently because
> > pnp swiped ttyS0.

(ahem, bad aim with mouse, resuming)

Well, after further rummaging, the suspend/resume methods aren't being
called because we're using 8250_pnp.c when CONFIG_PNP is set, and it
doesn't have any :) The below works for me, though I'm not sure it's
sufficient. If it is deemed sufficient, I'll submit it to Andrew.

Add suspend/resume methods to 8250_pnp driver.

--- linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c.org 2006-11-29 07:14:15.000000000 +0100
+++ linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c 2006-11-29 10:55:17.000000000 +0100
@@ -464,11 +464,38 @@ static void __devexit serial_pnp_remove(
serial8250_unregister_port(line - 1);
}

+#ifdef CONFIG_PM
+static int serial_pnp_suspend(struct pnp_dev *dev, pm_message_t state)
+{
+ long line = (long)pnp_get_drvdata(dev);
+
+ if (!line)
+ return -ENODEV;
+ serial8250_suspend_port(line - 1);
+ return 0;
+}
+
+static int serial_pnp_resume(struct pnp_dev *dev)
+{
+ long line = (long)pnp_get_drvdata(dev);
+
+ if (!line)
+ return -ENODEV;
+ serial8250_resume_port(line - 1);
+ return 0;
+}
+
+#endif /* CONFIG_PM */
+
static struct pnp_driver serial_pnp_driver = {
.name = "serial",
- .id_table = pnp_dev_table,
.probe = serial_pnp_probe,
.remove = __devexit_p(serial_pnp_remove),
+#ifdef CONFIG_PM
+ .suspend = serial_pnp_suspend,
+ .resume = serial_pnp_resume,
+#endif
+ .id_table = pnp_dev_table,
};

static int __init serial8250_pnp_init(void)





2006-11-29 14:16:11

by Pavel Machek

[permalink] [raw]
Subject: Re: [rfc patch] Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

Hi!

> On Wed, 2006-11-29 at 11:21 +0100, Mike Galbraith wrote:
> > > The serial console appears to be innocent. The suspend/resume methods
> > > for my 16550A serial port aren't even being _called_, apparently because
> > > pnp swiped ttyS0.
>
> (ahem, bad aim with mouse, resuming)
>
> Well, after further rummaging, the suspend/resume methods aren't being
> called because we're using 8250_pnp.c when CONFIG_PNP is set, and it
> doesn't have any :) The below works for me, though I'm not sure it's
> sufficient. If it is deemed sufficient, I'll submit it to Andrew.
>
> Add suspend/resume methods to 8250_pnp driver.

Patch looks okay to me... care to sign-it-off and submit it to akpm?

Pavel

> --- linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c.org 2006-11-29 07:14:15.000000000 +0100
> +++ linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c 2006-11-29 10:55:17.000000000 +0100
> @@ -464,11 +464,38 @@ static void __devexit serial_pnp_remove(
> serial8250_unregister_port(line - 1);
> }
>
> +#ifdef CONFIG_PM
> +static int serial_pnp_suspend(struct pnp_dev *dev, pm_message_t state)
> +{
> + long line = (long)pnp_get_drvdata(dev);
> +
> + if (!line)
> + return -ENODEV;
> + serial8250_suspend_port(line - 1);
> + return 0;
> +}
> +
> +static int serial_pnp_resume(struct pnp_dev *dev)
> +{
> + long line = (long)pnp_get_drvdata(dev);
> +
> + if (!line)
> + return -ENODEV;
> + serial8250_resume_port(line - 1);
> + return 0;
> +}
> +
> +#endif /* CONFIG_PM */
> +
> static struct pnp_driver serial_pnp_driver = {
> .name = "serial",
> - .id_table = pnp_dev_table,
> .probe = serial_pnp_probe,
> .remove = __devexit_p(serial_pnp_remove),
> +#ifdef CONFIG_PM
> + .suspend = serial_pnp_suspend,
> + .resume = serial_pnp_resume,
> +#endif
> + .id_table = pnp_dev_table,
> };
>
> static int __init serial8250_pnp_init(void)
>
>
>
>
>

--
Thanks for all the (sleeping) penguins.

2006-11-29 19:57:45

by Andrew Morton

[permalink] [raw]
Subject: Re: [rfc patch] Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Wed, 29 Nov 2006 11:30:31 +0100
Mike Galbraith <[email protected]> wrote:

> On Wed, 2006-11-29 at 11:21 +0100, Mike Galbraith wrote:
> > > The serial console appears to be innocent. The suspend/resume methods
> > > for my 16550A serial port aren't even being _called_, apparently because
> > > pnp swiped ttyS0.
>
> (ahem, bad aim with mouse, resuming)
>
> Well, after further rummaging, the suspend/resume methods aren't being
> called because we're using 8250_pnp.c when CONFIG_PNP is set, and it
> doesn't have any :) The below works for me, though I'm not sure it's
> sufficient. If it is deemed sufficient, I'll submit it to Andrew.
>
> Add suspend/resume methods to 8250_pnp driver.
>
> --- linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c.org 2006-11-29 07:14:15.000000000 +0100
> +++ linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c 2006-11-29 10:55:17.000000000 +0100
> @@ -464,11 +464,38 @@ static void __devexit serial_pnp_remove(
> serial8250_unregister_port(line - 1);
> }
>
> +#ifdef CONFIG_PM
> +static int serial_pnp_suspend(struct pnp_dev *dev, pm_message_t state)
> +{
> + long line = (long)pnp_get_drvdata(dev);

Please avoid adding long lines. (heh, I kill me)

> + if (!line)
> + return -ENODEV;
> + serial8250_suspend_port(line - 1);
> + return 0;
> +}
> +
> +static int serial_pnp_resume(struct pnp_dev *dev)
> +{
> + long line = (long)pnp_get_drvdata(dev);
> +
> + if (!line)
> + return -ENODEV;
> + serial8250_resume_port(line - 1);
> + return 0;
> +}

We'd usually do

#else
#define serial_pnp_suspend NULL
#define serial_pnp_resume NULL

here

> +
> +#endif /* CONFIG_PM */
> +
> static struct pnp_driver serial_pnp_driver = {
> .name = "serial",
> - .id_table = pnp_dev_table,
> .probe = serial_pnp_probe,
> .remove = __devexit_p(serial_pnp_remove),
> +#ifdef CONFIG_PM
> + .suspend = serial_pnp_suspend,
> + .resume = serial_pnp_resume,
> +#endif

and hence omit the ifdefs here.

2006-11-30 04:14:18

by Mike Galbraith

[permalink] [raw]
Subject: Re: [rfc patch] Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Wed, 2006-11-29 at 11:49 -0800, Andrew Morton wrote:

> > +#ifdef CONFIG_PM
> > +static int serial_pnp_suspend(struct pnp_dev *dev, pm_message_t state)
> > +{
> > + long line = (long)pnp_get_drvdata(dev);
>
> Please avoid adding long lines. (heh, I kill me)

Ok. I also changed the place I got it from.

> We'd usually do
>
> #else
> #define serial_pnp_suspend NULL
> #define serial_pnp_resume NULL
>
> here
>
> > +
> > +#endif /* CONFIG_PM */
> > +
> > static struct pnp_driver serial_pnp_driver = {
> > .name = "serial",
> > - .id_table = pnp_dev_table,
> > .probe = serial_pnp_probe,
> > .remove = __devexit_p(serial_pnp_remove),
> > +#ifdef CONFIG_PM
> > + .suspend = serial_pnp_suspend,
> > + .resume = serial_pnp_resume,
> > +#endif
>
> and hence omit the ifdefs here.

New patch.

Add suspend/resume methods to drivers/serial/8250_pnp.c. Tested on a
P4/HT 16550A box, ttyS0 login survives across suspend to ram.

Signed-off-by: Mike Galbraith <[email protected]>

--- linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c.org 2006-11-29 07:14:15.000000000 +0100
+++ linux-2.6.19-rc6-mm2/drivers/serial/8250_pnp.c 2006-11-29 20:49:33.000000000 +0100
@@ -459,16 +459,43 @@ serial_pnp_probe(struct pnp_dev *dev, co

static void __devexit serial_pnp_remove(struct pnp_dev *dev)
{
- long line = (long)pnp_get_drvdata(dev);
+ int line = (long)pnp_get_drvdata(dev);
if (line)
serial8250_unregister_port(line - 1);
}

+#ifdef CONFIG_PM
+static int serial_pnp_suspend(struct pnp_dev *dev, pm_message_t state)
+{
+ int line = (int)pnp_get_drvdata(dev);
+
+ if (!line)
+ return -ENODEV;
+ serial8250_suspend_port(line - 1);
+ return 0;
+}
+
+static int serial_pnp_resume(struct pnp_dev *dev)
+{
+ int line = (int)pnp_get_drvdata(dev);
+
+ if (!line)
+ return -ENODEV;
+ serial8250_resume_port(line - 1);
+ return 0;
+}
+#else
+#define serial_pnp_suspend NULL
+#define serial_pnp_resume NULL
+#endif /* CONFIG_PM */
+
static struct pnp_driver serial_pnp_driver = {
.name = "serial",
- .id_table = pnp_dev_table,
.probe = serial_pnp_probe,
.remove = __devexit_p(serial_pnp_remove),
+ .suspend = serial_pnp_suspend,
+ .resume = serial_pnp_resume,
+ .id_table = pnp_dev_table,
};

static int __init serial8250_pnp_init(void)


2006-12-03 12:54:45

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sunday, 26 November 2006 21:30, Robert Hancock wrote:
> Rafael J. Wysocki wrote:
> >> btw, I have some code almost ready for sata_nv to add proper
> >> suspend/resume support. Unfortunately I have trouble testing it, since
> >> STR doesn't work on my machine since, guess what - the video doesn't
> >> come back! It doesn't even take the monitor out of standby mode. None of
> >> the acpi_sleep options seem to work, and vbetool appears to helpfully
> >> segfault on any operation so that's out.
> >
> > I guess it's x86-64? Which version of vbetool?
>
> Yes, it's x86-64, with whatever version of vbetool comes with Fedora Core 5.
>
> >
> >> This is an NVIDIA SLI setup so that probably makes things a bit more
> >> complicated.
> >
> > Ouch.
> >
> >> In any case, it should be better than what we have right now for
> >> suspend/resume support in sata_nv, namely the "do nothing, won't work
> >> (at least not for CK804 and later)" implementation..
> >
> > I think I can test it if you send me the patch.
> >
> > Greetings,
> > Rafael
>
> Here it is attached. This patch needs to be applied to 2.6.19-rc6-mm1 on
> top of the other one I just submitted, "[PATCH -mm] sata_nv: fix ATAPI
> in ADMA mode". I don't think it's too horribly broken since it did what
> it should have on some aborted suspend-to-disk attempts. (It looks like
> STD doesn't work either, it complains there is no swap available even
> though there is.. maybe because there's 2GB of RAM and 2GB of swap? It
> should still fit though, I would think, as not all RAM is in use..) In
> any case, if it works out OK then I can submit this formally.

I coudn't test it earlier, sorry.

On top of 2.6.19-rc6-mm2 applies and works (although the box also resumes
just fine without it ;-)).

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-12-03 21:40:15

by Kyle Moffett

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Nov 19, 2006, at 17:04:23, Christer Weinigel wrote:
> With suspend-to-disk I can remove the battery (to put in a fresh
> battery when traveling), try doing that with suspend-to-ram.

My PowerBook can do this with suspend-to-ram too; it has an internal
capacitor or battery of some sort which charges in a few minutes and
holds enough power to keep the RAM alive for around a minute while I
swap batteries.

Cheers,
Kyle Moffett



2006-12-04 10:50:39

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Sun 2006-12-03 16:40:04, Kyle Moffett wrote:
> On Nov 19, 2006, at 17:04:23, Christer Weinigel wrote:
> >With suspend-to-disk I can remove the battery (to put in a fresh
> >battery when traveling), try doing that with suspend-to-ram.
>
> My PowerBook can do this with suspend-to-ram too; it has an internal
> capacitor or battery of some sort which charges in a few minutes and
> holds enough power to keep the RAM alive for around a minute while I
> swap batteries.

Well.. OTOH your powerbook is probably the _only_ machine that can do
that :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Subject: Re: [patch] PM: suspend/resume debugging should depend on SOFTWARE_SUSPEND

On Mon, 04 Dec 2006, Pavel Machek wrote:
> On Sun 2006-12-03 16:40:04, Kyle Moffett wrote:
> > On Nov 19, 2006, at 17:04:23, Christer Weinigel wrote:
> > >With suspend-to-disk I can remove the battery (to put in a fresh
> > >battery when traveling), try doing that with suspend-to-ram.
> >
> > My PowerBook can do this with suspend-to-ram too; it has an internal
> > capacitor or battery of some sort which charges in a few minutes and
> > holds enough power to keep the RAM alive for around a minute while I
> > swap batteries.
>
> Well.. OTOH your powerbook is probably the _only_ machine that can do
> that :-).

Well, most ThinkPads can do that if you stick a secondary bay battery into
it, but I feel the powerbook solution is a lot more elegant :)

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh