by Sam Ravnborg

Subject: Re: Long delay in resume from RAM (Was Re: [patch 00/69] -stable review)

On Thu, May 24, 2007 at 02:21:26PM -0700, Andrew Morton wrote:
>
> It's not a matter of when it's evaluated. The user is supposed to be
> able to set EXTRA_CFLAGS on the command-line, yes? If they do that then
> the "=" in there will rub out their efforts. The makefiles should be
> appending new things to EXTRA_CFLAGS, rather than doing a replacement?

There is no way to specify additional CFLAGS on the commandline today.
For sparse we took the shorthand CF so you can do:

make C=2 CF=-warn-bitwise

But we have no such thing for CFLAGS.
If there is a real need I can cook up something.
But frankly I have always edited the top-level Makefile and been done with it.

I will fix it so Kbuild does not barf if you set EXTRA_* on the command line
but silently ignores it instead.

Sam

2007-05-25 05:18:56

Hi.

On Tue, 2007-05-29 at 14:33 -0700, Linus Torvalds wrote:
>
> On Wed, 30 May 2007, Nigel Cunningham wrote:
> >
> > On Tue, 2007-05-29 at 10:19 -0400, Mark Lord wrote:
> > >
> > > How about blocking brk() and mmap(MAP_ANONYMOUS) in addition to
> > > the filesystem VFS callers? Or is that starting to get messy again?
> >
> > Yeah. Getting messy again :)
>
> Indeed. And also misses the point - the point being that we don't actually
> need to freeze anything at all most of the time. There's nothing wrong
> with making memory allocations etc.
>
> And yes, suspend is different from hibernate. I can see how hibernate
> people are worried about people writing to things after doing the
> snapshot, but those concerns don't exist with suspend. With suspend, the
> biggest concern is accessing a device after it has been suspended, but on
> the other hand, also the fact that we end up having driver writers used
> to the system being "runnable", so they do things that really do require a
> full-fledged system (and sometimes that means just some delayed action
> using a kernel thread, other times it seems to rely on more complex
> behaviour like firmware loading :^p )

Yeah, but they can't. Even after the freezing of processes has been
removed from the normal suspend-to-RAM path, we're still going to have
this issue on the path that suspends to RAM after writing a hibernation
image.

Regards,

Nigel

2007-05-30 10:27:15

by Romano Giannetti

Subject: Re: pcmcia resume 60 second hang. Re: [patch 00/69] -stable review

On Tue, 2007-05-29 at 07:55 -0700, Linus Torvalds wrote:
>
> On Tue, 29 May 2007, Romano Giannetti wrote:
> >
> > - The good (?) news. I have made 7 suspend/resume cycles (to RAM, I
> > haven't tested hibernation) with a 2.6.21.2 with that patch, applied
> > manually. The system did suspend and resume nicely even compiling a
> > kernel and opening openoffice. Normally (let me stress _normally_) no
> > delay was apparent on resume. I do not know how dangerous this is... :-)
> >
> > - The bad (?) news. One time out of 7 I had the 60 seconds delay.
>
> Interesting. If you can re-create it, please do the sysrq-T thing again,
> to see what's up. (Also, you might do "sysrq-p", which gives the current
> process data, which sysrq-T does not).

I've got it, but I had a problem: I filled the dmesg buffer. I will try
to find where to enlarge it. I have posted the partial result to:

http://www.dea.icai.upcomillas.es/romano/linux/info/dmesg-resume-nofreeze.txt

in the hope that something can be used. I am running 2.6.21.2, with the
"no freeze kthreads at all" patch from Matthew Garrett, with this
add-on:

--- drivers/base/firmware_class.c.orig 2007-05-30 12:19:59.000000000 +0200
+++ drivers/base/firmware_class.c 2007-05-29 19:39:56.000000000 +0200
@@ -471,7 +471,11 @@
struct device *device)
{
int uevent = 1;
- return _request_firmware(firmware_p, name, device, uevent);
+ int rval;
+ printk(KERN_ERR "FW: requesting firmware (sync) for %s\n", name);
+ rval = _request_firmware(firmware_p, name, device, uevent);
+ printk(KERN_ERR "FW: return %d\n", rval);
+ return rval;
}

/**
@@ -545,7 +549,9 @@
struct task_struct *task;
struct firmware_work *fw_work = kmalloc(sizeof (struct firmware_work),
GFP_ATOMIC);
-
+
+ printk(KERN_ERR "FW: requesting firmware (async) for %s\n", name);
+
if (!fw_work)
return -ENOMEM;
if (!try_module_get(module)) {
@@ -569,8 +575,12 @@
fw_work->cont(NULL, fw_work->context);
module_put(fw_work->module);
kfree(fw_work);
+ printk(KERN_ERR "FW: failing return %ld\n", PTR_ERR(task));
return PTR_ERR(task);
}
+
+ printk(KERN_ERR "FW: normal return\n");
+
return 0;
}
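
For anyone who wants to try the same entry/exit tracing pattern outside
the kernel, here is a minimal userspace sketch. It is only an
illustration, not the kernel code: fake_request_firmware(),
traced_request_firmware() and the file name "known.bin" are made-up
stand-ins, and fprintf(stderr, ...) takes the place of printk().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for _request_firmware(): succeeds for one
 * known name and fails with -ENOENT for anything else.
 */
static int fake_request_firmware(const char *name)
{
	if (strcmp(name, "known.bin") == 0)
		return 0;
	return -ENOENT;
}

/*
 * Traced wrapper: log on entry, call the real function, log the
 * result, then pass the result through unchanged.
 */
static int traced_request_firmware(const char *name)
{
	int rval;

	assert(name != NULL);
	fprintf(stderr, "FW: requesting firmware (sync) for %s\n", name);
	rval = fake_request_firmware(name);
	fprintf(stderr, "FW: return %d\n", rval);
	return rval;
}
```

Calling traced_request_firmware("known.bin") returns 0 and any other
name returns -ENOENT. Because every call logs both a request line and
a return line, a request line with no matching return line in the log
points at a call that has not come back yet.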

--
Romano Giannetti --- [email protected]
Sorry for the following disclaimer, it's attached by our outgoing server
and I cannot shut it up.

--
This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation.

2007-05-30 11:49:43

by Pavel Machek

Subject: Re: pcmcia resume 60 second hang. Re: [patch 00/69] -stable review

Hi!

> > > How about blocking brk() and mmap(MAP_ANONYMOUS) in addition to
> > > the filesystem VFS callers? Or is that starting to get messy again?
> >
> > Yeah. Getting messy again :)
>
> Indeed. And also misses the point - the point being that we don't actually
> need to freeze anything at all most of the time. There's nothing wrong
> with making memory allocations etc.
>
> And yes, suspend is different from hibernate. I can see how hibernate
> people are worried about people writing to things after doing the
> snapshot, but those concerns don't exist with suspend. With suspend, the
> biggest concern is accessing a device after it has been suspended, but on
> the other hand, also the fact that we end up having driver writers used
> to the system being "runnable", so they do things that really do require a
> full-fledged system (and sometimes that means just some delayed action
> using a kernel thread, other times it seems to rely on more complex
> behaviour like firmware loading :^p )

Notice that we want to be able to suspend while hibernating -- for
suspend to both behaviour. So drivers may _not_ rely on system being
runnable.

(Suspend to both is: write the image to disk, then suspend to RAM. If you
do not run out of battery, resume is from RAM and fast; if you do, you
can still resume from disk without losing your data.)
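
The resume decision described above can be sketched as a small
userspace C function. This is only an illustration under stated
assumptions: struct sleep_state, enum resume_path and
pick_resume_path() are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a "suspend to both" cycle. */
enum resume_path { RESUME_FROM_RAM, RESUME_FROM_DISK, RESUME_NONE };

/* Hypothetical description of what survived while the machine slept. */
struct sleep_state {
	bool image_on_disk;	/* hibernation image was written before sleeping */
	bool ram_preserved;	/* power held, so RAM contents survived */
};

/*
 * Prefer the fast path from RAM; fall back to the on-disk image if
 * power was lost; otherwise there is nothing to resume from.
 */
static enum resume_path pick_resume_path(const struct sleep_state *s)
{
	assert(s != NULL);
	if (s->ram_preserved)
		return RESUME_FROM_RAM;
	if (s->image_on_disk)
		return RESUME_FROM_DISK;
	return RESUME_NONE;
}
```

The on-disk image is only a fallback here, which is why it has to be
written before the suspend to RAM.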
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-05-30 13:08:49

by Matthew Garrett

Subject: Re: pcmcia resume 60 second hang. Re: [patch 00/69] -stable review

On Wed, May 30, 2007 at 01:49:21PM +0200, Pavel Machek wrote:

(Trimmed the Cc:s quite heavily - I think this has gone somewhere beyond
the original point)

> Notice that we want to be able to suspend while hibernating -- for
> suspend to both behaviour. So drivers may _not_ rely on system being
> runnable.

So keep the driver layers read-only and unfreeze the processes after
doing the atomic copy.

--
Matthew Garrett | [email protected]

2007-05-30 13:17:55

Hi.

On Wed, 2007-05-30 at 16:04 +0200, Rafael J. Wysocki wrote:
> On Wednesday, 30 May 2007 15:17, Nigel Cunningham wrote:
> > On Wed, 2007-05-30 at 13:40 +0100, Matthew Garrett wrote:
> > > On Wed, May 30, 2007 at 01:49:21PM +0200, Pavel Machek wrote:
> > >
> > > (Trimmed the Cc:s quite heavily - I think this has gone somewhere beyond
> > > the original point)
> > >
> > > > Notice that we want to be able to suspend while hibernating -- for
> > > > suspend to both behaviour. So drivers may _not_ rely on system being
> > > > runnable.
> > >
> > > So keep the driver layers read-only and unfreeze the processes after
> > > doing the atomic copy.
> >
> > I know you probably won't care, but that's not an option for Suspend2 -
> > I get the possibility of a full image by overwriting LRU pages that were
> > saved prior to the atomic copy.
>
> This generally is a problem, not only for suspend2. :-)
>
> Once you've unfrozen the user land, we can't rely on the hibernation image any
> more, because some tasks may cause the on-disk filesystems' state to change.

True. I understood, perhaps wrongly, that when Matthew spoke of keeping
the driver layers read-only, he meant stopping filesystem changes
by some other means.

Regards,

Nigel
