2002-12-27 14:12:18

by bert hubert

Subject: swsusp in 2.5.53 BUG on kernel/suspend.c line 718

Hi!

I wanted to try software suspend again in Linux as 2.5 is doing almost
everything pretty well for me already.

I boot my uniprocessor Pentium III laptop with:

kernel (hd0,0)/boot/vmlinuz-2.5.53 root=/dev/hda1 resume=/dev/hda2

# swapon -s
Filename        Type            Size    Used    Priority
/dev/hda2       partition       489972  0       -1

$ cat /proc/meminfo
MemTotal: 191240 kB

When I suspend, things proceed swimmingly: I see a lot of dots printed and
processes entering the refrigerator, until line 718 is hit in
kernel/suspend.c:

	if (nr_copy_pages != count_and_copy_data_pages(pagedir_nosave)) /* copy */
		BUG();

When I added some printks, it turned out that count_and_copy_data_pages()
returns 5440 (decimal) and that nr_copy_pages is 5458, 18 more. Before this
function is called, the address c034c000 was printed twice, prefixed with
'nosave', apparently once during each call of count_and_copy_data_pages().

So it appears some pages were freed in the critical section!
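
The mismatch above can be illustrated with a toy model in plain C. This is
only a sketch under assumptions, not kernel code: it assumes the check works
as a count-only pass followed by a count-and-copy pass, and the names
count_and_copy(), snapshot_mismatch() and NPAGES are hypothetical. If pages
are freed between the two passes, the second pass finds fewer pages than the
first one promised, which is the condition that triggers BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16 /* hypothetical size of the toy "memory" */

/*
 * map[i] == true means page i holds data that must be saved.
 * With dst == NULL the function only counts such pages (first pass);
 * with dst != NULL it also records the index of each page (second pass).
 */
static size_t count_and_copy(const bool *map, size_t n, size_t *dst)
{
    size_t count = 0;

    assert(n <= NPAGES);
    for (size_t i = 0; i < n; i++) {
        if (!map[i])
            continue;
        if (dst)
            dst[count] = i;
        count++;
    }
    return count;
}

/*
 * Runs both passes and frees `freed_between` pages in between.
 * Returns expected - actual; any nonzero value corresponds to the
 * situation in which the real check would hit BUG().
 */
static long snapshot_mismatch(bool *map, size_t n, size_t freed_between)
{
    size_t saved[NPAGES];
    size_t expected = count_and_copy(map, n, NULL);

    /* Simulate pages being freed inside the critical section. */
    for (size_t i = 0; i < n && freed_between > 0; i++) {
        if (map[i]) {
            map[i] = false;
            freed_between--;
        }
    }

    size_t actual = count_and_copy(map, n, saved);
    return (long)expected - (long)actual;
}
```

With nothing freed between the passes the difference is zero; freeing pages
in between yields a positive difference, matching the sign of the
5458 - 5440 = 18 discrepancy reported above.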

Another interesting note is that pdflush reported 'Bogus wakeup' twice
during the refrigeration phase. I also see two pdflushes running.

If I remove the BUG(), it crashes on resume with an unhandled NULL pointer;
the EIP is in a function aptly named do_magic(), at +0x9e.

Compiler is gcc 3.2.1. Anything I can do to help, just let me know!

Regards,

bert

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO
http://netherlabs.nl Consulting


2002-12-27 15:01:14

by Pavel Machek

Subject: Re: swsusp in 2.5.53 BUG on kernel/suspend.c line 718

You need a one-liner to fix this; search the mailing lists.

Pavel

--
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

2002-12-27 18:26:18

by bert hubert

Subject: Re: swsusp in 2.5.53 BUG on kernel/suspend.c line 718

On Fri, Dec 27, 2002 at 04:09:30PM +0100, Pavel Machek wrote:
> You need a one-liner to fix this; search the mailing lists.

Your patch below indeed works, except for my network adaptor which needs
'ifconfig eth0 down', 'ifconfig eth0 up' before it works again.

It says:

NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 00000000 00000240
00:01.1 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet (rev 82)

Can a non-guru add the magic handlers to network drivers to make them wake
up again properly?

Thanks!

--- clean/mm/page_alloc.c	2002-12-18 22:21:13.000000000 +0100
+++ linux-swsusp/mm/page_alloc.c	2002-12-18 22:30:47.000000000 +0100
@@ -389,7 +389,7 @@
 	unsigned long flags;
 	struct page *page = NULL;
 
-	if (order == 0) {
+	if ((order == 0) && !cold) {
 		struct per_cpu_pages *pcp;
 
 		pcp = &zone->pageset[get_cpu()].pcp[cold];


