2005-04-10 23:18:20

by Andreas Steinmetz

[permalink] [raw]
Subject: Oops in swsusp

Pavel,
during testing of the encrypted swsusp_image on x86_64 I did get an Oops
from time to time at memcpy+11 called from swsusp_save+1090 which turns
out to be the memcpy in copy_data_pages() of swsusp.c.
The Oops is caused by a NULL pointer (I don't remember if it was source
or destination).
This Oops seems to be unrelated to the encrypted image addition as I
didn't touch any code in that area.
--
Andreas Steinmetz SPAMmers use [email protected]


2005-04-11 06:58:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Oops in swsusp

Hi,

On Monday, 11 of April 2005 01:17, Andreas Steinmetz wrote:
> Pavel,
> during testing of the encrypted swsusp_image on x86_64 I did get an Oops
> from time to time at memcpy+11 called from swsusp_save+1090 which turns
> out to be the memcpy in copy_data_pages() of swsusp.c.
> The Oops is caused by a NULL pointer (I don't remember if it was source
> or destination).

It's quite important, however. If it's the destination, it's probably a bug in
swsusp. Otherwise, the problem is more serious. Could you, please,
add BUG_ON(!pbe->address) right before the memcpy() and retest?

Greets,
Rafael


--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

2005-04-11 11:00:53

by Pavel Machek

[permalink] [raw]
Subject: Re: Oops in swsusp

Hi!

> during testing of the encrypted swsusp_image on x86_64 I did get an Oops
> from time to time at memcpy+11 called from swsusp_save+1090 which turns
> out to be the memcpy in copy_data_pages() of swsusp.c.
> The Oops is caused by a NULL pointer (I don't remember if it was source
> or destination).
> This Oops seems to be unrelated to the encrypted image addition as I
> didn't touch any code in that area.

Could you still try to reproduce without any patches? Alocating memory
at wrong moment may cause something like that, and crypto routines
might do that.
Pavel
--
Boycott Kodak -- for their patent abuse against Java.

2005-04-11 13:09:49

by Andreas Steinmetz

[permalink] [raw]
Subject: Re: Oops in swsusp

Rafael J. Wysocki wrote:
> Hi,
>
> On Monday, 11 of April 2005 01:17, Andreas Steinmetz wrote:
>
>>Pavel,
>>during testing of the encrypted swsusp_image on x86_64 I did get an Oops
>>from time to time at memcpy+11 called from swsusp_save+1090 which turns
>>out to be the memcpy in copy_data_pages() of swsusp.c.
>>The Oops is caused by a NULL pointer (I don't remember if it was source
>>or destination).
>
>
> It's quite important, however. If it's the destination, it's probably a bug in
> swsusp. Otherwise, the problem is more serious. Could you, please,
> add BUG_ON(!pbe->address) right before the memcpy() and retest?

I'll try.

--
Andreas Steinmetz SPAMmers use [email protected]

2005-04-11 16:13:45

by Andreas Steinmetz

[permalink] [raw]
Subject: Re: Oops in swsusp

Rafael J. Wysocki wrote:
> Hi,
>
> On Monday, 11 of April 2005 01:17, Andreas Steinmetz wrote:
>
>>Pavel,
>>during testing of the encrypted swsusp_image on x86_64 I did get an Oops
>>from time to time at memcpy+11 called from swsusp_save+1090 which turns
>>out to be the memcpy in copy_data_pages() of swsusp.c.
>>The Oops is caused by a NULL pointer (I don't remember if it was source
>>or destination).
>
>
> It's quite important, however. If it's the destination, it's probably a bug in
> swsusp. Otherwise, the problem is more serious. Could you, please,
> add BUG_ON(!pbe->address) right before the memcpy() and retest?

Here's the result:

BUG_ON(!pbe->address) hits, so is seems to be a swsusp problem.

So I added a BUG_ON(!to_copy) as shown below (mangled for mailing):

if (saveable(zone, &zone_pfn)) {
struct page * page;
BUG_ON(!to_copy);
page = pfn_to_page(zone_pfn + zone->zone_start_pfn);
pbe->orig_address = (long) page_address(page);
/* copy_page is not usable for copying task structs. */
BUG_ON(!pbe->address);
memcpy((void *)pbe->address, (void *)pbe->orig_address,
PAGE_SIZE);
pbe++;
to_copy--;
}

The BUG_ON(!to_copy) hits, too.

So it seems that the result sum of saveable() increases between
count_data_pages() and copy_data_pages(). The problem seems to be
easiest hit when starting a larger GUI application and suspending during
application startup. It does ususally take a few suspend/resume
iterations to hit the bug.

Pavel,
the crypto stuff in swsusp.c starts in data_write() which is called
after copy_data_pages() if I'm right. In this case the crypto stuff
can't be the culprit.
--
Andreas Steinmetz SPAMmers use [email protected]

2005-04-11 17:06:58

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Oops in swsusp

Hi,

On Monday, 11 of April 2005 18:01, Andreas Steinmetz wrote:
> Rafael J. Wysocki wrote:
> > Hi,
> >
> > On Monday, 11 of April 2005 01:17, Andreas Steinmetz wrote:
> >
> >>Pavel,
> >>during testing of the encrypted swsusp_image on x86_64 I did get an Oops
> >>from time to time at memcpy+11 called from swsusp_save+1090 which turns
> >>out to be the memcpy in copy_data_pages() of swsusp.c.
> >>The Oops is caused by a NULL pointer (I don't remember if it was source
> >>or destination).
> >
> >
> > It's quite important, however. If it's the destination, it's probably a bug in
> > swsusp. Otherwise, the problem is more serious. Could you, please,
> > add BUG_ON(!pbe->address) right before the memcpy() and retest?
>
> Here's the result:
>
> BUG_ON(!pbe->address) hits, so is seems to be a swsusp problem.
>
> So I added a BUG_ON(!to_copy) as shown below (mangled for mailing):
>
> if (saveable(zone, &zone_pfn)) {
> struct page * page;
> BUG_ON(!to_copy);
> page = pfn_to_page(zone_pfn + zone->zone_start_pfn);
> pbe->orig_address = (long) page_address(page);
> /* copy_page is not usable for copying task structs. */
> BUG_ON(!pbe->address);
> memcpy((void *)pbe->address, (void *)pbe->orig_address,
> PAGE_SIZE);
> pbe++;
> to_copy--;
> }
>
> The BUG_ON(!to_copy) hits, too.

Oh, I see. You have discovered a bug in the code that was replaced
in 2.6.12-rc1. :-) Please use 2.6.12-rc2 or the latest -mm for
testing/development of swsusp.

Greets,
Rafael


--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"