2010-07-30 05:00:48

by Bojan Smojver

[permalink] [raw]
Subject: [PATCH]: Compress hibernation image with LZO (in-kernel)

This patch speeds up hibernate/thaw operations (the in-kernel ones) by
compressing the pages using LZO. Tests on my ThinkPad T510 show that
hibernation times can be cut by a factor of about 4 for an image of
around 1 GB (from about a minute down to about 15 seconds).

I'm not really familiar with kernel internals, so many of the things in
this patch could be very, very wrong. But go easy - this is essentially
the first time I've attempted something like this.

PS. I'm not on the list, so please CC me if you'd like to contact me.

--
Bojan


Attachments:
hibernate-lzo.patch (6.53 kB)

2010-07-30 10:44:31

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Fri, 2010-07-30 at 14:46 +1000, Bojan Smojver wrote:
> This patch speeds up hibernate/thaw operations

This is probably a bit simpler.

--
Bojan


Attachments:
hibernate-lzo.patch (6.46 kB)

2010-07-30 22:05:19

by Nigel Cunningham

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi Bojan.

On 30/07/10 20:44, Bojan Smojver wrote:
> On Fri, 2010-07-30 at 14:46 +1000, Bojan Smojver wrote:
>> This patch speeds up hibernate/thaw operations
>
> This is probably a bit simpler.

In general, it looks good. Is an order 6 allocation really necessary,
though? I guess they'll be more reliably achieved with swsusp freeing so
much memory beforehand but still...

Regards,

Nigel

2010-07-30 22:20:04

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 08:05 +1000, Nigel Cunningham wrote:
> In general, it looks good. Is an order 6 allocation really necessary,
> though?

To be honest, I don't really know. I kinda guessed that the more data
the compress routine has, the better it will be able to squeeze it down.
I can do some tests with 5 or even 4. Will let you know.

--
Bojan

2010-07-30 23:22:58

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 08:05 +1000, Nigel Cunningham wrote:
> Is an order 6 allocation really necessary,
> though?

I tried with 5 and 4 and got slightly lower throughput numbers. With 6,
I was getting 145 to 150 MB/s, with 4 I'm getting around 130 MB/s (this
is all on hibernate).

So, here is one with 4.

PS. I guess with this, sync_read can simply disappear as well.

--
Bojan


Attachments:
hibernate-lzo.patch (6.38 kB)

2010-07-30 23:40:27

by Nigel Cunningham

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi again.

On 31/07/10 09:22, Bojan Smojver wrote:
> On Sat, 2010-07-31 at 08:05 +1000, Nigel Cunningham wrote:
>> Is an order 6 allocation really necessary,
>> though?
>
> I tried with 5 and 4 and got slightly lower throughput numbers. With 6,
> I was getting 145 to 150 MB/s, with 4 I'm getting around 130 MB/s (this
> is all on hibernate).
>
> So, here is one with 4.

How about vmallocing the cmp as well? That would greatly reduce the
potential for page allocation failures while still letting you use an
order 6 area.

I might try this for TuxOnIce too - I'm only using order 0 allocations
at the moment and have been wondering how I can get the higher write
speed I think should be possible on my system :) I was thinking along
the lines of locking, but perhaps that was the wrong area to pursue! :)

> PS. I guess with this, sync_read can simply disappear as well.

I haven't looked at the code for a while, but it might still be needed
for the header? I know that in TuxOnIce, I need to read the first page
synchronously when bootstrapping reading the image (can't read the next
page until you know where it is, and its location is on the first page).
Since swsusp uses those index pages, I think it would have the same
issue - they would need to be read before it could read the following
pages. Of course I'm going off memory :)

Nigel

2010-07-31 01:03:11

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 09:40 +1000, Nigel Cunningham wrote:

> How about vmallocing the cmp as well? That would greatly reduce the
> potential for page allocation failures while still letting you use an
> order 6 area.

In save_image(), that worked. In load_image() it would cause a crash
(something about kernel not being able to satisfy paging request). So, I
just made it __get_free_pages() instead. But, yeah good point.

Keep in mind that I have absolutely no idea how kernel memory allocation
works. I'm kinda copying and pasting code and hoping it doesn't crash :-)

> > PS. I guess with this, sync_read can simply disappear as well.
>
> I haven't looked at the code for a while, but it might still be needed
> for the header? I know that in TuxOnIce, I need to read the first page
> synchronously when bootstrapping reading the image (can't read the next
> page until you know where it is, and its location is on the first page).
> Since swsusp uses those index pages, I think it would have the same
> issue - they would need to be read before it could read the following
> pages. Of course I'm going off memory :)

I think it can go, because the header is already read/written with &bio
set to NULL (sync read). See patch to remove sync_read.

--
Bojan


Attachments:
hibernate-no-sync_read.patch (1.48 kB)

2010-07-31 01:19:00

by Nigel Cunningham

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

On 31/07/10 11:03, Bojan Smojver wrote:
> On Sat, 2010-07-31 at 09:40 +1000, Nigel Cunningham wrote:
>
>> How about vmallocing the cmp as well? That would greatly reduce the
>> potential for page allocation failures while still letting you use an
>> order 6 area.
>
> In save_image(), that worked. In load_image() it would cause a crash
> (something about kernel not being able to satisfy paging request). So, I
> just made it __get_free_pages() instead. But, yeah good point.
>
> Keep in mind that I have absolutely no idea how kernel memory allocation
> works. I'm kinda copying and pasting code and hoping it doesn't crash :-)

It should be possible to do the allocation at that point. I might see if
I can take a look later on.

>>> PS. I guess with this, sync_read can simply disappear as well.
>>
>> I haven't looked at the code for a while, but it might still be needed
>> for the header? I know that in TuxOnIce, I need to read the first page
>> synchronously when bootstrapping reading the image (can't read the next
>> page until you know where it is, and its location is on the first page).
>> Since swsusp uses those index pages, I think it would have the same
>> issue - they would need to be read before it could read the following
>> pages. Of course I'm going off memory :)
>
> I think it can go, because the header is already read/written with &bio
> set to NULL (sync read). See patch to remove sync_read.

Okee doke.

By the way, please inline your patches. It makes it much easier to read
and comment on them.

Oh, and I've said it privately but not publicly: great work!

Regards,

Nigel

2010-07-31 01:33:58

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 11:18 +1000, Nigel Cunningham wrote:
> It should be possible to do the allocation at that point. I might see
> if I can take a look later on.

Maybe it's related to the fact that with vmalloc() you get a different
thing than with __get_free_pages() in terms of using it with devices.

I found this in Linux Device Drivers:
---------------
Addresses allocated by vmalloc can't be used outside of the
microprocessor, because they make sense only on top of the processor's
MMU. When a driver needs a real physical address (such as a DMA address,
used by peripheral hardware to drive the system's bus), you can't easily
use vmalloc. The right time to call vmalloc is when you are allocating
memory for a large sequential buffer that exists only in software.
---------------

I'm guessing the page allocated by vmalloc() eventually gets passed down
the I/O chain and is supposed to be read into by some hardware, which
then causes a crash. (He, he... kernel according to Bojan - please
ignore :-).

In the first iteration of the patch, I allocated just one page using
__get_free_page() and used that for I/O operations. The only overhead
there was memcpy() from the page to vmalloc()-ed cmp. I can go back to
that easily. It didn't actually make any significant difference to
performance.

--
Bojan

2010-07-31 04:41:30

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 11:33 +1000, Bojan Smojver wrote:
> I can go back to that easily.

So, here is that whole enchilada one more time (it includes sync_read
removal patch as well).

I did 3 hibernate/thaw cycles with it. Images varied from about 850 MB,
1.1 GB to 750 MB. I was getting 156/141 MB/s, 121/118 MB/s and 141/130
MB/s speeds. Obviously, these things depend on compression ratios
achieved etc.

I guess the number of pages (i.e. LZO_UNC_PAGES) could be made
configurable as well.

PS. Inline, as requested.

diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index ca6066a..cb57eb9 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -137,6 +137,8 @@ config SUSPEND_FREEZER
config HIBERNATION
bool "Hibernation (aka 'suspend to disk')"
depends on PM && SWAP && ARCH_HIBERNATION_POSSIBLE
+ select LZO_COMPRESS
+ select LZO_DECOMPRESS
select SUSPEND_NVS if HAS_IOMEM
---help---
Enable the suspend to disk (STD) functionality, which is usually
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 006270f..a760cf8 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -103,10 +103,6 @@ struct snapshot_handle {
void *buffer; /* address of the block to read from
* or write to
*/
- int sync_read; /* Set to one to notify the caller of
- * snapshot_write_next() that it may
- * need to call wait_on_bio_chain()
- */
};

/* This macro returns the address from/to which the caller of
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 25ce010..f24ee24 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -2135,8 +2135,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages)
return 0;

- handle->sync_read = 1;
-
if (!handle->cur) {
if (!buffer)
/* This makes the buffer be freed by swsusp_free() */
@@ -2169,7 +2167,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
memory_bm_position_reset(&orig_bm);
restore_pblist = NULL;
handle->buffer = get_buffer(&orig_bm, &ca);
- handle->sync_read = 0;
if (IS_ERR(handle->buffer))
return PTR_ERR(handle->buffer);
}
@@ -2178,8 +2175,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
handle->buffer = get_buffer(&orig_bm, &ca);
if (IS_ERR(handle->buffer))
return PTR_ERR(handle->buffer);
- if (handle->buffer != buffer)
- handle->sync_read = 0;
}
handle->cur++;
return PAGE_SIZE;
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index b0bb217..49a9d30 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -24,6 +24,7 @@
#include <linux/swapops.h>
#include <linux/pm.h>
#include <linux/slab.h>
+#include <linux/lzo.h>

#include "power.h"

@@ -357,6 +358,13 @@ static int swap_writer_finish(struct swap_map_handle *handle,
return error;
}

+#define LZO_HEADER sizeof(size_t)
+#define LZO_UNC_PAGES 64
+#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
+#define LZO_CMP_PAGES DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
+ LZO_HEADER, PAGE_SIZE)
+#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
+
/**
* save_image - save the suspend image data
*/
@@ -372,6 +380,38 @@ static int save_image(struct swap_map_handle *handle,
struct bio *bio;
struct timeval start;
struct timeval stop;
+ size_t ul, cl;
+ unsigned char *unc, *cmp, *wrk, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ wrk = vmalloc(LZO1X_1_MEM_COMPRESS);
+ if (!wrk) {
+ printk(KERN_ERR "PM: Failed to allocate LZO workspace\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }
+
+ unc = vmalloc(LZO_UNC_SIZE);
+ if (!unc) {
+ printk(KERN_ERR "PM: Failed to allocate LZO uncompressed\n");
+ vfree(wrk);
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }
+
+ cmp = vmalloc(LZO_CMP_SIZE);
+ if (!cmp) {
+ printk(KERN_ERR "PM: Failed to allocate LZO compressed\n");
+ vfree(unc);
+ vfree(wrk);
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Saving image data pages (%u pages) ... ",
nr_to_write);
@@ -382,16 +422,48 @@ static int save_image(struct swap_map_handle *handle,
bio = NULL;
do_gettimeofday(&start);
while (1) {
- ret = snapshot_read_next(snapshot);
- if (ret <= 0)
+ for (ul = 0; ul < LZO_UNC_SIZE; ul += PAGE_SIZE) {
+ ret = snapshot_read_next(snapshot);
+ if (ret < 0)
+ goto out_finish;
+
+ if (ret == 0)
+ break;
+
+ memcpy(unc + ul, data_of(*snapshot), PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+ }
+
+ if (ul == 0)
+ break;
+
+ ret = lzo1x_1_compress(unc, ul, cmp + LZO_HEADER, &cl, wrk);
+ if (ret < 0) {
+ printk(KERN_ERR "PM: LZO compression failed\n");
break;
- ret = swap_write_page(handle, data_of(*snapshot), &bio);
- if (ret)
+ }
+
+ if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ ret = -1;
break;
- if (!(nr_pages % m))
- printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ *(size_t *)cmp = cl;
+
+ for (ul = 0; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
+ memcpy(page, cmp + ul, PAGE_SIZE);
+
+ ret = swap_write_page(handle, page, &bio);
+ if (ret)
+ goto out_finish;
+ }
}
+
+out_finish:
err2 = hib_wait_on_bio_chain(&bio);
do_gettimeofday(&stop);
if (!ret)
@@ -401,6 +473,12 @@ static int save_image(struct swap_map_handle *handle,
else
printk(KERN_CONT "\n");
swsusp_show_speed(&start, &stop, nr_to_write, "Wrote");
+
+ vfree(cmp);
+ vfree(unc);
+ vfree(wrk);
+ free_page((unsigned long)page);
+
return ret;
}

@@ -416,7 +494,8 @@ static int enough_swap(unsigned int nr_pages)
unsigned int free_swap = count_swap_pages(root_swap, 1);

pr_debug("PM: Free swap pages: %u\n", free_swap);
- return free_swap > nr_pages + PAGES_FOR_IO;
+ return free_swap >
+ (nr_pages * LZO_CMP_PAGES) / LZO_UNC_PAGES + PAGES_FOR_IO;
}

/**
@@ -547,9 +626,30 @@ static int load_image(struct swap_map_handle *handle,
int error = 0;
struct timeval start;
struct timeval stop;
- struct bio *bio;
- int err2;
unsigned nr_pages;
+ size_t ul, cl;
+ unsigned char *unc, *cmp, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ unc = vmalloc(LZO_UNC_SIZE);
+ if (!unc) {
+ printk(KERN_ERR "PM: Failed to allocate LZO uncompressed\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }
+
+ cmp = vmalloc(LZO_CMP_SIZE);
+ if (!cmp) {
+ printk(KERN_ERR "PM: Failed to allocate LZO compressed\n");
+ vfree(unc);
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Loading image data pages (%u pages) ... ",
nr_to_read);
@@ -557,27 +657,60 @@ static int load_image(struct swap_map_handle *handle,
if (!m)
m = 1;
nr_pages = 0;
- bio = NULL;
do_gettimeofday(&start);
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+
for ( ; ; ) {
- error = snapshot_write_next(snapshot);
- if (error <= 0)
- break;
- error = swap_read_page(handle, data_of(*snapshot), &bio);
+ error = swap_read_page(handle, page, NULL); /* sync */
if (error)
break;
- if (snapshot->sync_read)
- error = hib_wait_on_bio_chain(&bio);
- if (error)
+ memcpy(cmp, page, PAGE_SIZE);
+
+ cl = *(size_t *)cmp;
+ if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
+ break;
+ }
+
+ for (ul = PAGE_SIZE; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
+ error = swap_read_page(handle, page, NULL); /* sync */
+ if (error)
+ goto out_finish;
+ memcpy(cmp + ul, page, PAGE_SIZE);
+ }
+
+ ul = LZO_UNC_SIZE;
+ error = lzo1x_decompress_safe(cmp + LZO_HEADER, cl, unc, &ul);
+ if (error < 0) {
+ printk(KERN_ERR "PM: LZO decompression failed\n");
break;
- if (!(nr_pages % m))
- printk("\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ if (unlikely(ul == 0 || ul > LZO_UNC_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
+ break;
+ }
+
+ for (cl = 0; cl < ul; cl += PAGE_SIZE) {
+ memcpy(data_of(*snapshot), unc + cl, PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk("\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+ }
}
- err2 = hib_wait_on_bio_chain(&bio);
+
+out_finish:
do_gettimeofday(&stop);
- if (!error)
- error = err2;
if (!error) {
printk("\b\b\b\bdone\n");
snapshot_write_finalize(snapshot);
@@ -586,6 +719,11 @@ static int load_image(struct swap_map_handle *handle,
} else
printk("\n");
swsusp_show_speed(&start, &stop, nr_to_read, "Read");
+
+ vfree(cmp);
+ vfree(unc);
+ free_page((unsigned long)page);
+
return error;
}


--
Bojan

2010-07-31 05:03:28

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 2010-07-31 at 14:41 +1000, Bojan Smojver wrote:
> I did 3 hibernate/thaw cycles with it. Images varied from about 850
> MB, 1.1 GB to 750 MB. I was getting 156/141 MB/s, 121/118 MB/s and
> 141/130 MB/s speeds. Obviously, these things depend on compression
> ratios achieved etc.

For the record, my machine usually does around 32 MB/s on hibernate and
54 MB/s on thaw with regular in-kernel swsusp.

--
Bojan

2010-08-02 00:22:42

by Kamezawa Hiroyuki

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Sat, 31 Jul 2010 14:41:26 +1000
Bojan Smojver <[email protected]> wrote:

> On Sat, 2010-07-31 at 11:33 +1000, Bojan Smojver wrote:
> > I can go back to that easily.
>
> So, here is that whole enchilada one more time (it includes sync_read
> removal patch as well).
>
> I did 3 hibernate/thaw cycles with it. Images varied from about 850 MB,
> 1.1 GB to 750 MB. I was getting 156/141 MB/s, 121/118 MBs and 141/130
> MBs speeds. Obviously, these things depend on compression ratios
> achieved etc.
>
> I guess the number of pages (i.e. LZO_UNC_PAGES) could be made
> configurable as well.
>
> PS. Inline, as requested.
>


I'm sorry if I miss something.


> + wrk = vmalloc(LZO1X_1_MEM_COMPRESS);
> + if (!wrk) {
> + printk(KERN_ERR "PM: Failed to allocate LZO workspace\n");
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
> +
> + unc = vmalloc(LZO_UNC_SIZE);
> + if (!unc) {
> + printk(KERN_ERR "PM: Failed to allocate LZO uncompressed\n");
> + vfree(wrk);
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }

Now, vmalloc() is used here. Then, the following will happen.

1. vmalloc()
-> vmalloc adds vmap objects and sets page table entries.

2. saving image
-> When taking the snapshot of memory to disk, the above vmalloc() area is
saved to disk as it is.
...
3. At restore
Because you don't remember which vmalloc() area was used for creating
the snapshot, you can't free it at swsusp_free().

memory leak ?

Thanks,
-Kame

> +
> + cmp = vmalloc(LZO_CMP_SIZE);
> + if (!cmp) {
> + printk(KERN_ERR "PM: Failed to allocate LZO compressed\n");
> + vfree(unc);
> + vfree(wrk);
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
>
> printk(KERN_INFO "PM: Saving image data pages (%u pages) ... ",
> nr_to_write);
> @@ -382,16 +422,48 @@ static int save_image(struct swap_map_handle *handle,
> bio = NULL;
> do_gettimeofday(&start);
> while (1) {
> - ret = snapshot_read_next(snapshot);
> - if (ret <= 0)
> + for (ul = 0; ul < LZO_UNC_SIZE; ul += PAGE_SIZE) {
> + ret = snapshot_read_next(snapshot);
> + if (ret < 0)
> + goto out_finish;
> +
> + if (ret == 0)
> + break;
> +
> + memcpy(unc + ul, data_of(*snapshot), PAGE_SIZE);
> +
> + if (!(nr_pages % m))
> + printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
> + nr_pages++;
> + }
> +
> + if (ul == 0)
> + break;
> +
> + ret = lzo1x_1_compress(unc, ul, cmp + LZO_HEADER, &cl, wrk);
> + if (ret < 0) {
> + printk(KERN_ERR "PM: LZO compression failed\n");
> break;
> - ret = swap_write_page(handle, data_of(*snapshot), &bio);
> - if (ret)
> + }
> +
> + if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
> + printk(KERN_ERR "PM: Invalid LZO length\n");
> + ret = -1;
> break;
> - if (!(nr_pages % m))
> - printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
> - nr_pages++;
> + }
> +
> + *(size_t *)cmp = cl;
> +
> + for (ul = 0; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
> + memcpy(page, cmp + ul, PAGE_SIZE);
> +
> + ret = swap_write_page(handle, page, &bio);
> + if (ret)
> + goto out_finish;
> + }
> }
> +
> +out_finish:
> err2 = hib_wait_on_bio_chain(&bio);
> do_gettimeofday(&stop);
> if (!ret)
> @@ -401,6 +473,12 @@ static int save_image(struct swap_map_handle *handle,
> else
> printk(KERN_CONT "\n");
> swsusp_show_speed(&start, &stop, nr_to_write, "Wrote");
> +
> + vfree(cmp);
> + vfree(unc);
> + vfree(wrk);
> + free_page((unsigned long)page);
> +
> return ret;
> }
>
> @@ -416,7 +494,8 @@ static int enough_swap(unsigned int nr_pages)
> unsigned int free_swap = count_swap_pages(root_swap, 1);
>
> pr_debug("PM: Free swap pages: %u\n", free_swap);
> - return free_swap > nr_pages + PAGES_FOR_IO;
> + return free_swap >
> + (nr_pages * LZO_CMP_PAGES) / LZO_UNC_PAGES + PAGES_FOR_IO;
> }
>
> /**
> @@ -547,9 +626,30 @@ static int load_image(struct swap_map_handle *handle,
> int error = 0;
> struct timeval start;
> struct timeval stop;
> - struct bio *bio;
> - int err2;
> unsigned nr_pages;
> + size_t ul, cl;
> + unsigned char *unc, *cmp, *page;
> +
> + page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
> + if (!page) {
> + printk(KERN_ERR "PM: Failed to allocate LZO page\n");
> + return -ENOMEM;
> + }
> +
> + unc = vmalloc(LZO_UNC_SIZE);
> + if (!unc) {
> + printk(KERN_ERR "PM: Failed to allocate LZO uncompressed\n");
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
> +
> + cmp = vmalloc(LZO_CMP_SIZE);
> + if (!cmp) {
> + printk(KERN_ERR "PM: Failed to allocate LZO compressed\n");
> + vfree(unc);
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
>
> printk(KERN_INFO "PM: Loading image data pages (%u pages) ... ",
> nr_to_read);
> @@ -557,27 +657,60 @@ static int load_image(struct swap_map_handle *handle,
> if (!m)
> m = 1;
> nr_pages = 0;
> - bio = NULL;
> do_gettimeofday(&start);
> +
> + error = snapshot_write_next(snapshot);
> + if (error <= 0)
> + goto out_finish;
> +
> for ( ; ; ) {
> - error = snapshot_write_next(snapshot);
> - if (error <= 0)
> - break;
> - error = swap_read_page(handle, data_of(*snapshot), &bio);
> + error = swap_read_page(handle, page, NULL); /* sync */
> if (error)
> break;
> - if (snapshot->sync_read)
> - error = hib_wait_on_bio_chain(&bio);
> - if (error)
> + memcpy(cmp, page, PAGE_SIZE);
> +
> + cl = *(size_t *)cmp;
> + if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
> + printk(KERN_ERR "PM: Invalid LZO length\n");
> + error = -1;
> + break;
> + }
> +
> + for (ul = PAGE_SIZE; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
> + error = swap_read_page(handle, page, NULL); /* sync */
> + if (error)
> + goto out_finish;
> + memcpy(cmp + ul, page, PAGE_SIZE);
> + }
> +
> + ul = LZO_UNC_SIZE;
> + error = lzo1x_decompress_safe(cmp + LZO_HEADER, cl, unc, &ul);
> + if (error < 0) {
> + printk(KERN_ERR "PM: LZO decompression failed\n");
> break;
> - if (!(nr_pages % m))
> - printk("\b\b\b\b%3d%%", nr_pages / m);
> - nr_pages++;
> + }
> +
> + if (unlikely(ul == 0 || ul > LZO_UNC_SIZE)) {
> + printk(KERN_ERR "PM: Invalid LZO length\n");
> + error = -1;
> + break;
> + }
> +
> + for (cl = 0; cl < ul; cl += PAGE_SIZE) {
> + memcpy(data_of(*snapshot), unc + cl, PAGE_SIZE);
> +
> + if (!(nr_pages % m))
> + printk("\b\b\b\b%3d%%", nr_pages / m);
> + nr_pages++;
> +
> + error = snapshot_write_next(snapshot);
> + if (error <= 0)
> + goto out_finish;
> + }
> }
> - err2 = hib_wait_on_bio_chain(&bio);
> +
> +out_finish:
> do_gettimeofday(&stop);
> - if (!error)
> - error = err2;
> if (!error) {
> printk("\b\b\b\bdone\n");
> snapshot_write_finalize(snapshot);
> @@ -586,6 +719,11 @@ static int load_image(struct swap_map_handle *handle,
> } else
> printk("\n");
> swsusp_show_speed(&start, &stop, nr_to_read, "Read");
> +
> + vfree(cmp);
> + vfree(unc);
> + free_page((unsigned long)page);
> +
> return error;
> }
>
>
> --
> Bojan
>

2010-08-02 00:54:18

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 2010-08-02 at 09:17 +0900, KAMEZAWA Hiroyuki wrote:
> Now, vmalloc() is used here. Then, the following will happen.
>
> 1. vmalloc()
> -> vmalloc adds vmap objects and set page table entries.
>
> 2. saving image
> -> At taking snapshot of memory to the disk, above vmalloc() area
> is
> saved to disk as it is.
> ...
> 3. At restore
> Because you don't remember which vmalloc() area was used for creating
> the snapshot, you can't free it at swsusp_free().
>
> memory leak ?

To be honest, I'm not sure.

However, I thought that by the time save_image() is called, snapshot has
already been taken, no?
------------------
error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
if (error)
goto Thaw;

if (in_suspend) {
unsigned int flags = 0;

if (hibernation_mode == HIBERNATION_PLATFORM)
flags |= SF_PLATFORM_MODE;
pr_debug("PM: writing image.\n");
error = swsusp_write(flags); <--- this calls save_image()
------------------

So, me thinks that these allocations will not be in the snapshot image.

PS. Take everything I say with a grain (or two) of salt. I'm just a
regular Linux user trying to make my Fedora hibernate/thaw process suck
less.

--
Bojan

2010-08-02 01:15:48

by Kamezawa Hiroyuki

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 02 Aug 2010 10:54:13 +1000
Bojan Smojver <[email protected]> wrote:

> On Mon, 2010-08-02 at 09:17 +0900, KAMEZAWA Hiroyuki wrote:
> > Now, vmalloc() is used here. Then, the following will happen.
> >
> > 1. vmalloc()
> > -> vmalloc adds vmap objects and set page table entries.
> >
> > 2. saving image
> > -> At taking snapshot of memory to the disk, above vmalloc() area
> > is
> > saved to disk as it is.
> > ...
> > 3. At restore
> > Because you don't remember which vmalloc() area was used for creating
> > the snapshot, you can't free it at swsusp_free().
> >
> > memory leak ?
>
> To be honest, I'm not sure.
>
> However, I thought that by the time save_image() is called, snapshot has
> already been taken, no?
> ------------------
> error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
> if (error)
> goto Thaw;
>
> if (in_suspend) {
> unsigned int flags = 0;
>
> if (hibernation_mode == HIBERNATION_PLATFORM)
> flags |= SF_PLATFORM_MODE;
> pr_debug("PM: writing image.\n");
> error = swsusp_write(flags); <--- this calls save_image()
> ------------------
>
> So, me thinks that these allocations will not be in the snapshot image.
>
I'm very much a newbie to snapshot code... (I'm now studying it because I got
a report that my patch corrupts it.) So, don't trust my words.

Looking into swsusp_write().
==
swsusp_write()
-> save_image()
->
while () {
snapshot_read_next()
swap_write_page()
}
==
This routine writes a buffer obtained from snapshot_read_next() to the disk.

Then, what snapshot_read_next() passes is:
==
} else {
struct page *page;

page = pfn_to_page(memory_bm_next_pfn(&copy_bm));
if (PageHighMem(page)) {
/* Highmem pages are copied to the buffer,
* because we can't return with a kmapped
* highmem page (we may not be called again).
*/
void *kaddr;

kaddr = kmap_atomic(page, KM_USER0);
memcpy(buffer, kaddr, PAGE_SIZE);
kunmap_atomic(kaddr, KM_USER0);
handle->buffer = buffer;
} else {
handle->buffer = page_address(page);
}
}
==
The physical memory address of a page to be saved.

So, I thought the "system memory image" itself is not a snapshot; it's changing
while it runs. The reason swsusp can avoid a memory leak is that it records
which pages should be freed after resume in a bitmap, which will be saved to
the image header(?) And, even if this snapshot saves the image of the buddy
allocator, the save routine itself uses a fixed buffer which can be freed
after restore.

Thanks,
-Kame

2010-08-02 01:21:12

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 2010-08-02 at 10:10 +0900, KAMEZAWA Hiroyuki wrote:
> Why swsusp can avoid memory leak is that it records which
> pages should be freed after resume in the bitmap, which will be saved
> to image header(?)

Right. So, are you saying that all allocations in save_image() should be
done using __get_free_page() or __get_free_pages() and not with
vmalloc()?

--
Bojan

2010-08-02 01:32:41

by Kamezawa Hiroyuki

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 02 Aug 2010 11:21:08 +1000
Bojan Smojver <[email protected]> wrote:

> On Mon, 2010-08-02 at 10:10 +0900, KAMEZAWA Hiroyuki wrote:
> > Why swsusp can avoid memory leak is that it records which
> > pages should be freed after resume in the bitmap, which will be saved
> > to image header(?)
>
> Right. So, are you saying that all allocations in save_image() should be
> done using __get_free_page() or __get_free_pages() and not with
> vmalloc()?
>
I'm not saying that, but consideration of the following is required.
(And it's good to write "we're safe because..." as a comment)

1. Information about pointers used for vmalloc is saved into the image.
2. Information from (1) is properly recovered after resume and we can free it.
3. No more allocation will happen once we start writing to the disk.

Then, the vmalloc() area's own information will be saved as

"this vmalloc area is used"

and, at resume, recovered as

"this vmalloc area is used"

Then, you can free it because you remember the pointers.

Then, you should make

@@ -372,6 +380,38 @@ static int save_image(struct swap_map_handle *handle,
struct bio *bio;
struct timeval start;
struct timeval stop;
+ size_t ul, cl;
+ unsigned char *unc, *cmp, *wrk, *page;

as global variables. Because global variables will be saved as they are,
you can find them after resume and free the used vmalloc() buffers.

Maybe freeing it at swsusp_free() will be clean.


Thanks,
-Kame

2010-08-02 01:43:07

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 2010-08-02 at 10:27 +0900, KAMEZAWA Hiroyuki wrote:
> Then, you can free it because you remember pointers.

OK, get it. Will rework.

Thank you for your input.

--
Bojan

2010-08-03 01:59:14

by Bojan Smojver

Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Mon, 2010-08-02 at 11:43 +1000, Bojan Smojver wrote:
> OK, get it. Will rework.

What I did today was:

1. Introduced a global variable in swap.c called swsusp_lzo_buffers. It
was declared in kernel/power/power.h as:
---------------------
extern void *swsusp_lzo_buffers;
---------------------

2. Allocation in save_image() then went like this:
---------------------
swsusp_lzo_buffers = vmalloc(LZO_WRK_SIZE + LZO_UNC_SIZE + LZO_OVH_SIZE);
if (!swsusp_lzo_buffers) {
printk(KERN_ERR "PM: Failed to allocate LZO buffers\n");
free_page((unsigned long)page);
return -ENOMEM;
}

wrk = swsusp_lzo_buffers;
buf = swsusp_lzo_buffers + LZO_WRK_SIZE;
---------------------

3. Deallocation in save_image() had (this is after everything has been
written to disk):
---------------------
vfree(swsusp_lzo_buffers);
swsusp_lzo_buffers = NULL;
---------------------

4. swsusp_free() had (note the memset(), which would crash the kernel if the
memory was already freed but the pointer was not NULL):
---------------------
printk(KERN_ERR "In swsusp_free().\n");
if (swsusp_lzo_buffers) {
	printk(KERN_ERR "Freeing vmalloc() buffers.\n");
	memset(swsusp_lzo_buffers, 0, 80 * PAGE_SIZE);
	vfree(swsusp_lzo_buffers);
}
---------------------

From all this, I only got "In swsusp_free()" printed on resume. So, it
seems that save_image() does indeed free those vmalloc()-ed buffers and
they are not saved in the image.

I even put this in hibernate.c:
---------------------
/* Restore control flow magically appears here */
restore_processor_state();
if (!in_suspend)
	platform_leave(platform_mode);

printk(KERN_ERR "Resumed, checking swsusp_lzo_buffers.\n");
if (swsusp_lzo_buffers) {
	printk(KERN_ERR "Setting vmalloc() buffers.\n");
	memset(swsusp_lzo_buffers, 0, 80 * PAGE_SIZE);
}
---------------------

This printed just "Resumed, checking swsusp_lzo_buffers.", meaning it
was already set to NULL.

Any further comments on this? Nigel, what do you reckon?

PS. I also enhanced the patch to use overlapping compression in order to
save memory. Looks like that's causing it to be slower on compression
(we go down from 130 - 150 MB/s to around 105 - 110 MB/s), but still
over 3 times faster than regular swsusp code. Decompression remains
roughly the same around 100+ MB/s (this is double the speed of current
swsusp code). I will post this a bit later on.

--
Bojan

2010-08-03 02:30:45

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Tue, 2010-08-03 at 11:59 +1000, Bojan Smojver wrote:
> PS. I also enhanced the patch to use overlapping compression in order
> to save memory. Looks like that's causing it to be slower on
> compression (we go down from 130 - 150 MB/s to around 105 - 110 MB/s),
> but still over 3 times faster than regular swsusp code. Decompression
> remains roughly the same around 100+ MB/s (this is double the speed of
> current swsusp code). I will post this a bit later on.

As promised.

diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index ca6066a..cb57eb9 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -137,6 +137,8 @@ config SUSPEND_FREEZER
config HIBERNATION
bool "Hibernation (aka 'suspend to disk')"
depends on PM && SWAP && ARCH_HIBERNATION_POSSIBLE
+ select LZO_COMPRESS
+ select LZO_DECOMPRESS
select SUSPEND_NVS if HAS_IOMEM
---help---
Enable the suspend to disk (STD) functionality, which is usually
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 006270f..a760cf8 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -103,10 +103,6 @@ struct snapshot_handle {
void *buffer; /* address of the block to read from
* or write to
*/
- int sync_read; /* Set to one to notify the caller of
- * snapshot_write_next() that it may
- * need to call wait_on_bio_chain()
- */
};

/* This macro returns the address from/to which the caller of
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 25ce010..f24ee24 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -2135,8 +2135,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages)
return 0;

- handle->sync_read = 1;
-
if (!handle->cur) {
if (!buffer)
/* This makes the buffer be freed by swsusp_free() */
@@ -2169,7 +2167,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
memory_bm_position_reset(&orig_bm);
restore_pblist = NULL;
handle->buffer = get_buffer(&orig_bm, &ca);
- handle->sync_read = 0;
if (IS_ERR(handle->buffer))
return PTR_ERR(handle->buffer);
}
@@ -2178,8 +2175,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
handle->buffer = get_buffer(&orig_bm, &ca);
if (IS_ERR(handle->buffer))
return PTR_ERR(handle->buffer);
- if (handle->buffer != buffer)
- handle->sync_read = 0;
}
handle->cur++;
return PAGE_SIZE;
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index b0bb217..5a83fd3 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -24,6 +24,7 @@
#include <linux/swapops.h>
#include <linux/pm.h>
#include <linux/slab.h>
+#include <linux/lzo.h>

#include "power.h"

@@ -357,6 +358,18 @@ static int swap_writer_finish(struct swap_map_handle *handle,
return error;
}

+#define LZO_HEADER sizeof(size_t)
+#define LZO_UNC_PAGES 64
+#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
+#define LZO_CMP_PAGES DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
+ LZO_HEADER, PAGE_SIZE)
+#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
+#define LZO_OVH_PAGES DIV_ROUND_UP((LZO_UNC_SIZE > 0xBFFF ? \
+ 0xBFFF : LZO_UNC_SIZE) + \
+ LZO_UNC_SIZE / 16 + 64 + 3 + \
+ LZO_HEADER, PAGE_SIZE)
+#define LZO_OVH_SIZE (LZO_OVH_PAGES * PAGE_SIZE)
+
/**
* save_image - save the suspend image data
*/
@@ -372,6 +385,29 @@ static int save_image(struct swap_map_handle *handle,
struct bio *bio;
struct timeval start;
struct timeval stop;
+ size_t ul, cl;
+ unsigned char *buf, *wrk, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ wrk = vmalloc(LZO1X_1_MEM_COMPRESS);
+ if (!wrk) {
+ printk(KERN_ERR "PM: Failed to allocate LZO workspace\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }
+
+ buf = vmalloc(LZO_UNC_SIZE + LZO_OVH_SIZE);
+ if (!buf) {
+ printk(KERN_ERR "PM: Failed to allocate LZO buffer\n");
+ vfree(wrk);
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Saving image data pages (%u pages) ... ",
nr_to_write);
@@ -382,16 +418,50 @@ static int save_image(struct swap_map_handle *handle,
bio = NULL;
do_gettimeofday(&start);
while (1) {
- ret = snapshot_read_next(snapshot);
- if (ret <= 0)
+ for (ul = 0; ul < LZO_UNC_SIZE; ul += PAGE_SIZE) {
+ ret = snapshot_read_next(snapshot);
+ if (ret < 0)
+ goto out_finish;
+
+ if (ret == 0)
+ break;
+
+ memcpy(buf + LZO_OVH_SIZE + ul,
+ data_of(*snapshot), PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+ }
+
+ if (ul == 0)
+ break;
+
+ ret = lzo1x_1_compress(buf + LZO_OVH_SIZE, ul,
+ buf + LZO_HEADER, &cl, wrk);
+ if (ret < 0) {
+ printk(KERN_ERR "PM: LZO compression failed\n");
break;
- ret = swap_write_page(handle, data_of(*snapshot), &bio);
- if (ret)
+ }
+
+ if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ ret = -1;
break;
- if (!(nr_pages % m))
- printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ *(size_t *)buf = cl;
+
+ for (ul = 0; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
+ memcpy(page, buf + ul, PAGE_SIZE);
+
+ ret = swap_write_page(handle, page, &bio);
+ if (ret)
+ goto out_finish;
+ }
}
+
+out_finish:
err2 = hib_wait_on_bio_chain(&bio);
do_gettimeofday(&stop);
if (!ret)
@@ -401,6 +471,11 @@ static int save_image(struct swap_map_handle *handle,
else
printk(KERN_CONT "\n");
swsusp_show_speed(&start, &stop, nr_to_write, "Wrote");
+
+ vfree(buf);
+ vfree(wrk);
+ free_page((unsigned long)page);
+
return ret;
}

@@ -416,7 +491,8 @@ static int enough_swap(unsigned int nr_pages)
unsigned int free_swap = count_swap_pages(root_swap, 1);

pr_debug("PM: Free swap pages: %u\n", free_swap);
- return free_swap > nr_pages + PAGES_FOR_IO;
+ return free_swap >
+ (nr_pages * LZO_CMP_PAGES) / LZO_UNC_PAGES + 1 + PAGES_FOR_IO;
}

/**
@@ -547,9 +623,22 @@ static int load_image(struct swap_map_handle *handle,
int error = 0;
struct timeval start;
struct timeval stop;
- struct bio *bio;
- int err2;
unsigned nr_pages;
+ size_t ul, cl, off;
+ unsigned char *buf, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ buf = vmalloc(LZO_CMP_SIZE);
+ if (!buf) {
+ printk(KERN_ERR "PM: Failed to allocate LZO buffer\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Loading image data pages (%u pages) ... ",
nr_to_read);
@@ -557,27 +646,63 @@ static int load_image(struct swap_map_handle *handle,
if (!m)
m = 1;
nr_pages = 0;
- bio = NULL;
do_gettimeofday(&start);
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+
for ( ; ; ) {
- error = snapshot_write_next(snapshot);
- if (error <= 0)
- break;
- error = swap_read_page(handle, data_of(*snapshot), &bio);
+ error = swap_read_page(handle, page, NULL); /* sync */
if (error)
break;
- if (snapshot->sync_read)
- error = hib_wait_on_bio_chain(&bio);
- if (error)
+
+ cl = *(size_t *)page;
+ if (unlikely(cl == 0 || LZO_HEADER + cl > LZO_CMP_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
break;
- if (!(nr_pages % m))
- printk("\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ off = LZO_CMP_SIZE - roundup(cl + LZO_HEADER, PAGE_SIZE);
+ memcpy(buf + off, page, PAGE_SIZE);
+
+ for (ul = PAGE_SIZE; ul < LZO_HEADER + cl; ul += PAGE_SIZE) {
+ error = swap_read_page(handle, page, NULL); /* sync */
+ if (error)
+ goto out_finish;
+ memcpy(buf + off + ul, page, PAGE_SIZE);
+ }
+
+ ul = LZO_UNC_SIZE;
+ error = lzo1x_decompress_safe(buf + off + LZO_HEADER, cl,
+ buf, &ul);
+ if (error < 0) {
+ printk(KERN_ERR "PM: LZO decompression failed\n");
+ break;
+ }
+
+ if (unlikely(ul == 0 || ul > LZO_UNC_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
+ break;
+ }
+
+ for (cl = 0; cl < ul; cl += PAGE_SIZE) {
+ memcpy(data_of(*snapshot), buf + cl, PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk("\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+ }
}
- err2 = hib_wait_on_bio_chain(&bio);
+
+out_finish:
do_gettimeofday(&stop);
- if (!error)
- error = err2;
if (!error) {
printk("\b\b\b\bdone\n");
snapshot_write_finalize(snapshot);
@@ -586,6 +711,10 @@ static int load_image(struct swap_map_handle *handle,
} else
printk("\n");
swsusp_show_speed(&start, &stop, nr_to_read, "Read");
+
+ vfree(buf);
+ free_page((unsigned long)page);
+
return error;
}


--
Bojan

2010-08-03 06:34:08

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Tue, 2010-08-03 at 11:59 +1000, Bojan Smojver wrote:
> I also enhanced the patch to use overlapping compression in order to
> save memory. Looks like that's causing it to be slower on compression
> (we go down from 130 - 150 MB/s to around 105 - 110 MB/s), but still
> over 3 times faster than regular swsusp code.

Actually, that is a lie. Speed is not related to overlapping
compression/decompression, but to what is put in /sys/power/image_size.

--
Bojan

2010-08-04 01:50:46

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

Sorry for the delay in replying.

On 03/08/10 11:59, Bojan Smojver wrote:
> From all this, I only got "In swsusp_free()" printed on resume. So, it
> seems that save_image() does indeed free those vmalloc()-ed buffers and
> they are not saved in the image.
>
> I even put this in hibernate.c:
> ---------------------
> /* Restore control flow magically appears here */
> restore_processor_state();
> if (!in_suspend)
> platform_leave(platform_mode);
>
> printk(KERN_ERR "Resumed, checking swsusp_lzo_buffers.\n");
> if (swsusp_lzo_buffers) {
> printk (KERN_ERR "Setting vmalloc() buffers.\n");
> memset(swsusp_lzo_buffers, 0, 80 * PAGE_SIZE);
> }
> ---------------------
>
> This printed just "Resumed, checking swsusp_lzo_buffers.", meaning it
> was already set to NULL.
>
> Any further comments on this? Nigel, what do you reckon?

I don't see what all the fuss was about. save_image is called after the
snapshot is made (hibernate called hibernation_snapshot to create the
image, then swsusp_write which in turn calls save_image), so there's no
possibility of the memory allocated by it being included in the image or
of a memory leak occurring as a result.

Regards,

Nigel

2010-08-04 01:58:45

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 11:50 +1000, Nigel Cunningham wrote:
> I don't see what all the fuss was about. save_image is called after
> the snapshot is made (hibernate called hibernation_snapshot to create
> the image, then swsusp_write which in turn calls save_image), so
> there's no possibility of the memory allocated by it being included
> in the image or of a memory leak occurring as a result.

OK, thanks for your input.

I think what was in question was whether the snapshot can change after
it was made. I was under the impression that it cannot (hence it is
called snapshot - but don't believe me - I have no idea what I'm talking
about). Kame seems to think that it can, so all routines that are
allocating memory after that point should make sure that pages are
marked to be freed upon resume or explicitly freed on resume by
remembering pointers into global variables (if I understood correctly).

But, if you say that the snapshot is just that (i.e. static after being
made), that's that then.

PS. I'm polishing the patch a bit in terms of overlapping/in-place
compression/decompression (saves memory). Hopefully I'll have a final
version to post today.

--
Bojan

2010-08-04 02:07:42

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 04 Aug 2010 11:50:41 +1000
Nigel Cunningham <[email protected]> wrote:

> Hi.
>
> Sorry for the delay in replying.
>
> On 03/08/10 11:59, Bojan Smojver wrote:
> > From all this, I only got "In swsusp_free()" printed on resume. So, it
> > seems that save_image() does indeed free those vmalloc()-ed buffers and
> > they are not saved in the image.
> >
> > I even put this in hibernate.c:
> > ---------------------
> > /* Restore control flow magically appears here */
> > restore_processor_state();
> > if (!in_suspend)
> > platform_leave(platform_mode);
> >
> > printk(KERN_ERR "Resumed, checking swsusp_lzo_buffers.\n");
> > if (swsusp_lzo_buffers) {
> > printk (KERN_ERR "Setting vmalloc() buffers.\n");
> > memset(swsusp_lzo_buffers, 0, 80 * PAGE_SIZE);
> > }
> > ---------------------
> >
> > This printed just "Resumed, checking swsusp_lzo_buffers.", meaning it
> > was already set to NULL.
> >
> > Any further comments on this? Nigel, what do you reckon?
>
> I don't see what all the fuss was about. save_image is called after the
> snapshot is made (hibernate called hibernation_snapshot to create the
> image, then swsusp_write which in turn calls save_image), so there's no
> possibility of the memory allocated by it being included in the image or
> of a memory leak occurring as a result.
>

I'm sorry if I'm wrong. Here is the logic in my mind. (I'm reading hibernate(void),
not sure about user.c.)
(Line numbers are from 2.6.34, please ignore them.)

624 error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
625 if (error)
626 goto Thaw;
627
628 if (in_suspend) {
629 unsigned int flags = 0;
630
631 if (hibernation_mode == HIBERNATION_PLATFORM)
632 flags |= SF_PLATFORM_MODE;
633 pr_debug("PM: writing image.\n");
634 error = swsusp_write(flags);

Then, swsusp_write() actually writes the image to disk. Right?

It finally calls save_image(). save_image() executes following loop.

433 while (1) {
434 ret = snapshot_read_next(snapshot, PAGE_SIZE);
435 if (ret <= 0)
436 break;
437 ret = swap_write_page(handle, data_of(*snapshot), &bio);
438 if (ret)
439 break;
440 if (!(nr_pages % m))
441 printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
442 nr_pages++;
443 }

OK, so the image written to disk is obtained via snapshot_read_next().
Then, what does it do?

==
1650 page = pfn_to_page(memory_bm_next_pfn(&copy_bm));
1651 if (PageHighMem(page)) {
1652 /* Highmem pages are copied to the buffer,
1653 * because we can't return with a kmapped
1654 * highmem page (we may not be called again).
1655 */
1656 void *kaddr;
1657
1658 kaddr = kmap_atomic(page, KM_USER0);
1659 memcpy(buffer, kaddr, PAGE_SIZE);
1660 kunmap_atomic(kaddr, KM_USER0);
1661 handle->buffer = buffer;
1662 } else {
1663 handle->buffer = page_address(page);
1664 }
==

OK, so what is actually written to disk is an image of memory as of when
save_image() is called.

What happens at vmalloc()?
1. Page tables for vmalloc() are set up.
2. A vm_struct object is allocated.
3. A vmap area is allocated.

All three of the above states are saved into the disk image as "used".

Then, after resume, all vmalloc() area is resumed as "allocated".

Wrong ?

Thanks,
-Kame

2010-08-04 02:14:22

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 11:02 +0900, KAMEZAWA Hiroyuki wrote:
> Then, after resume, all vmalloc() area is resumed as "allocated".
>
> Wrong ?

I actually tried remembering vmalloc() returned pointers into a global
variable as you suggested. On resume, they were always set to NULL,
which would suggest that what has gotten into the image was the state
before vmalloc() was called in save_image(). See:
http://lkml.org/lkml/2010/8/2/537.

Anyone else wants to comment here?

--
Bojan

2010-08-04 02:23:22

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 04 Aug 2010 12:14:19 +1000
Bojan Smojver <[email protected]> wrote:

> On Wed, 2010-08-04 at 11:02 +0900, KAMEZAWA Hiroyuki wrote:
> > Then, after resume, all vmalloc() area is resumed as "allocated".
> >
> > Wrong ?
>
> I actually tried remembering vmalloc() returned pointers into a global
> variable as you suggested. On resume, they were always set to NULL,
> which would suggest that what has gotten into the image was the state
> before vmalloc() was called in save_image(). See:
> http://lkml.org/lkml/2010/8/2/537.
>
> Anyone else wants to comment here?
>
Hmm, OK. Let's see the result.

The reason I mentioned the race is that my patch corrupts the saved image
by changing swap_map[] status and the swap-cache radix-tree during save_image().

Maybe I don't understand something important.

Thanks,
-Kame

2010-08-04 02:25:01

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

On 04/08/10 12:02, KAMEZAWA Hiroyuki wrote:
> On Wed, 04 Aug 2010 11:50:41 +1000
> Nigel Cunningham<[email protected]> wrote:
>
>> Hi.
>>
>> Sorry for the delay in replying.
>>
>> On 03/08/10 11:59, Bojan Smojver wrote:
>>> From all this, I only got "In swsusp_free()" printed on resume. So, it
>>> seems that save_image() does indeed free those vmalloc()-ed buffers and
>>> they are not saved in the image.
>>>
>>> I even put this in hibernate.c:
>>> ---------------------
>>> /* Restore control flow magically appears here */
>>> restore_processor_state();
>>> if (!in_suspend)
>>> platform_leave(platform_mode);
>>>
>>> printk(KERN_ERR "Resumed, checking swsusp_lzo_buffers.\n");
>>> if (swsusp_lzo_buffers) {
>>> printk (KERN_ERR "Setting vmalloc() buffers.\n");
>>> memset(swsusp_lzo_buffers, 0, 80 * PAGE_SIZE);
>>> }
>>> ---------------------
>>>
>>> This printed just "Resumed, checking swsusp_lzo_buffers.", meaning it
>>> was already set to NULL.
>>>
>>> Any further comments on this? Nigel, what do you reckon?
>>
>> I don't see what all the fuss was about. save_image is called after the
>> snapshot is made (hibernate called hibernation_snapshot to create the
>> image, then swsusp_write which in turn calls save_image), so there's no
>> possibility of the memory allocated by it being included in the image or
>> of a memory leak occurring as a result.
>>
>
> I'm sorry if I'm wrong. Here is the logic in my mind. (I'm reading hibernate(void),
> not sure about user.c.)
> (Line numbers are from 2.6.34, please ignore them.)

Yeah, I'm ignoring uswsusp too. That shouldn't matter.

> 624 error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
> 625 if (error)
> 626 goto Thaw;
> 627
> 628 if (in_suspend) {
> 629 unsigned int flags = 0;
> 630
> 631 if (hibernation_mode == HIBERNATION_PLATFORM)
> 632 flags |= SF_PLATFORM_MODE;
> 633 pr_debug("PM: writing image.\n");
> 634 error = swsusp_write(flags);
>
> Then, swsusp_write() actually writes the image to disk. Right?

Yes

> It finally calls save_image(). save_image() executes following loop.
>
> 433 while (1) {
> 434 ret = snapshot_read_next(snapshot, PAGE_SIZE);
> 435 if (ret<= 0)
> 436 break;
> 437 ret = swap_write_page(handle, data_of(*snapshot),&bio);
> 438 if (ret)
> 439 break;
> 440 if (!(nr_pages % m))
> 441 printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
> 442 nr_pages++;
> 443 }
>
> OK, so the image written to disk is obtained via snapshot_read_next().
> Then, what does it do?
>
> ==
> 1650 page = pfn_to_page(memory_bm_next_pfn(&copy_bm));
> 1651 if (PageHighMem(page)) {
> 1652 /* Highmem pages are copied to the buffer,
> 1653 * because we can't return with a kmapped
> 1654 * highmem page (we may not be called again).
> 1655 */
> 1656 void *kaddr;
> 1657
> 1658 kaddr = kmap_atomic(page, KM_USER0);
> 1659 memcpy(buffer, kaddr, PAGE_SIZE);
> 1660 kunmap_atomic(kaddr, KM_USER0);
> 1661 handle->buffer = buffer;
> 1662 } else {
> 1663 handle->buffer = page_address(page);
> 1664 }
> ==
>
> OK, so what is actually written to disk is an image of memory as of when
> save_image() is called.
>
> What happens at vmalloc()?
> 1. Page tables for vmalloc() are set up.
> 2. A vm_struct object is allocated.
> 3. A vmap area is allocated.
>
> All three of the above states are saved into the disk image as "used".
>
> Then, after resume, all vmalloc() area is resumed as "allocated".
>
> Wrong ?

Yes, because what's being written is the snapshot that was created in
hibernation_snapshot. Any memory you allocate afterwards is irrelevant
because it's not part of that snapshot that was made earlier and is now
being written to disk. Think of the point where hibernation_snapshot is
called as being like taking a photo, and this later part as like
printing the photo. Once you've taken the photo, people in the photo can
move around without making any difference to the photo you've taken. So
here. Our vmallocs and so on after the snapshot don't affect it.

Regards,

Nigel

2010-08-04 02:29:37

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 04 Aug 2010 12:24:57 +1000
Nigel Cunningham <[email protected]> wrote:

> Yes, because what's being written is the snapshot that was created in
> hibernation_snapshot. Any memory you allocate afterwards is irrelevant
> because it's not part of that snapshot that was made earlier and is now
> being written to disk. Think of the point where hibernation_snapshot is
> called as being like taking a photo, and this later part as like
> printing the photo. Once you've taken the photo, people in the photo can
> move around without making any difference to the photo you've taken. So
> here. Our vmallocs and so on after the snapshot don't affect it.
>

I see. I misunderstood swsusp_save().
Sorry for all the noise.

Thanks,
-Kame

2010-08-04 02:37:23

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

On 04/08/10 12:18, KAMEZAWA Hiroyuki wrote:
> On Wed, 04 Aug 2010 12:14:19 +1000
> Bojan Smojver<[email protected]> wrote:
>
>> On Wed, 2010-08-04 at 11:02 +0900, KAMEZAWA Hiroyuki wrote:
>>> Then, after resume, all vmalloc() area is resumed as "allocated".
>>>
>>> Wrong ?
>>
>> I actually tried remembering vmalloc() returned pointers into a global
>> variable as you suggested. On resume, they were always set to NULL,
>> which would suggest that what has gotten into the image was the state
>> before vmalloc() was called in save_image(). See:
>> http://lkml.org/lkml/2010/8/2/537.
>>
>> Anyone else wants to comment here?
>>
> Hmm, OK. Let's see the result.
>
> The reason I mentioned the race is that my patch corrupts the saved image
> by changing swap_map[] status and the swap-cache radix-tree during save_image().
>
> Maybe I don't understand something important.

That's a different issue.

Remember that the snapshot includes more than just the running programs.
It includes structs recording filesystem info and the state of swap.
This is why we say you can't safely hibernate, use your filesystem from
another kernel or OS, then resume. The use of the filesystem in another
kernel/OS makes the state on disk inconsistent with the state in memory
that we saved in our image. (I'm assuming it's written to or at least
that the journal is replayed).

I'm not 100% sure, but it sounds like your issue is the same, but with
swap. If you free a swap page post-snapshot and it gets used for (say)
saving a page of the image, then you have a problem post-resume. The
resumed kernel will think the swap state is still as it originally was
and might try to swap back in the page of memory that was freed and used
for the snapshot, creating in-memory corruption.

One solution is to allocate the swap for the image before the snapshot.
This is what TuxOnIce does - it freezes processes, calculates the image
statistics and uses them to allocate storage and free memory as
necessary, writes the first part of the image then does the snapshot and
writes the remainder. By doing things in this order, the only unknown is
the amount of memory needed for drivers, and that can be handled pretty
easily.

Regards,

Nigel

2010-08-04 02:38:43

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

On 04/08/10 12:24, KAMEZAWA Hiroyuki wrote:
> On Wed, 04 Aug 2010 12:24:57 +1000
> Nigel Cunningham<[email protected]> wrote:
>
>> Yes, because what's being written is the snapshot that was created in
>> hibernation_snapshot. Any memory you allocate afterwards is irrelevant
>> because it's not part of that snapshot that was made earlier and is now
>> being written to disk. Think of the point where hibernation_snapshot is
>> called as being like taking a photo, and this later part as like
>> printing the photo. Once you've taken the photo, people in the photo can
>> move around without making any difference to the photo you've taken. So
>> here. Our vmallocs and so on after the snapshot don't affect it.
>>
>
> I see. I misunderstood swsusp_save().
> Sorry for all the noise.

No problem at all! You learn how it works and I get to make sure I do
understand how it works! :)

Nigel

2010-08-04 02:43:00

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi again.

On 03/08/10 12:30, Bojan Smojver wrote:
> On Tue, 2010-08-03 at 11:59 +1000, Bojan Smojver wrote:
>> PS. I also enhanced the patch to use overlapping compression in order
>> to save memory. Looks like that's causing it to be slower on
>> compression (we go down from 130 - 150 MB/s to around 105 - 110 MB/s),
>> but still over 3 times faster than regular swsusp code. Decompression
>> remains roughly the same around 100+ MB/s (this is double the speed of
>> current swsusp code). I will post this a bit later on.
>
> As promised.

Round here, people generally like to see improvements split up into
small patches that change just one thing. I'd therefore suggest
splitting the removal of sync_read from the rest of the patch. Is it
standalone? I'm not seeing the relationship between the two parts at the
moment.

> diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
> index ca6066a..cb57eb9 100644
> --- a/kernel/power/Kconfig
> +++ b/kernel/power/Kconfig
> @@ -137,6 +137,8 @@ config SUSPEND_FREEZER
> config HIBERNATION
> bool "Hibernation (aka 'suspend to disk')"
> depends on PM&& SWAP&& ARCH_HIBERNATION_POSSIBLE
> + select LZO_COMPRESS
> + select LZO_DECOMPRESS
> select SUSPEND_NVS if HAS_IOMEM
> ---help---
> Enable the suspend to disk (STD) functionality, which is usually
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index 006270f..a760cf8 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -103,10 +103,6 @@ struct snapshot_handle {
> void *buffer; /* address of the block to read from
> * or write to
> */
> - int sync_read; /* Set to one to notify the caller of
> - * snapshot_write_next() that it may
> - * need to call wait_on_bio_chain()
> - */
> };
>
> /* This macro returns the address from/to which the caller of
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 25ce010..f24ee24 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -2135,8 +2135,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
> if (handle->cur> 1&& handle->cur> nr_meta_pages + nr_copy_pages)
> return 0;
>
> - handle->sync_read = 1;
> -
> if (!handle->cur) {
> if (!buffer)
> /* This makes the buffer be freed by swsusp_free() */
> @@ -2169,7 +2167,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
> memory_bm_position_reset(&orig_bm);
> restore_pblist = NULL;
> handle->buffer = get_buffer(&orig_bm,&ca);
> - handle->sync_read = 0;
> if (IS_ERR(handle->buffer))
> return PTR_ERR(handle->buffer);
> }
> @@ -2178,8 +2175,6 @@ int snapshot_write_next(struct snapshot_handle *handle)
> handle->buffer = get_buffer(&orig_bm,&ca);
> if (IS_ERR(handle->buffer))
> return PTR_ERR(handle->buffer);
> - if (handle->buffer != buffer)
> - handle->sync_read = 0;
> }
> handle->cur++;
> return PAGE_SIZE;
> diff --git a/kernel/power/swap.c b/kernel/power/swap.c
> index b0bb217..5a83fd3 100644
> --- a/kernel/power/swap.c
> +++ b/kernel/power/swap.c
> @@ -24,6 +24,7 @@
> #include<linux/swapops.h>
> #include<linux/pm.h>
> #include<linux/slab.h>
> +#include<linux/lzo.h>
>
> #include "power.h"
>
> @@ -357,6 +358,18 @@ static int swap_writer_finish(struct swap_map_handle *handle,
> return error;
> }
>
> +#define LZO_HEADER sizeof(size_t)
> +#define LZO_UNC_PAGES 64
> +#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
> +#define LZO_CMP_PAGES DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
> + LZO_HEADER, PAGE_SIZE)
> +#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
> +#define LZO_OVH_PAGES DIV_ROUND_UP((LZO_UNC_SIZE > 0xBFFF ? \
> + 0xBFFF : LZO_UNC_SIZE) + \
> + LZO_UNC_SIZE / 16 + 64 + 3 + \
> + LZO_HEADER, PAGE_SIZE)
> +#define LZO_OVH_SIZE (LZO_OVH_PAGES * PAGE_SIZE)
> +

Some commenting for these #defines would be good, especially the 'magic'
numbers 0xBFFF, 16, 64 and 3. It's all a bit unreadable, too :) Maybe
something like (untested...):

#define LZO_HEADER sizeof(size_t)
#define LZO_UNC_PAGES 64
#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
#define LZO_CMP_PAGES \
	DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + LZO_HEADER, \
		     PAGE_SIZE)
#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
#define LZO_MAGIC_1 0xBFFF /* Rename to whatever this is */
#define LZO_MAGIC_2 16 /* Rename */
#define LZO_MAGIC_3 (64 + 3) /* Rename */
#define LZO_OVH_PAGES \
	DIV_ROUND_UP(min(LZO_UNC_SIZE, LZO_MAGIC_1) + \
		     LZO_UNC_SIZE / LZO_MAGIC_2 + \
		     LZO_MAGIC_3 + \
		     LZO_HEADER, PAGE_SIZE)
#define LZO_OVH_SIZE (LZO_OVH_PAGES * PAGE_SIZE)


> /**
> * save_image - save the suspend image data
> */
> @@ -372,6 +385,29 @@ static int save_image(struct swap_map_handle *handle,
> struct bio *bio;
> struct timeval start;
> struct timeval stop;
> + size_t ul, cl;

rename to bytes_in and bytes_out? I can guess they mean
uncompressed/compressed length, but it's a little cryptic.

> + unsigned char *buf, *wrk, *page;
> +
> + page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
> + if (!page) {
> + printk(KERN_ERR "PM: Failed to allocate LZO page\n");
> + return -ENOMEM;
> + }
> +
> + wrk = vmalloc(LZO1X_1_MEM_COMPRESS);
> + if (!wrk) {
> + printk(KERN_ERR "PM: Failed to allocate LZO workspace\n");
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
> +
> + buf = vmalloc(LZO_UNC_SIZE + LZO_OVH_SIZE);
> + if (!buf) {
> + printk(KERN_ERR "PM: Failed to allocate LZO buffer\n");
> + vfree(wrk);
> + free_page((unsigned long)page);
> + return -ENOMEM;
> + }
>
> printk(KERN_INFO "PM: Saving image data pages (%u pages) ... ",
> nr_to_write);
> @@ -382,16 +418,50 @@ static int save_image(struct swap_map_handle *handle,
> bio = NULL;
> do_gettimeofday(&start);
> while (1) {
> - ret = snapshot_read_next(snapshot);
> - if (ret<= 0)
> + for (ul = 0; ul< LZO_UNC_SIZE; ul += PAGE_SIZE) {
> + ret = snapshot_read_next(snapshot);
> + if (ret< 0)

Not sure why my email client (Thunderbird) is shifting your comparison
operators :(

> + goto out_finish;
> +
> + if (ret == 0)

if (!ret)

(Further down, too)

> + break;
> +
> + memcpy(buf + LZO_OVH_SIZE + ul,
> + data_of(*snapshot), PAGE_SIZE);
> +
> + if (!(nr_pages % m))
> + printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
> + nr_pages++;
> + }
> +
> + if (ul == 0)
> + break;
> +
> + ret = lzo1x_1_compress(buf + LZO_OVH_SIZE, ul,
> + buf + LZO_HEADER,&cl, wrk);
> + if (ret< 0) {
> + printk(KERN_ERR "PM: LZO compression failed\n");
> break;
> - ret = swap_write_page(handle, data_of(*snapshot),&bio);
> - if (ret)
> + }
> +
> + if (unlikely(cl == 0 || LZO_HEADER + cl> LZO_CMP_SIZE)) {
> + printk(KERN_ERR "PM: Invalid LZO length\n");
> + ret = -1;
> break;
> - if (!(nr_pages % m))
> - printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
> - nr_pages++;
> + }
> +
> + *(size_t *)buf = cl;
> +
> + for (ul = 0; ul< LZO_HEADER + cl; ul += PAGE_SIZE) {
> + memcpy(page, buf + ul, PAGE_SIZE);
> +
> + ret = swap_write_page(handle, page,&bio);
> + if (ret)
> + goto out_finish;
> + }
> }
> +
> +out_finish:
> err2 = hib_wait_on_bio_chain(&bio);
> do_gettimeofday(&stop);
> if (!ret)
> @@ -401,6 +471,11 @@ static int save_image(struct swap_map_handle *handle,
> else
> printk(KERN_CONT "\n");
> swsusp_show_speed(&start,&stop, nr_to_write, "Wrote");
> +
> + vfree(buf);
> + vfree(wrk);
> + free_page((unsigned long)page);
> +
> return ret;
> }
>
> @@ -416,7 +491,8 @@ static int enough_swap(unsigned int nr_pages)
> unsigned int free_swap = count_swap_pages(root_swap, 1);
>
> pr_debug("PM: Free swap pages: %u\n", free_swap);
> - return free_swap> nr_pages + PAGES_FOR_IO;
> + return free_swap>
> + (nr_pages * LZO_CMP_PAGES) / LZO_UNC_PAGES + 1 + PAGES_FOR_IO;

PAGES_FOR_IO doesn't belong here, but that's not your problem :)

> }
>
> /**
> @@ -547,9 +623,22 @@ static int load_image(struct swap_map_handle *handle,
> int error = 0;
> struct timeval start;
> struct timeval stop;
> - struct bio *bio;
> - int err2;
> unsigned nr_pages;
> + size_t ul, cl, off;

As per comments on ul and cl above.

[...]

Regards,

Nigel

2010-08-04 02:47:53

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 12:42 +1000, Nigel Cunningham wrote:
> Is it standalone? I'm not seeing the relationship between the two
> parts at the moment.

Once the compression/decompression is applied, it can be a standalone
patch, because sync_read is no longer used (we need the data _now_ in
order to decompress it, so an async read would not buy us much).
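
To illustrate, here is a rough userspace sketch of the framing the patch
writes to swap (Python, with zlib standing in for LZO and a list of pages
standing in for swap; the names are made up for illustration). It also
shows why the reads have to be synchronous: each chunk's length header
must be parsed before we know how many more pages to fetch.

```python
import zlib

PAGE_SIZE = 4096
HEADER = 8  # sizeof(size_t): compressed length stored first

def write_chunk(pages_out, data):
    """Compress one chunk and emit it as whole pages: [length][payload][pad]."""
    comp = zlib.compress(data)          # stand-in for lzo1x_1_compress()
    blob = len(comp).to_bytes(HEADER, "little") + comp
    blob += b"\0" * (-len(blob) % PAGE_SIZE)   # pad; garbage is ignored on read
    for i in range(0, len(blob), PAGE_SIZE):
        pages_out.append(blob[i:i + PAGE_SIZE])

def read_chunk(pages_in):
    """Read the header page first -- only then do we know how much follows."""
    first = pages_in.pop(0)
    cmp_len = int.from_bytes(first[:HEADER], "little")
    blob = first[HEADER:]
    while len(blob) < cmp_len:          # this dependency forces sync reads
        blob += pages_in.pop(0)
    return zlib.decompress(blob[:cmp_len])

pages = []
chunk = bytes(range(256)) * 1024        # 256 KiB of sample data
write_chunk(pages, chunk)
assert read_chunk(pages) == chunk
```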

I already have it split on my box, so once I finalize, that's how I'll
send.

Points taken for all the other comments. Thank you.

PS. Right now, I'm seeing segfaults on resume with my
overlapping/in-place approach. So, I'll have to find out what is going
on first.

--
Bojan

2010-08-04 04:04:21

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 12:47 +1000, Bojan Smojver wrote:
> Points taken for all the other comments. Thank you.

Style wise, does this look better?

PS. Don't worry about contents yet - still working out the kinks.

diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index ca6066a..cb57eb9 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -137,6 +137,8 @@ config SUSPEND_FREEZER
config HIBERNATION
bool "Hibernation (aka 'suspend to disk')"
depends on PM && SWAP && ARCH_HIBERNATION_POSSIBLE
+ select LZO_COMPRESS
+ select LZO_DECOMPRESS
select SUSPEND_NVS if HAS_IOMEM
---help---
Enable the suspend to disk (STD) functionality, which is usually
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index b0bb217..1515e2c 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -24,6 +24,7 @@
#include <linux/swapops.h>
#include <linux/pm.h>
#include <linux/slab.h>
+#include <linux/lzo.h>

#include "power.h"

@@ -357,6 +358,30 @@ static int swap_writer_finish(struct swap_map_handle *handle,
return error;
}

+/* We need to remember how much compressed data we need to read. */
+#define LZO_HEADER sizeof(size_t)
+
+/* Number of pages/bytes we'll compress at a time. */
+#define LZO_UNC_PAGES 64
+#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
+
+/* Number of pages/bytes we need for compressed data (worst case). */
+#define LZO_CMP_PAGES DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
+ LZO_HEADER, PAGE_SIZE)
+#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
+
+/* As per examples/overlap.c in lzo distribution, we need to add more space
+ * to the end of the buffer in order to perform overlapping compression. We
+ * also add LZO_HEADER, because our compressed data starts at that offset.
+ * Then we round it off on PAGE_SIZE boundary, so that we can copy full pages
+ * safely. We call this overhead. Magic numbers and formula below.
+ */
+#define LZO_OVH_PAGES DIV_ROUND_UP((LZO_UNC_SIZE > 0xBFFF ? \
+ 0xBFFF : LZO_UNC_SIZE) + \
+ LZO_UNC_SIZE / 16 + 64 + 3 + \
+ LZO_HEADER, PAGE_SIZE)
+#define LZO_OVH_SIZE (LZO_OVH_PAGES + PAGE_SIZE)
+
/**
* save_image - save the suspend image data
*/
@@ -372,6 +397,29 @@ static int save_image(struct swap_map_handle *handle,
struct bio *bio;
struct timeval start;
struct timeval stop;
+ size_t i, unc_len, cmp_len;
+ unsigned char *buf, *wrk, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ wrk = vmalloc(LZO1X_1_MEM_COMPRESS);
+ if (!wrk) {
+ printk(KERN_ERR "PM: Failed to allocate LZO workspace\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }
+
+ buf = vmalloc(LZO_UNC_SIZE + LZO_OVH_SIZE);
+ if (!buf) {
+ printk(KERN_ERR "PM: Failed to allocate LZO buffer\n");
+ vfree(wrk);
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Saving image data pages (%u pages) ... ",
nr_to_write);
@@ -382,16 +430,59 @@ static int save_image(struct swap_map_handle *handle,
bio = NULL;
do_gettimeofday(&start);
while (1) {
- ret = snapshot_read_next(snapshot);
- if (ret <= 0)
+ for (unc_len = 0;
+ unc_len < LZO_UNC_SIZE; unc_len += PAGE_SIZE) {
+ ret = snapshot_read_next(snapshot);
+ if (ret < 0)
+ goto out_finish;
+
+ if (!ret)
+ break;
+
+ /* Copy uncompressed data to the top of the buffer. */
+ memcpy(buf + LZO_OVH_SIZE + unc_len,
+ data_of(*snapshot), PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+ }
+
+ if (!unc_len)
break;
- ret = swap_write_page(handle, data_of(*snapshot), &bio);
- if (ret)
+
+ ret = lzo1x_1_compress(buf + LZO_OVH_SIZE, unc_len,
+ buf + LZO_HEADER, &cmp_len, wrk);
+ if (ret < 0) {
+ printk(KERN_ERR "PM: LZO compression failed\n");
+ break;
+ }
+
+ if (unlikely(!cmp_len ||
+ cmp_len > lzo1x_worst_compress(unc_len))) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ ret = -1;
break;
- if (!(nr_pages % m))
- printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ *(size_t *)buf = cmp_len;
+
+ /* Given we are writing one page at a time to disk, we copy
+ * that much from the buffer, although the last bit will likely
+ * be smaller than full page. This is OK - we saved the length
+ * of the compressed data, so any garbage at the end will be
+ * discarded when we read it.
+ */
+ for (i = 0; i < LZO_HEADER + cmp_len; i += PAGE_SIZE) {
+ memcpy(page, buf + i, PAGE_SIZE);
+
+ ret = swap_write_page(handle, page, &bio);
+ if (ret)
+ goto out_finish;
+ }
}
+
+out_finish:
err2 = hib_wait_on_bio_chain(&bio);
do_gettimeofday(&stop);
if (!ret)
@@ -401,6 +492,11 @@ static int save_image(struct swap_map_handle *handle,
else
printk(KERN_CONT "\n");
swsusp_show_speed(&start, &stop, nr_to_write, "Wrote");
+
+ vfree(buf);
+ vfree(wrk);
+ free_page((unsigned long)page);
+
return ret;
}

@@ -416,7 +512,8 @@ static int enough_swap(unsigned int nr_pages)
unsigned int free_swap = count_swap_pages(root_swap, 1);

pr_debug("PM: Free swap pages: %u\n", free_swap);
- return free_swap > nr_pages + PAGES_FOR_IO;
+ return free_swap >
+ (nr_pages * LZO_CMP_PAGES) / LZO_UNC_PAGES + 1 + PAGES_FOR_IO;
}

/**
@@ -547,9 +644,22 @@ static int load_image(struct swap_map_handle *handle,
int error = 0;
struct timeval start;
struct timeval stop;
- struct bio *bio;
- int err2;
unsigned nr_pages;
+ size_t i, off, unc_len, cmp_len;
+ unsigned char *buf, *page;
+
+ page = (void *)__get_free_page(__GFP_WAIT | __GFP_HIGH);
+ if (!page) {
+ printk(KERN_ERR "PM: Failed to allocate LZO page\n");
+ return -ENOMEM;
+ }
+
+ buf = vmalloc(LZO_CMP_SIZE);
+ if (!buf) {
+ printk(KERN_ERR "PM: Failed to allocate LZO buffer\n");
+ free_page((unsigned long)page);
+ return -ENOMEM;
+ }

printk(KERN_INFO "PM: Loading image data pages (%u pages) ... ",
nr_to_read);
@@ -557,27 +667,73 @@ static int load_image(struct swap_map_handle *handle,
if (!m)
m = 1;
nr_pages = 0;
- bio = NULL;
do_gettimeofday(&start);
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+
for ( ; ; ) {
- error = snapshot_write_next(snapshot);
- if (error <= 0)
- break;
- error = swap_read_page(handle, data_of(*snapshot), &bio);
+ error = swap_read_page(handle, page, NULL); /* sync */
if (error)
break;
- if (snapshot->sync_read)
- error = hib_wait_on_bio_chain(&bio);
- if (error)
+
+ cmp_len = *(size_t *)page;
+ if (unlikely(!cmp_len ||
+ cmp_len > lzo1x_worst_compress(LZO_UNC_SIZE))) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
break;
- if (!(nr_pages % m))
- printk("\b\b\b\b%3d%%", nr_pages / m);
- nr_pages++;
+ }
+
+ /* As per examples/overlap.c in lzo distribution, we copy
+ * compressed data to the top of the in-place decompression
+ * buffer. */
+ off = LZO_CMP_SIZE - cmp_len;
+ memcpy(buf + off, page + LZO_HEADER,
+ (cmp_len < PAGE_SIZE - LZO_HEADER ?
+ cmp_len : PAGE_SIZE - LZO_HEADER));
+
+ for (i = PAGE_SIZE - LZO_HEADER; i < cmp_len; i += PAGE_SIZE) {
+ error = swap_read_page(handle, page, NULL); /* sync */
+ if (error)
+ goto out_finish;
+
+ memcpy(buf + off + i, page,
+ (cmp_len - unc_len < PAGE_SIZE ?
+ cmp_len - unc_len : PAGE_SIZE));
+ }
+
+ unc_len = LZO_UNC_SIZE;
+ error = lzo1x_decompress_safe(buf + off, cmp_len,
+ buf, &unc_len);
+ if (error < 0) {
+ printk(KERN_ERR "PM: LZO decompression failed\n");
+ break;
+ }
+
+ if (unlikely(!unc_len ||
+ unc_len > LZO_UNC_SIZE || unc_len % PAGE_SIZE)) {
+ printk(KERN_ERR "PM: Invalid LZO length\n");
+ error = -1;
+ break;
+ }
+
+ for (i = 0; i < unc_len; i += PAGE_SIZE) {
+ memcpy(data_of(*snapshot), buf + i, PAGE_SIZE);
+
+ if (!(nr_pages % m))
+ printk("\b\b\b\b%3d%%", nr_pages / m);
+ nr_pages++;
+
+ error = snapshot_write_next(snapshot);
+ if (error <= 0)
+ goto out_finish;
+ }
}
- err2 = hib_wait_on_bio_chain(&bio);
+
+out_finish:
do_gettimeofday(&stop);
- if (!error)
- error = err2;
if (!error) {
printk("\b\b\b\bdone\n");
snapshot_write_finalize(snapshot);
@@ -586,6 +742,10 @@ static int load_image(struct swap_map_handle *handle,
} else
printk("\n");
swsusp_show_speed(&start, &stop, nr_to_read, "Read");
+
+ vfree(buf);
+ free_page((unsigned long)page);
+
return error;
}


--
Bojan

2010-08-04 04:23:33

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi.

On 04/08/10 14:04, Bojan Smojver wrote:
> On Wed, 2010-08-04 at 12:47 +1000, Bojan Smojver wrote:
>> Points taken for all the other comments. Thank you.
>
> Style wise, does this look better?
>
> PS. Don't worry about contents yet - still working out the kinks.

Okay.

> diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
> index ca6066a..cb57eb9 100644
> --- a/kernel/power/Kconfig
> +++ b/kernel/power/Kconfig
> @@ -137,6 +137,8 @@ config SUSPEND_FREEZER
> config HIBERNATION
> bool "Hibernation (aka 'suspend to disk')"
> depends on PM&& SWAP&& ARCH_HIBERNATION_POSSIBLE
> + select LZO_COMPRESS
> + select LZO_DECOMPRESS

But, having said 'okay' above, I should point out that you might want to
select CRYPTO too, since LZO depends on it. (And perhaps I should ask
whether you should be using cryptoapi rather than LZO directly?)

> select SUSPEND_NVS if HAS_IOMEM
> ---help---
> Enable the suspend to disk (STD) functionality, which is usually
> diff --git a/kernel/power/swap.c b/kernel/power/swap.c
> index b0bb217..1515e2c 100644
> --- a/kernel/power/swap.c
> +++ b/kernel/power/swap.c
> @@ -24,6 +24,7 @@
> #include<linux/swapops.h>
> #include<linux/pm.h>
> #include<linux/slab.h>
> +#include<linux/lzo.h>
>
> #include "power.h"
>
> @@ -357,6 +358,30 @@ static int swap_writer_finish(struct swap_map_handle *handle,
> return error;
> }
>
> +/* We need to remember how much compressed data we need to read. */
> +#define LZO_HEADER sizeof(size_t)
> +
> +/* Number of pages/bytes we'll compress at a time. */
> +#define LZO_UNC_PAGES 64
> +#define LZO_UNC_SIZE (LZO_UNC_PAGES * PAGE_SIZE)
> +
> +/* Number of pages/bytes we need for compressed data (worst case). */
> +#define LZO_CMP_PAGES DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
> + LZO_HEADER, PAGE_SIZE)
> +#define LZO_CMP_SIZE (LZO_CMP_PAGES * PAGE_SIZE)
> +
> +/* As per examples/overlap.c in lzo distribution, we need to add more space
> + * to the end of the buffer in order to perform overlapping compression. We
> + * also add LZO_HEADER, because our compressed data starts at that offset.
> + * Then we round it off on PAGE_SIZE boundary, so that we can copy full pages
> + * safely. We call this overhead. Magic numbers and formula below.
> + */

Yeah, that's much better; thanks.

Nigel

2010-08-04 05:13:25

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

> But, having said 'okay' above, I should point out that you might want to
> select CRYPTO too, since LZO depends on it. (And perhaps I should ask
> whether you should be using cryptoapi rather than LZO directly?)

OK, I will check. I kinda remember the docs just saying to select the
above.

I looked at the crypto API briefly, but then decided to use straight
compression, because it looked like a straightforward match for the task
at hand.

--
Bojan

2010-08-04 05:58:25

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 15:12 +1000, Bojan Smojver wrote:
> OK, I will check. I kinda remember the docs just saying to select the
> above.

Looks to me like the dependencies go the other way around (i.e. crypto
depends on LZO, not LZO on crypto):
-----------------
config CRYPTO_LZO
tristate "LZO compression algorithm"
select CRYPTO_ALGAPI
select LZO_COMPRESS
select LZO_DECOMPRESS
help
This is the LZO algorithm.
-----------------

So, one needs LZO_COMPRESS/DECOMPRESS for CRYPTO_LZO. But, given we just
need straight compression/decompression, this should not be needed, if
I'm understanding things correctly.

--
Bojan

2010-08-05 01:26:26

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Wed, 2010-08-04 at 12:47 +1000, Bojan Smojver wrote:
> PS. Right now, I'm seeing segfaults on resume with my
> overlapping/in-place approach. So, I'll have to find out what is going
> on first.

Well, either I'm doing something idiotic when setting up
overlapping/in-place compression/decompression buffers or the routines
do not work as advertised. In any event, the image is subtly corrupt on
resume using this technique (checksum verified), hence the segfaults.
Please do not use this code on your systems - it may corrupt your file
systems.

For now, I will finalize the patch using two buffers, but I'll drop the
number of pages down to 32, so that we use roughly the same amount of
memory as with overlapping/in-place compression/decompression. Slightly
slower than with 64 pages, but still much faster than original code.
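
The memory arithmetic behind that choice, roughly (a back-of-the-envelope
sketch assuming 4 KiB pages and the lzo1x_worst_compress() bound from
include/linux/lzo.h, not measured numbers):

```python
PAGE = 4096

def worst(x):                 # lzo1x_worst_compress()
    return x + x // 16 + 64 + 3

def pages(nbytes):            # DIV_ROUND_UP(nbytes, PAGE)
    return (nbytes + PAGE - 1) // PAGE

def two_buffers(unc_pages):
    """Uncompressed buffer plus a separate worst-case compressed buffer."""
    unc = unc_pages * PAGE
    return unc_pages + pages(worst(unc) + 8)

def in_place(unc_pages):
    """Single buffer plus the overlap overhead from examples/overlap.c."""
    unc = unc_pages * PAGE
    ovh = pages(min(unc, 0xBFFF) + unc // 16 + 64 + 3 + 8)
    return unc_pages + ovh

print(two_buffers(32), in_place(64))   # 67 vs 81 pages
```

So a 32-page two-buffer scheme costs about 67 pages, close to the roughly
81 pages of the 64-page in-place scheme, while a 64-page two-buffer scheme
would need about 133.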

--
Bojan

2010-08-05 06:28:24

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

Hi!

> This patch speeds up hibernate/thaw operations (in kernel ones) by
> compressing the pages using LZO. Tests on my ThinkPad T510 show that
> hibernation times can be cut by the factor of about 4 for an image of
> around 1 GB (from about a minute down to about 15 seconds).

Why do it in kernel? s2disk should already be faster...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2010-08-05 06:55:40

by Bojan Smojver

[permalink] [raw]
Subject: Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

On Thu, 2010-08-05 at 08:26 +0200, Pavel Machek wrote:
> Why do it in kernel? s2disk should already be faster...

Because LZO support is already there, the patch is straightforward and
not everyone wants to set up userspace tools to hibernate?

If we are relying on s2disk as the only solution, then in-kernel
hibernation should be removed. If not, I don't see why in-kernel
hibernation should suck that much, when we can make it suck less.

--
Bojan