2006-10-23 04:14:25

by Nigel Cunningham

[permalink] [raw]
Subject: [PATCH] Use extents for recording what swap is allocated.

Switch from bitmaps to using extents to record what swap is allocated;
they make more efficient use of memory, particularly where the allocated
storage is small and the swap space is large.

This is also part of the ground work for implementing support for
supporting multiple swap devices.

Signed-off-by: Nigel Cunningham <[email protected]>

diff --git a/kernel/power/Makefile b/kernel/power/Makefile
index 38725f5..d772521 100644
--- a/kernel/power/Makefile
+++ b/kernel/power/Makefile
@@ -5,6 +5,6 @@ endif

obj-y := main.o process.o console.o
obj-$(CONFIG_PM_LEGACY) += pm.o
-obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o
+obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o extent.o

obj-$(CONFIG_MAGIC_SYSRQ) += poweroff.o
diff --git a/kernel/power/extent.c b/kernel/power/extent.c
new file mode 100644
index 0000000..b769956
--- /dev/null
+++ b/kernel/power/extent.c
@@ -0,0 +1,119 @@
+/*
+ * kernel/power/extent.c
+ *
+ * Copyright (C) 2003-2006 Nigel Cunningham <[email protected]>
+ * Copyright (C) 2006 Red Hat, inc.
+ *
+ * Distributed under GPLv2.
+ *
+ * These functions encapsulate the manipulation of storage metadata.
+ */
+
+#include <linux/suspend.h>
+#include "extent.h"
+
+/* suspend_get_extent
+ *
+ * Returns a free extent. May fail, returning NULL instead.
+ */
+static struct extent *suspend_get_extent(void)
+{
+ struct extent *result;
+
+ if (!(result = kmalloc(sizeof(struct extent), GFP_ATOMIC)))
+ return NULL;
+
+ result->minimum = result->maximum = 0;
+ result->next = NULL;
+
+ return result;
+}
+
+/* suspend_put_extent_chain.
+ *
+ * Frees a whole chain of extents.
+ */
+void suspend_put_extent_chain(struct extent_chain *chain)
+{
+ struct extent *this;
+
+ this = chain->first;
+
+ while(this) {
+ struct extent *next = this->next;
+ kfree(this);
+ chain->num_extents--;
+ this = next;
+ }
+
+ BUG_ON(chain->num_extents);
+ chain->first = chain->last_touched = NULL;
+ chain->size = 0;
+}
+
+/*
+ * suspend_add_to_extent_chain
+ *
+ * Add an extent to an existing chain.
+ */
+int suspend_add_to_extent_chain(struct extent_chain *chain,
+ unsigned long minimum, unsigned long maximum)
+{
+ struct extent *new_extent = NULL, *start_at;
+
+ /* Find the right place in the chain */
+ start_at = (chain->last_touched &&
+ (chain->last_touched->minimum < minimum)) ?
+ chain->last_touched : NULL;
+
+ if (!start_at && chain->first && chain->first->minimum < minimum)
+ start_at = chain->first;
+
+ while (start_at && start_at->next && start_at->next->minimum < minimum)
+ start_at = start_at->next;
+
+ if (start_at && start_at->maximum == (minimum - 1)) {
+ start_at->maximum = maximum;
+
+ /* Merge with the following one? */
+ if (start_at->next &&
+ start_at->maximum + 1 == start_at->next->minimum) {
+ struct extent *to_free = start_at->next;
+ start_at->maximum = start_at->next->maximum;
+ start_at->next = start_at->next->next;
+ chain->num_extents--;
+ kfree(to_free);
+ }
+
+ chain->last_touched = start_at;
+ chain->size+= (maximum - minimum + 1);
+
+ return 0;
+ }
+
+ new_extent = suspend_get_extent();
+ if (!new_extent) {
+ printk("Error unable to append a new extent to the chain.\n");
+ return 2;
+ }
+
+ chain->num_extents++;
+ chain->size+= (maximum - minimum + 1);
+ new_extent->minimum = minimum;
+ new_extent->maximum = maximum;
+ new_extent->next = NULL;
+
+ chain->last_touched = new_extent;
+
+ if (start_at) {
+ struct extent *next = start_at->next;
+ start_at->next = new_extent;
+ new_extent->next = next;
+ } else {
+ if (chain->first)
+ new_extent->next = chain->first;
+ chain->first = new_extent;
+ }
+
+ return 0;
+}
diff --git a/kernel/power/extent.h b/kernel/power/extent.h
new file mode 100644
index 0000000..062b4c1
--- /dev/null
+++ b/kernel/power/extent.h
@@ -0,0 +1,50 @@
+/*
+ * kernel/power/extent.h
+ *
+ * Copyright (C) 2004-2006 Nigel Cunningham <[email protected]>
+ * Copyright (C) 2006 Red Hat, inc.
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains declarations related to extents. Extents are
+ * suspend's method of storing some of the metadata for the image.
+ * See extent.c for more info.
+ *
+ */
+
+#ifndef EXTENT_H
+#define EXTENT_H
+
+struct extent {
+ unsigned long minimum, maximum;
+ struct extent *next;
+};
+
+struct extent_chain {
+ int size; /* size of the chain ie sum (max-min+1) */
+ int num_extents;
+ struct extent *first, *last_touched;
+};
+
+/* Simplify iterating through all the values in an extent chain */
+#define suspend_extent_for_each(extent_chain, extentpointer, value) \
+if ((extent_chain)->first) \
+ for ((extentpointer) = (extent_chain)->first, (value) = \
+ (extentpointer)->minimum; \
+ ((extentpointer) && ((extentpointer)->next || (value) <= \
+ (extentpointer)->maximum)); \
+ (((value) == (extentpointer)->maximum) ? \
+ ((extentpointer) = (extentpointer)->next, (value) = \
+ ((extentpointer) ? (extentpointer)->minimum : 0)) : \
+ (value)++))
+
+void suspend_put_extent_chain(struct extent_chain *chain);
+int suspend_add_to_extent_chain(struct extent_chain *chain,
+ unsigned long minimum, unsigned long maximum);
+
+/* swap_entry_to_extent_val & extent_val_to_swap_entry:
+ * We are putting offset in the low bits so consecutive swap entries
+ * make consecutive extent values */
+#define swap_entry_to_extent_val(swp_entry) (swp_entry.val)
+#define extent_val_to_swap_entry(val) (swp_entry_t) { (val) }
+#endif
diff --git a/kernel/power/power.h b/kernel/power/power.h
index bfe999f..a78c391 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -1,6 +1,8 @@
#include <linux/suspend.h>
#include <linux/utsname.h>

+#include "extent.h"
+
struct swsusp_info {
struct new_utsname uts;
u32 version_code;
@@ -119,30 +121,8 @@ #define SNAPSHOT_SET_SWAP_FILE _IOW(SNA
#define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11)
#define SNAPSHOT_IOC_MAXNR 11

-/**
- * The bitmap is used for tracing allocated swap pages
- *
- * The entire bitmap consists of a number of bitmap_page
- * structures linked with the help of the .next member.
- * Thus each page can be allocated individually, so we only
- * need to make 0-order memory allocations to create
- * the bitmap.
- */
-
-#define BITMAP_PAGE_SIZE (PAGE_SIZE - sizeof(void *))
-#define BITMAP_PAGE_CHUNKS (BITMAP_PAGE_SIZE / sizeof(long))
-#define BITS_PER_CHUNK (sizeof(long) * 8)
-#define BITMAP_PAGE_BITS (BITMAP_PAGE_CHUNKS * BITS_PER_CHUNK)
-
-struct bitmap_page {
- unsigned long chunks[BITMAP_PAGE_CHUNKS];
- struct bitmap_page *next;
-};
-
-extern void free_bitmap(struct bitmap_page *bitmap);
-extern struct bitmap_page *alloc_bitmap(unsigned int nr_bits);
-extern unsigned long alloc_swap_page(int swap, struct bitmap_page *bitmap);
-extern void free_all_swap_pages(int swap, struct bitmap_page *bitmap);
+extern unsigned long alloc_swap_page(int swap, struct extent_chain *extents);
+extern void free_all_swap_pages(int swap, struct extent_chain *extents);

extern int swsusp_check(void);
extern int swsusp_shrink_memory(void);
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 1a3b0dd..fc713d5 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -152,7 +152,7 @@ struct swap_map_page {
struct swap_map_handle {
struct swap_map_page *cur;
unsigned long cur_swap;
- struct bitmap_page *bitmap;
+ struct extent_chain extents;
unsigned int k;
};

@@ -161,9 +161,6 @@ static void release_swap_writer(struct s
if (handle->cur)
free_page((unsigned long)handle->cur);
handle->cur = NULL;
- if (handle->bitmap)
- free_bitmap(handle->bitmap);
- handle->bitmap = NULL;
}

static void show_speed(struct timeval *start, struct timeval *stop,
@@ -191,12 +188,8 @@ static int get_swap_writer(struct swap_m
handle->cur = (struct swap_map_page *)get_zeroed_page(GFP_KERNEL);
if (!handle->cur)
return -ENOMEM;
- handle->bitmap = alloc_bitmap(count_swap_pages(root_swap, 0));
- if (!handle->bitmap) {
- release_swap_writer(handle);
- return -ENOMEM;
- }
- handle->cur_swap = alloc_swap_page(root_swap, handle->bitmap);
+ memset(&handle->extents, 0, sizeof(struct extent_chain));
+ handle->cur_swap = alloc_swap_page(root_swap, &handle->extents);
if (!handle->cur_swap) {
release_swap_writer(handle);
return -ENOSPC;
@@ -241,7 +234,7 @@ static int swap_write_page(struct swap_m

if (!handle->cur)
return -EINVAL;
- offset = alloc_swap_page(root_swap, handle->bitmap);
+ offset = alloc_swap_page(root_swap, &handle->extents);
error = write_page(buf, offset, bio_chain);
if (error)
return error;
@@ -250,7 +243,7 @@ static int swap_write_page(struct swap_m
error = wait_on_bio_chain(bio_chain);
if (error)
goto out;
- offset = alloc_swap_page(root_swap, handle->bitmap);
+ offset = alloc_swap_page(root_swap, &handle->extents);
if (!offset)
return -ENOSPC;
handle->cur->next_swap = offset;
@@ -379,7 +372,7 @@ int swsusp_write(void)
}
}
if (error)
- free_all_swap_pages(root_swap, handle.bitmap);
+ free_all_swap_pages(root_swap, &handle.extents);
release_swap_writer(&handle);
return error;
}
diff --git a/kernel/power/swsusp.c b/kernel/power/swsusp.c
index 0b66659..aa8205c 100644
--- a/kernel/power/swsusp.c
+++ b/kernel/power/swsusp.c
@@ -51,6 +51,7 @@ #include <linux/syscalls.h>
#include <linux/highmem.h>

#include "power.h"
+#include "extent.h"

/*
* Preferred image size in bytes (tunable via /sys/power/image_size).
@@ -72,96 +73,30 @@ static inline int restore_highmem(void)
static inline unsigned int count_highmem_pages(void) { return 0; }
#endif

-/**
- * The following functions are used for tracing the allocated
- * swap pages, so that they can be freed in case of an error.
- *
- * The functions operate on a linked bitmap structure defined
- * in power.h
- */
-
-void free_bitmap(struct bitmap_page *bitmap)
+unsigned long alloc_swap_page(int swap, struct extent_chain *extents)
{
- struct bitmap_page *bp;
-
- while (bitmap) {
- bp = bitmap->next;
- free_page((unsigned long)bitmap);
- bitmap = bp;
+ swp_entry_t entry = get_swap_page_of_type(swap);
+ if (entry.val) {
+ unsigned long new_value = swap_entry_to_extent_val(entry);
+ suspend_add_to_extent_chain(extents, new_value, new_value);
}
+ return swp_offset(entry);
}

-struct bitmap_page *alloc_bitmap(unsigned int nr_bits)
+void free_all_swap_pages(int swap, struct extent_chain *extents)
{
- struct bitmap_page *bitmap, *bp;
- unsigned int n;
-
- if (!nr_bits)
- return NULL;
-
- bitmap = (struct bitmap_page *)get_zeroed_page(GFP_KERNEL);
- bp = bitmap;
- for (n = BITMAP_PAGE_BITS; n < nr_bits; n += BITMAP_PAGE_BITS) {
- bp->next = (struct bitmap_page *)get_zeroed_page(GFP_KERNEL);
- bp = bp->next;
- if (!bp) {
- free_bitmap(bitmap);
- return NULL;
+ if (extents->first) {
+ /* Free swap entries */
+ struct extent *extentpointer;
+ unsigned long extentvalue;
+ swp_entry_t entry;
+ suspend_extent_for_each(extents, extentpointer,
+ extentvalue) {
+ entry = extent_val_to_swap_entry(extentvalue);
+ swap_free(entry);
}
- }
- return bitmap;
-}
-
-static int bitmap_set(struct bitmap_page *bitmap, unsigned long bit)
-{
- unsigned int n;

- n = BITMAP_PAGE_BITS;
- while (bitmap && n <= bit) {
- n += BITMAP_PAGE_BITS;
- bitmap = bitmap->next;
- }
- if (!bitmap)
- return -EINVAL;
- n -= BITMAP_PAGE_BITS;
- bit -= n;
- n = 0;
- while (bit >= BITS_PER_CHUNK) {
- bit -= BITS_PER_CHUNK;
- n++;
- }
- bitmap->chunks[n] |= (1UL << bit);
- return 0;
-}
-
-unsigned long alloc_swap_page(int swap, struct bitmap_page *bitmap)
-{
- unsigned long offset;
-
- offset = swp_offset(get_swap_page_of_type(swap));
- if (offset) {
- if (bitmap_set(bitmap, offset)) {
- swap_free(swp_entry(swap, offset));
- offset = 0;
- }
- }
- return offset;
-}
-
-void free_all_swap_pages(int swap, struct bitmap_page *bitmap)
-{
- unsigned int bit, n;
- unsigned long test;
-
- bit = 0;
- while (bitmap) {
- for (n = 0; n < BITMAP_PAGE_CHUNKS; n++)
- for (test = 1UL; test; test <<= 1) {
- if (bitmap->chunks[n] & test)
- swap_free(swp_entry(swap, bit));
- bit++;
- }
- bitmap = bitmap->next;
+ suspend_put_extent_chain(extents);
}
}

diff --git a/kernel/power/user.c b/kernel/power/user.c
index 3653b22..79aa849 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -32,7 +32,7 @@ #define SNAPSHOT_MINOR 231
static struct snapshot_data {
struct snapshot_handle handle;
int swap;
- struct bitmap_page *bitmap;
+ struct extent_chain extents;
int mode;
char frozen;
char ready;
@@ -61,7 +61,7 @@ static int snapshot_open(struct inode *i
data->swap = -1;
data->mode = O_WRONLY;
}
- data->bitmap = NULL;
+ memset(&data->extents, 0, sizeof(struct extent_chain));
data->frozen = 0;
data->ready = 0;

@@ -74,8 +74,7 @@ static int snapshot_release(struct inode

swsusp_free();
data = filp->private_data;
- free_all_swap_pages(data->swap, data->bitmap);
- free_bitmap(data->bitmap);
+ free_all_swap_pages(data->swap, &data->extents);
if (data->frozen) {
down(&pm_sem);
thaw_processes();
@@ -232,14 +231,7 @@ static int snapshot_ioctl(struct inode *
error = -ENODEV;
break;
}
- if (!data->bitmap) {
- data->bitmap = alloc_bitmap(count_swap_pages(data->swap, 0));
- if (!data->bitmap) {
- error = -ENOMEM;
- break;
- }
- }
- offset = alloc_swap_page(data->swap, data->bitmap);
+ offset = alloc_swap_page(data->swap, &data->extents);
if (offset) {
offset <<= PAGE_SHIFT;
error = put_user(offset, (loff_t __user *)arg);
@@ -253,13 +245,11 @@ static int snapshot_ioctl(struct inode *
error = -ENODEV;
break;
}
- free_all_swap_pages(data->swap, data->bitmap);
- free_bitmap(data->bitmap);
- data->bitmap = NULL;
+ free_all_swap_pages(data->swap, &data->extents);
break;

case SNAPSHOT_SET_SWAP_FILE:
- if (!data->bitmap) {
+ if (!data->extents.size) {
/*
* User space encodes device types as two-byte values,
* so we need to recode them



2006-10-23 15:34:29

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> Switch from bitmaps to using extents to record what swap is allocated;
> they make more efficient use of memory, particularly where the allocated
> storage is small and the swap space is large.

> This is also part of the ground work for implementing support for
> supporting multiple swap devices.

bitmaps were more efficient and longer than original code... I did not
_like_ them, but they are in now. I'd hate to change the code again,
for what, 0.5% gain?

...and this is still longer than bitmaps.

And SNAPSHOT_GET_SWAP_PAGE seems to support multiple swap spaces
already.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-23 23:04:15

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Mon, 2006-10-23 at 17:32 +0200, Pavel Machek wrote:
> Hi!
>
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
>
> > This is also part of the ground work for implementing support for
> > supporting multiple swap devices.
>
> bitmaps were more efficient and longer than original code... I did not
> _like_ them, but they are in now. I'd hate to change the code again,
> for what, 0.5% gain?

0.5% of what? This is part of what is needed to implement support for
multiple swap devices. You could extend what you already have,
implementing bitmaps that require a bit for every page of swap the user
has swapon'd, but that doesn't seem efficient to me.

> ...and this is still longer than bitmaps.
>
> And SNAPSHOT_GET_SWAP_PAGE seems to support multiple swap spaces
> already.

It may do, but swap.c certainly doesn't. It supports exactly one device.

Regards,

Nigel

2006-10-24 08:43:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Tuesday, 24 October 2006 01:04, Nigel Cunningham wrote:
> Hi.
>
> On Mon, 2006-10-23 at 17:32 +0200, Pavel Machek wrote:
> > Hi!
> >
> > > Switch from bitmaps to using extents to record what swap is allocated;
> > > they make more efficient use of memory, particularly where the allocated
> > > storage is small and the swap space is large.
> >
> > > This is also part of the ground work for implementing support for
> > > supporting multiple swap devices.
> >
> > bitmaps were more efficient and longer than original code... I did not
> > _like_ them, but they are in now. I'd hate to change the code again,
> > for what, 0.5% gain?
>
> 0.5% of what? This is part of what is needed to implement support for
> multiple swap devices. You could extend what you already have,
> implementing bitmaps that require a bit for every page of swap the user
> has swapon'd, but that doesn't seem efficient to me.

I agree. Further, I though I would change the bitmaps thing at some point,
but there still are more urgent things to do. ;-)

> > ...and this is still longer than bitmaps.

Yes, it is. Still I'd like to have a thorough look at it. The idea is fine by me.

> > And SNAPSHOT_GET_SWAP_PAGE seems to support multiple swap spaces
> > already.
>
> It may do, but swap.c certainly doesn't. It supports exactly one device.

That's correct.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-10-24 20:09:44

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Monday, 23 October 2006 06:14, Nigel Cunningham wrote:
> Switch from bitmaps to using extents to record what swap is allocated;
> they make more efficient use of memory, particularly where the allocated
> storage is small and the swap space is large.

As I said before, I like the overall idea, but I have a bunch of comments.

> This is also part of the ground work for implementing support for
> supporting multiple swap devices.
>
> Signed-off-by: Nigel Cunningham <[email protected]>
>
> diff --git a/kernel/power/Makefile b/kernel/power/Makefile
> index 38725f5..d772521 100644
> --- a/kernel/power/Makefile
> +++ b/kernel/power/Makefile
> @@ -5,6 +5,6 @@ endif
>
> obj-y := main.o process.o console.o
> obj-$(CONFIG_PM_LEGACY) += pm.o
> -obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o
> +obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o extent.o
>
> obj-$(CONFIG_MAGIC_SYSRQ) += poweroff.o
> diff --git a/kernel/power/extent.c b/kernel/power/extent.c
> new file mode 100644
> index 0000000..b769956
> --- /dev/null
> +++ b/kernel/power/extent.c
> @@ -0,0 +1,119 @@
> +/*
> + * kernel/power/extent.c
> + *
> + * Copyright (C) 2003-2006 Nigel Cunningham <[email protected]>
> + * Copyright (C) 2006 Red Hat, inc.
> + *
> + * Distributed under GPLv2.
> + *
> + * These functions encapsulate the manipulation of storage metadata.
> + */
> +
> +#include <linux/suspend.h>
> +#include "extent.h"
> +
> +/* suspend_get_extent
> + *
> + * Returns a free extent. May fail, returning NULL instead.
> + */
> +static struct extent *suspend_get_extent(void)
> +{
> + struct extent *result;

I'd call the type 'struct suspend_extent', for clarity.

> +
> + if (!(result = kmalloc(sizeof(struct extent), GFP_ATOMIC)))
> + return NULL;

if you use kzalloc() here, the initializations below won't be necessary and
the function will reduce to a one-liner (in which case it can be inlined).

> +
> + result->minimum = result->maximum = 0;
> + result->next = NULL;
> +
> + return result;
> +}
> +
> +/* suspend_put_extent_chain.
> + *
> + * Frees a whole chain of extents.
> + */
> +void suspend_put_extent_chain(struct extent_chain *chain)

I'd call it suspend_free_all_extents().

> +{
> + struct extent *this;
> +
> + this = chain->first;
> +
> + while(this) {
> + struct extent *next = this->next;
> + kfree(this);
> + chain->num_extents--;
> + this = next;
> + }
> +
> + BUG_ON(chain->num_extents);

I'd drop this BUG_ON(). Once the code has been debugged, it's no longer
of any use.

> + chain->first = chain->last_touched = NULL;

This is against the coding style AFAICT. Please do

chain->first = NULL;
chain->last_touched = NULL;

> + chain->size = 0;
> +}
> +
> +/*
> + * suspend_add_to_extent_chain
> + *
> + * Add an extent to an existing chain.
> + */
> +int suspend_add_to_extent_chain(struct extent_chain *chain,
> + unsigned long minimum, unsigned long maximum)

I'd use shorter names for these arguments, like 'start', 'end'.

> +{
> + struct extent *new_extent = NULL, *start_at;

The names are not the best here too, IMHO. I'd use something like
*new_ext, *cur_ext.

> +
> + /* Find the right place in the chain */
> + start_at = (chain->last_touched &&
> + (chain->last_touched->minimum < minimum)) ?
> + chain->last_touched : NULL;
> +
> + if (!start_at && chain->first && chain->first->minimum < minimum)
> + start_at = chain->first;

The above two statements can be simplified, like

cur_ext = NULL;
if (chain->last_touched && chain->last_touched->minimum < start)
cur_ext = chain->last_touched;
else if (chain->first && chain->first->minimum < start)
cur_ext = chain->first;

> +
> + while (start_at && start_at->next && start_at->next->minimum < minimum)
> + start_at = start_at->next;

You don't need to check if start_at is nonzero in every step. It's sufficient to
check this once at the beginning.

Moreover, the below is also only executed if start_at is nonzero,
so you can put it under one if () along with the above loop, like this:

if (cur_ext) {
while (cur_ext->next && cur_ext->next->minimum < start)
cur_ext = cur_ext->next;

if (cur_ext-> maximum == (start - 1)) {
struct suspend_extent *next_ext;

cur_ext->maximum = end;
next_ext = cur_ext->next;
if (next_ext && cur_ext->maximum == next_ext->minimum - 1) {
cur_ext->maximum = next_ext->maximum;
cur_ext->next = next_ext->next;
kfree(next_ext);
chain->num_extents--;
}

etc.
> +
> + if (start_at && start_at->maximum == (minimum - 1)) {
> + start_at->maximum = maximum;
> +
> + /* Merge with the following one? */
> + if (start_at->next &&
> + start_at->maximum + 1 == start_at->next->minimum) {
> + struct extent *to_free = start_at->next;
> + start_at->maximum = start_at->next->maximum;
> + start_at->next = start_at->next->next;
> + chain->num_extents--;
> + kfree(to_free);
> + }
> +
> + chain->last_touched = start_at;
> + chain->size+= (maximum - minimum + 1);
> +
> + return 0;
> + }
> +
> + new_extent = suspend_get_extent();
> + if (!new_extent) {
> + printk("Error unable to append a new extent to the chain.\n");
> + return 2;

return -ENOMEM;

> + }
> +
> + chain->num_extents++;
> + chain->size+= (maximum - minimum + 1);
> + new_extent->minimum = minimum;
> + new_extent->maximum = maximum;
> + new_extent->next = NULL;

The last one won't be necessary if you use kzalloc() to allocate extents.

> +
> + chain->last_touched = new_extent;
> +
> + if (start_at) {
> + struct extent *next = start_at->next;
> + start_at->next = new_extent;
> + new_extent->next = next;

*next is unnecessary here. You can do it like this:

new_ext->next = cur_ext->next;
cur_ext->next = new_ext;

because new_ext->next is initially NULL.

> + } else {
> + if (chain->first)
> + new_extent->next = chain->first;
> + chain->first = new_extent;
> + }
> +
> + return 0;
> +}
> diff --git a/kernel/power/extent.h b/kernel/power/extent.h
> new file mode 100644
> index 0000000..062b4c1
> --- /dev/null
> +++ b/kernel/power/extent.h
> @@ -0,0 +1,50 @@
> +/*
> + * kernel/power/extent.h
> + *
> + * Copyright (C) 2004-2006 Nigel Cunningham <[email protected]>
> + * Copyright (C) 2006 Red Hat, inc.
> + *
> + * This file is released under the GPLv2.
> + *
> + * It contains declarations related to extents. Extents are
> + * suspend's method of storing some of the metadata for the image.
> + * See extent.c for more info.
> + *
> + */
> +
> +#ifndef EXTENT_H
> +#define EXTENT_H
> +
> +struct extent {
> + unsigned long minimum, maximum;

Well, I'd use shorter names, but whatever.

> + struct extent *next;
> +};
> +
> +struct extent_chain {
> + int size; /* size of the chain ie sum (max-min+1) */
> + int num_extents;
> + struct extent *first, *last_touched;
> +};
> +
> +/* Simplify iterating through all the values in an extent chain */
> +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
> +if ((extent_chain)->first) \
> + for ((extentpointer) = (extent_chain)->first, (value) = \
> + (extentpointer)->minimum; \
> + ((extentpointer) && ((extentpointer)->next || (value) <= \
> + (extentpointer)->maximum)); \
> + (((value) == (extentpointer)->maximum) ? \
> + ((extentpointer) = (extentpointer)->next, (value) = \
> + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
> + (value)++))

This macro doesn't look very nice and is used only once, so I think you
can drop it and just write the loop where it belongs.

> +
> +void suspend_put_extent_chain(struct extent_chain *chain);
> +int suspend_add_to_extent_chain(struct extent_chain *chain,
> + unsigned long minimum, unsigned long maximum);
> +
> +/* swap_entry_to_extent_val & extent_val_to_swap_entry:
> + * We are putting offset in the low bits so consecutive swap entries
> + * make consecutive extent values */
> +#define swap_entry_to_extent_val(swp_entry) (swp_entry.val)
> +#define extent_val_to_swap_entry(val) (swp_entry_t) { (val) }

These two macros are also used only once each. I'd just use the values
directly.

> +#endif

That's all. :-)

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-10-24 20:42:43

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> Switch from bitmaps to using extents to record what swap is allocated;
> they make more efficient use of memory, particularly where the allocated
> storage is small and the swap space is large.
>
> This is also part of the ground work for implementing support for
> supporting multiple swap devices.

In addition to the very useful comments from Rafael there's some observations
of my own:

- there's an awful lot of opencoded list manipulation, any chance you
could use list.h instead?
- what unit are the extent values in? The usage of unsigned long rings
warning bells for me, shouldn't this be something like pgoff_t or
sector_t depending on what you describe with it?

2006-10-24 20:59:33

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Tuesday, 24 October 2006 22:42, Christoph Hellwig wrote:
> On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
> >
> > This is also part of the ground work for implementing support for
> > supporting multiple swap devices.
>
> In addition to the very useful comments from Rafael there's some observations
> of my own:
>
> - there's an awful lot of opencoded list manipulation, any chance you
> could use list.h instead?
> - what unit are the extent values in? The usage of unsigned long rings
> warning bells for me, shouldn't this be something like pgoff_t or
> sector_t depending on what you describe with it?

These are swap offsets as returned by swp_offset().

2006-10-24 21:34:14

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
>
> As I said before, I like the overall idea, but I have a bunch of
> comments.

Okay, if Rafael likes it... lets take a look.

First... what is the _worst case_ overhead? AFAICT extents are very
good at the best case, but tend to suck for the worst case...?

> > +#include <linux/suspend.h>
> > +#include "extent.h"
> > +
> > +/* suspend_get_extent
> > + *
> > + * Returns a free extent. May fail, returning NULL instead.
> > + */

Your comments are nice, and quite close to linuxdoc... Can we make
them proper linuxdoc?

> > +/* suspend_put_extent_chain.
> > + *
> > + * Frees a whole chain of extents.
> > + */
> > +void suspend_put_extent_chain(struct extent_chain *chain)
>
> I'd call it suspend_free_all_extents().

This is actually important. As it does undocditional free(), it may
not be called "put".

> > +#ifndef EXTENT_H
> > +#define EXTENT_H
> > +
> > +struct extent {
> > + unsigned long minimum, maximum;
>
> Well, I'd use shorter names, but whatever.

Actually, minimum and
maximum look too similar. start/end are really better names.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-24 21:59:59

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Tuesday, 24 October 2006 23:34, Pavel Machek wrote:
> Hi!
>
> > > Switch from bitmaps to using extents to record what swap is allocated;
> > > they make more efficient use of memory, particularly where the allocated
> > > storage is small and the swap space is large.
> >
> > As I said before, I like the overall idea, but I have a bunch of
> > comments.
>
> Okay, if Rafael likes it... lets take a look.
>
> First... what is the _worst case_ overhead? AFAICT extents are very
> good at the best case, but tend to suck for the worst case...?

I think we'll usually be close to the best case, as we tend to allocate
large consecutive blocks of swap space.

2006-10-24 22:06:39

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 21:42 +0100, Christoph Hellwig wrote:
> On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
> >
> > This is also part of the ground work for implementing support for
> > supporting multiple swap devices.
>
> In addition to the very useful comments from Rafael there's some observations
> of my own:
>
> - there's an awful lot of opencoded list manipulation, any chance you
> could use list.h instead?

IIRC, I avoided list.h because I only wanted a singly linked list (it
never gets traversed backwards). List.h looks to me like all doubly
linked lists. Do you know if there are any other singly linked list
implementations I could piggy-back?

That said, since there's normally not that many extents, I could switch
quite easily and it wouldn't normally waste much memory.

> - what unit are the extent values in? The usage of unsigned long rings
> warning bells for me, shouldn't this be something like pgoff_t or
> sector_t depending on what you describe with it?

For this use, they're swap extents. For another use I have in suspend2,
they're sector_t >> block_device_size. That's why I picked ul; something
generic in what's supposed to be a generic implementation.

Thanks for the comments.

Nigel

2006-10-24 22:13:04

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 22:08 +0200, Rafael J. Wysocki wrote:
> On Monday, 23 October 2006 06:14, Nigel Cunningham wrote:
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
>
> As I said before, I like the overall idea, but I have a bunch of comments.

Thanks for them. Just a quick reply for the moment to say they're
appreciated and I will revise accordingly.

I should also mention that this isn't the only use of these functions in
Suspend2. There I also use extents to record the blocks to which the
image will be written. I hope to submit modifications to swsusp to do
that too in the near future.

> > +/* Simplify iterating through all the values in an extent chain */
> > +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
> > +if ((extent_chain)->first) \
> > + for ((extentpointer) = (extent_chain)->first, (value) = \
> > + (extentpointer)->minimum; \
> > + ((extentpointer) && ((extentpointer)->next || (value) <= \
> > + (extentpointer)->maximum)); \
> > + (((value) == (extentpointer)->maximum) ? \
> > + ((extentpointer) = (extentpointer)->next, (value) = \
> > + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
> > + (value)++))
>
> This macro doesn't look very nice and is used only once, so I think you
> can drop it and just write the loop where it belongs.

With the modifications I mentioned just above, this would also be used
for getting the blocks which match each swap extent. I can remove the
macro, but just want to make you aware that it does serve a purpose,
you're just not seeing it fully yet.

> > +
> > +void suspend_put_extent_chain(struct extent_chain *chain);
> > +int suspend_add_to_extent_chain(struct extent_chain *chain,
> > + unsigned long minimum, unsigned long maximum);
> > +
> > +/* swap_entry_to_extent_val & extent_val_to_swap_entry:
> > + * We are putting offset in the low bits so consecutive swap entries
> > + * make consecutive extent values */
> > +#define swap_entry_to_extent_val(swp_entry) (swp_entry.val)
> > +#define extent_val_to_swap_entry(val) (swp_entry_t) { (val) }
>
> These two macros are also used only once each. I'd just use the values
> directly.

Ok. Thanks. I think they're a leftover from 2.4 support :)

Nigel

2006-10-24 22:15:54

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 23:34 +0200, Pavel Machek wrote:
> Hi!
>
> > > Switch from bitmaps to using extents to record what swap is allocated;
> > > they make more efficient use of memory, particularly where the allocated
> > > storage is small and the swap space is large.
> >
> > As I said before, I like the overall idea, but I have a bunch of
> > comments.
>
> Okay, if Rafael likes it... lets take a look.
>
> First... what is the _worst case_ overhead? AFAICT extents are very
> good at the best case, but tend to suck for the worst case...?

That's right. In using this, we're relying on the fact that the swap
allocator tries to act sensibly. I've only seen worse case performance
when a user had two swap devices with the same priority (striped), but
that was a bug. :)

> > > +#include <linux/suspend.h>
> > > +#include "extent.h"
> > > +
> > > +/* suspend_get_extent
> > > + *
> > > + * Returns a free extent. May fail, returning NULL instead.
> > > + */
>
> Your comments are nice, and quite close to linuxdoc... Can we make
> them proper linuxdoc?

Ok.

> > > +/* suspend_put_extent_chain.
> > > + *
> > > + * Frees a whole chain of extents.
> > > + */
> > > +void suspend_put_extent_chain(struct extent_chain *chain)
> >
> > I'd call it suspend_free_all_extents().
>
> This is actually important. As it does undocditional free(), it may
> not be called "put".

Ok.

> > > +#ifndef EXTENT_H
> > > +#define EXTENT_H
> > > +
> > > +struct extent {
> > > + unsigned long minimum, maximum;
> >
> > Well, I'd use shorter names, but whatever.
>
> Actually, minimum and
> maximum look too similar. start/end are really better names.

Ok.

Thanks!

Nigel

2006-10-24 22:20:08

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > > > Switch from bitmaps to using extents to record what swap is allocated;
> > > > they make more efficient use of memory, particularly where the allocated
> > > > storage is small and the swap space is large.
> > >
> > > As I said before, I like the overall idea, but I have a bunch of
> > > comments.
> >
> > Okay, if Rafael likes it... lets take a look.
> >
> > First... what is the _worst case_ overhead? AFAICT extents are very
> > good at the best case, but tend to suck for the worst case...?
>
> That's right. In using this, we're relying on the fact that the swap
> allocator tries to act sensibly. I've only seen worse case performance
> when a user had two swap devices with the same priority (striped), but
> that was a bug. :)

Ok, but if the allocator somehow manages to stripe between two swap
devices, what happens?

IIRC original code was something like .1% overhead (8bytes per 4K, or
something?), bitmaps should be even better. If it is 1% in worst case,
that's probably okay, but it would be bad if it had overhead bigger
than 10times original code (worst case).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-24 22:30:28

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-10-25 at 00:19 +0200, Pavel Machek wrote:
> Hi!
>
> > > > > Switch from bitmaps to using extents to record what swap is allocated;
> > > > > they make more efficient use of memory, particularly where the allocated
> > > > > storage is small and the swap space is large.
> > > >
> > > > As I said before, I like the overall idea, but I have a bunch of
> > > > comments.
> > >
> > > Okay, if Rafael likes it... lets take a look.
> > >
> > > First... what is the _worst case_ overhead? AFAICT extents are very
> > > good at the best case, but tend to suck for the worst case...?
> >
> > That's right. In using this, we're relying on the fact that the swap
> > allocator tries to act sensibly. I've only seen worse case performance
> > when a user had two swap devices with the same priority (striped), but
> > that was a bug. :)
>
> Ok, but if the allocator somehow manages to stripe between two swap
> devices, what happens?
>
> IIRC original code was something like .1% overhead (8bytes per 4K, or
> something?), bitmaps should be even better. If it is 1% in worst case,
> that's probably okay, but it would be bad if it had overhead bigger
> than 10times original code (worst case).

With the code I have in Suspend2 (which is what I'm working towards),
the value includes the swap_type, so there's no overlap. Assuming the
swap allocator does it's normal thing and swap allocated is contiguous,
you'll probably end up with two extents: one containing the swap
allocated on the first device, and the other containing the swap
allocated on the second device. So (with the current version), striping
would use 6 * sizeof(unsigned long) instead of 3 * sizeof(unsigned
long).

Regards,

Nigel

2006-10-24 22:46:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi,

On Wednesday, 25 October 2006 00:13, Nigel Cunningham wrote:
> Hi.
>
> On Tue, 2006-10-24 at 22:08 +0200, Rafael J. Wysocki wrote:
> > On Monday, 23 October 2006 06:14, Nigel Cunningham wrote:
> > > Switch from bitmaps to using extents to record what swap is allocated;
> > > they make more efficient use of memory, particularly where the allocated
> > > storage is small and the swap space is large.
> >
> > As I said before, I like the overall idea, but I have a bunch of comments.
>
> Thanks for them. Just a quick reply for the moment to say they're
> appreciated and I will revise accordingly.
>
> I should also mention that this isn't the only use of these functions in
> Suspend2.

Could we please focus on things that are on the table _now_?. You are
submitting the patch aganist the current code and I can only review it
in this context. I can't say if I like your _future_ patches at this moment! :-)

> There I also use extents to record the blocks to which the
> image will be written. I hope to submit modifications to swsusp to do
> that too in the near future.
>
> > > +/* Simplify iterating through all the values in an extent chain */
> > > +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
> > > +if ((extent_chain)->first) \
> > > + for ((extentpointer) = (extent_chain)->first, (value) = \
> > > + (extentpointer)->minimum; \
> > > + ((extentpointer) && ((extentpointer)->next || (value) <= \
> > > + (extentpointer)->maximum)); \
> > > + (((value) == (extentpointer)->maximum) ? \
> > > + ((extentpointer) = (extentpointer)->next, (value) = \
> > > + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
> > > + (value)++))
> >
> > This macro doesn't look very nice and is used only once, so I think you
> > can drop it and just write the loop where it belongs.
>
> With the modifications I mentioned just above, this would also be used
> for getting the blocks which match each swap extent. I can remove the
> macro, but just want to make you aware that it does serve a purpose,
> you're just not seeing it fully yet.

Can we just assume there are no other patches and proceed under this
assumption?

Could you please remove the macro for now? You can introduce it with the
other patches when you submit them (if it's still needed at that time).

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-10-24 22:47:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Wed, Oct 25, 2006 at 08:06:36AM +1000, Nigel Cunningham wrote:
> IIRC, I avoided list.h because I only wanted a singly linked list (it
> never gets traversed backwards). List.h looks to me like all doubly
> linked lists. Do you know if there are any other singly linked list
> implementations I could piggy-back?
>
> That said, since there's normally not that many extents, I could switch
> quite easily and it wouldn't normally waste much memory.

If the overhead doesn't matter for you (and I doubt it does) I'd say just
use list.h. Reusing existing code that doesn't need to be debugged and
is idiomatically readable to everyone is very helpfull.

2006-10-24 23:03:36

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 23:47 +0100, Christoph Hellwig wrote:
> On Wed, Oct 25, 2006 at 08:06:36AM +1000, Nigel Cunningham wrote:
> > IIRC, I avoided list.h because I only wanted a singly linked list (it
> > never gets traversed backwards). List.h looks to me like all doubly
> > linked lists. Do you know if there are any other singly linked list
> > implementations I could piggy-back?
> >
> > That said, since there's normally not that many extents, I could switch
> > quite easily and it wouldn't normally waste much memory.
>
> If the overhead doesn't matter for you (and I doubt it does) I'd say just
> use list.h. Reusing existing code that doesn't need to be debugged and
> is idiomatically readable to everyone is very helpfull.

Ok.

Regards,

Nigel

2006-10-24 23:05:07

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-10-25 at 00:45 +0200, Rafael J. Wysocki wrote:
> Hi,
>
> On Wednesday, 25 October 2006 00:13, Nigel Cunningham wrote:
> > Hi.
> >
> > On Tue, 2006-10-24 at 22:08 +0200, Rafael J. Wysocki wrote:
> > > On Monday, 23 October 2006 06:14, Nigel Cunningham wrote:
> > > > Switch from bitmaps to using extents to record what swap is allocated;
> > > > they make more efficient use of memory, particularly where the allocated
> > > > storage is small and the swap space is large.
> > >
> > > As I said before, I like the overall idea, but I have a bunch of comments.
> >
> > Thanks for them. Just a quick reply for the moment to say they're
> > appreciated and I will revise accordingly.
> >
> > I should also mention that this isn't the only use of these functions in
> > Suspend2.
>
> Could we please focus on things that are on the table _now_?. You are
> submitting the patch aganist the current code and I can only review it
> in this context. I can't say if I like your _future_ patches at this moment! :-)

I understand that, but some things won't make sense or seem as useful if
I don't give you the extra information.

> > There I also use extents to record the blocks to which the
> > image will be written. I hope to submit modifications to swsusp to do
> > that too in the near future.
> >
> > > > +/* Simplify iterating through all the values in an extent chain */
> > > > +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
> > > > +if ((extent_chain)->first) \
> > > > + for ((extentpointer) = (extent_chain)->first, (value) = \
> > > > + (extentpointer)->minimum; \
> > > > + ((extentpointer) && ((extentpointer)->next || (value) <= \
> > > > + (extentpointer)->maximum)); \
> > > > + (((value) == (extentpointer)->maximum) ? \
> > > > + ((extentpointer) = (extentpointer)->next, (value) = \
> > > > + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
> > > > + (value)++))
> > >
> > > This macro doesn't look very nice and is used only once, so I think you
> > > can drop it and just write the loop where it belongs.
> >
> > With the modifications I mentioned just above, this would also be used
> > for getting the blocks which match each swap extent. I can remove the
> > macro, but just want to make you aware that it does serve a purpose,
> > you're just not seeing it fully yet.
>
> Can we just assume there are no other patches and proceed under this
> assumption?
>
> Could you please remove the macro for now? You can introduce it with the
> other patches when you submit them (if it's still needed at that time).

Ok.

Nigel

2006-10-25 08:11:50

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > > That's right. In using this, we're relying on the fact that the swap
> > > allocator tries to act sensibly. I've only seen worse case performance
> > > when a user had two swap devices with the same priority (striped), but
> > > that was a bug. :)
> >
> > Ok, but if the allocator somehow manages to stripe between two swap
> > devices, what happens?
> >
> > IIRC original code was something like .1% overhead (8bytes per 4K, or
> > something?), bitmaps should be even better. If it is 1% in worst case,
> > that's probably okay, but it would be bad if it had overhead bigger
> > than 10times original code (worst case).
>
> With the code I have in Suspend2 (which is what I'm working towards),
> the value includes the swap_type, so there's no overlap. Assuming the
> swap allocator does it's normal thing and swap allocated is contiguous,
> you'll probably end up with two extents: one containing the swap
> allocated on the first device, and the other containing the swap
> allocated on the second device. So (with the current version), striping
> would use 6 * sizeof(unsigned long) instead of 3 * sizeof(unsigned
> long).

And now, can you do same computation assuming the swap allocator goes
completely crazy, and free space is in 1-page chunks?

In particular, how much swap space can we have before we run out of
low memory? What is the overhead compared to bitmaps?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-25 08:28:33

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-10-25 at 10:11 +0200, Pavel Machek wrote:
> Hi!
>
> > > > That's right. In using this, we're relying on the fact that the swap
> > > > allocator tries to act sensibly. I've only seen worse case performance
> > > > when a user had two swap devices with the same priority (striped), but
> > > > that was a bug. :)
> > >
> > > Ok, but if the allocator somehow manages to stripe between two swap
> > > devices, what happens?
> > >
> > > IIRC original code was something like .1% overhead (8bytes per 4K, or
> > > something?), bitmaps should be even better. If it is 1% in worst case,
> > > that's probably okay, but it would be bad if it had overhead bigger
> > > than 10times original code (worst case).
> >
> > With the code I have in Suspend2 (which is what I'm working towards),
> > the value includes the swap_type, so there's no overlap. Assuming the
> > swap allocator does it's normal thing and swap allocated is contiguous,
> > you'll probably end up with two extents: one containing the swap
> > allocated on the first device, and the other containing the swap
> > allocated on the second device. So (with the current version), striping
> > would use 6 * sizeof(unsigned long) instead of 3 * sizeof(unsigned
> > long).
>
> And now, can you do same computation assuming the swap allocator goes
> completely crazy, and free space is in 1-page chunks?

The worst case is 3 * sizeof(unsigned long) *
number_of_swap_extents_allocated bytes.

> In particular, how much swap space can we have before we run out of
> low memory? What is the overhead compared to bitmaps?

That depends on a lot of things: the size of swap space, the amount of
low memory available to begin with and so on.

I would simply say that

1) the likelihood of the worst case is extremely low, particularly in an
age where swapspace is generally useless except for in suspending to
disk. If it was higher, your question would be much more poignant;

2) if the worst case happens, we should handle it properly, backing out,
freeing whatever we've allocated and returning control to the user, just
as we should if allocating a bitmap were to fail.

Regards,

Nigel

2006-10-25 08:43:25

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > > With the code I have in Suspend2 (which is what I'm working towards),
> > > the value includes the swap_type, so there's no overlap. Assuming the
> > > swap allocator does it's normal thing and swap allocated is contiguous,
> > > you'll probably end up with two extents: one containing the swap
> > > allocated on the first device, and the other containing the swap
> > > allocated on the second device. So (with the current version), striping
> > > would use 6 * sizeof(unsigned long) instead of 3 * sizeof(unsigned
> > > long).
> >
> > And now, can you do same computation assuming the swap allocator goes
> > completely crazy, and free space is in 1-page chunks?
>
> The worst case is 3 * sizeof(unsigned long) *
> number_of_swap_extents_allocated bytes.

Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
is you have one extent per page, on x86-64 that's 24MB. +kmalloc
overhead, I assume?

And you do linear walks over those extents, leading to O(n^2)
algorithm, no? That has bitten us before...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-25 09:01:07

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-10-25 at 10:42 +0200, Pavel Machek wrote:
> Hi!
>
> > > > With the code I have in Suspend2 (which is what I'm working towards),
> > > > the value includes the swap_type, so there's no overlap. Assuming the
> > > > swap allocator does it's normal thing and swap allocated is contiguous,
> > > > you'll probably end up with two extents: one containing the swap
> > > > allocated on the first device, and the other containing the swap
> > > > allocated on the second device. So (with the current version), striping
> > > > would use 6 * sizeof(unsigned long) instead of 3 * sizeof(unsigned
> > > > long).
> > >
> > > And now, can you do same computation assuming the swap allocator goes
> > > completely crazy, and free space is in 1-page chunks?
> >
> > The worst case is 3 * sizeof(unsigned long) *
> > number_of_swap_extents_allocated bytes.
>
> Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
> is you have one extent per page, on x86-64 that's 24MB. +kmalloc
> overhead, I assume?

Sounds right.

> And you do linear walks over those extents, leading to O(n^2)
> algorithm, no? That has bitten us before...

We start from where we last added an extent on the chain by default.

You're not going to respond to the other bit of my reply? I was
beginning to think you were being more reasonable this time. Oh well.

Regards,

Nigel

2006-10-25 09:10:40

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > > > And now, can you do same computation assuming the swap allocator goes
> > > > completely crazy, and free space is in 1-page chunks?
> > >
> > > The worst case is 3 * sizeof(unsigned long) *
> > > number_of_swap_extents_allocated bytes.
> >
> > Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
> > is you have one extent per page, on x86-64 that's 24MB. +kmalloc
> > overhead, I assume?
>
> Sounds right.

Ok, 24-50MB per 4GB of swap space is not _that_ bad...

> > And you do linear walks over those extents, leading to O(n^2)
> > algorithm, no? That has bitten us before...
>
> We start from where we last added an extent on the chain by default.

...but linear search through 24MB _is_ going to hurt.

> You're not going to respond to the other bit of my reply? I was
> beginning to think you were being more reasonable this time. Oh well.

Rafael likes your code, and that's a big plus, but do you have to
insult me?!
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-25 09:15:51

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Wed, Oct 25 2006, Nigel Cunningham wrote:
> IIRC, I avoided list.h because I only wanted a singly linked list (it
> never gets traversed backwards). List.h looks to me like all doubly
> linked lists. Do you know if there are any other singly linked list
> implementations I could piggy-back?

Look closer in list.h, more specifically at the hlist_ entries.

--
Jens Axboe

2006-10-25 10:05:52

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-10-25 at 11:10 +0200, Pavel Machek wrote:
> Hi!
>
> > > > > And now, can you do same computation assuming the swap allocator goes
> > > > > completely crazy, and free space is in 1-page chunks?
> > > >
> > > > The worst case is 3 * sizeof(unsigned long) *
> > > > number_of_swap_extents_allocated bytes.
> > >
> > > Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
> > > is you have one extent per page, on x86-64 that's 24MB. +kmalloc
> > > overhead, I assume?
> >
> > Sounds right.
>
> Ok, 24-50MB per 4GB of swap space is not _that_ bad...

Other way round: 12MB for x86, 24 for x86_64 is the worst case.
Actually, come to think of it, that would be for 8GB of swap space. The
worst case would require every page of swap to be alternately free and
allocated, so for 4GB you'd only have 2GB of swap allocated = 1/2 the
number of extents and half the memory requirements.

> > > And you do linear walks over those extents, leading to O(n^2)
> > > algorithm, no? That has bitten us before...
> >
> > We start from where we last added an extent on the chain by default.
>
> ...but linear search through 24MB _is_ going to hurt.

If we did one, yes it would. But this will be O(1) since we start at the
last value, and get_swap_page goes through the space sequentially.

> > You're not going to respond to the other bit of my reply? I was
> > beginning to think you were being more reasonable this time. Oh well.
>
> Rafael likes your code, and that's a big plus, but do you have to
> insult me?!

Pavel, I never seek to insult you and I'm sorry that you felt insulted
by my comment. In this case, I was expressing frustration at the fact
that you seemed to be (in my opinion anyway) being unreasonable in
completely ignoring and deleting my points about the likelihood of this
worse case scenario, and instead focusing on calculating the
requirements for such a worse case scenario, for a machine with at least
4 times the swapspace that most machines have today. I fully agree with
making things robust and considering worse case scenarios, and making
sure that patches are good and useful, but you seem to treat anything
that comes from me as if it's infected with the plague.

How about if we just call it quits and try to be nice to each other?

Regards,

Nigel

2006-10-25 10:07:56

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Wed, 2006-10-25 at 11:17 +0200, Jens Axboe wrote:
> On Wed, Oct 25 2006, Nigel Cunningham wrote:
> > IIRC, I avoided list.h because I only wanted a singly linked list (it
> > never gets traversed backwards). List.h looks to me like all doubly
> > linked lists. Do you know if there are any other singly linked list
> > implementations I could piggy-back?
>
> Look closer in list.h, more specifically at the hlist_ entries.

Thanks for the pointer. I did look at it, but unless I'm misreading,
it's still doubly linked. No matter. I'll use the doubly linked list.

Regards,

Nigel

2006-10-25 10:19:43

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Wed, Oct 25 2006, Nigel Cunningham wrote:
> On Wed, 2006-10-25 at 11:17 +0200, Jens Axboe wrote:
> > On Wed, Oct 25 2006, Nigel Cunningham wrote:
> > > IIRC, I avoided list.h because I only wanted a singly linked list (it
> > > never gets traversed backwards). List.h looks to me like all doubly
> > > linked lists. Do you know if there are any other singly linked list
> > > implementations I could piggy-back?
> >
> > Look closer in list.h, more specifically at the hlist_ entries.
>
> Thanks for the pointer. I did look at it, but unless I'm misreading,
> it's still doubly linked. No matter. I'll use the doubly linked list.

It is, the head just takes up a pointer less. As hch mentioned, I doubt
it matters in this case at all.

--
Jens Axboe

2006-10-25 12:47:44

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

On Wednesday, 25 October 2006 12:05, Nigel Cunningham wrote:
> Hi.
>
> On Wed, 2006-10-25 at 11:10 +0200, Pavel Machek wrote:
> > Hi!
> >
> > > > > > And now, can you do same computation assuming the swap allocator goes
> > > > > > completely crazy, and free space is in 1-page chunks?
> > > > >
> > > > > The worst case is 3 * sizeof(unsigned long) *
> > > > > number_of_swap_extents_allocated bytes.
> > > >
> > > > Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
> > > > is you have one extent per page, on x86-64 that's 24MB. +kmalloc
> > > > overhead, I assume?
> > >
> > > Sounds right.
> >
> > Ok, 24-50MB per 4GB of swap space is not _that_ bad...
>
> Other way round: 12MB for x86, 24 for x86_64 is the worst case.
> Actually, come to think of it, that would be for 8GB of swap space. The
> worst case would require every page of swap to be alternately free and
> allocated, so for 4GB you'd only have 2GB of swap allocated = 1/2 the
> number of extents and half the memory requirements.
>
> > > > And you do linear walks over those extents, leading to O(n^2)
> > > > algorithm, no? That has bitten us before...
> > >
> > > We start from where we last added an extent on the chain by default.
> >
> > ...but linear search through 24MB _is_ going to hurt.
>
> If we did one, yes it would. But this will be O(1) since we start at the
> last value, and get_swap_page goes through the space sequentially.

No, it doesn't, AFAICT, but I don't think it matters.

The question is whether there is a chance we'll get anywhere close to the
worst case and I think the answer is 'no' due to the way in which the swap
allocation works.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-10-27 11:38:30

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi!

> > > > > > And now, can you do same computation assuming the swap allocator goes
> > > > > > completely crazy, and free space is in 1-page chunks?
> > > > >
> > > > > The worst case is 3 * sizeof(unsigned long) *
> > > > > number_of_swap_extents_allocated bytes.
> > > >
> > > > Okay, so if we got 4GB of swap space, thats 1MB swap pages, worst case
> > > > is you have one extent per page, on x86-64 that's 24MB. +kmalloc
> > > > overhead, I assume?
> > >
> > > Sounds right.
> >
> > Ok, 24-50MB per 4GB of swap space is not _that_ bad...
>
> Other way round: 12MB for x86, 24 for x86_64 is the worst case.
> Actually, come to think of it, that would be for 8GB of swap space. The
> worst case would require every page of swap to be alternately free and
> allocated, so for 4GB you'd only have 2GB of swap allocated = 1/2 the
> number of extents and half the memory requirements.

Ok, I was trying to do the ballpark estimate.

> > > You're not going to respond to the other bit of my reply? I was
> > > beginning to think you were being more reasonable this time. Oh well.
> >
> > Rafael likes your code, and that's a big plus, but do you have to
> > insult me?!
>
> Pavel, I never seek to insult you and I'm sorry that you felt insulted
> by my comment. In this case, I was expressing frustration at the fact
> that you seemed to be (in my opinion anyway) being unreasonable in
> completely ignoring and deleting my points about the likelihood of this
> worse case scenario, and instead focusing on calculating the

Well, deleting the points meant I mostly agree with them. But knowing
how bad worst case is is still interesting (I do not think future swap
allocator may not suddenly start making lots of small holes, or
something). Anyway, that patch is probably okay...

> How about if we just call it quits and try to be nice to each other?

Quits.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-10-31 11:38:14

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 21:42 +0100, Christoph Hellwig wrote:
> On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
> >
> > This is also part of the ground work for implementing support for
> > supporting multiple swap devices.
>
> In addition to the very useful comments from Rafael there's some observations
> of my own:
>
> - there's an awful lot of opencoded list manipulation, any chance you
> could use list.h instead?

Further to this, I gave using list.h a go. Unfortunately it doesn't look
to me like it is a good idea: in adding a range, I'm comparing the new
range to the maximum of one extent and the minimum of the next, so
finding the minimum of the next extent becomes a lot uglier than it
currently is. Currently it's just ->next->minimum, but with list.h, I'd
need container_of(current->list.next)->minimum. Or am I missing
something?

Regards,

Nigel

2006-10-31 11:40:01

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Tue, 2006-10-24 at 22:08 +0200, Rafael J. Wysocki wrote:
> On Monday, 23 October 2006 06:14, Nigel Cunningham wrote:
> > Switch from bitmaps to using extents to record what swap is allocated;
> > they make more efficient use of memory, particularly where the allocated
> > storage is small and the swap space is large.
>
> As I said before, I like the overall idea, but I have a bunch of comments.
>
> > This is also part of the ground work for implementing support for
> > supporting multiple swap devices.
> >
> > Signed-off-by: Nigel Cunningham <[email protected]>
> >
> > diff --git a/kernel/power/Makefile b/kernel/power/Makefile
> > index 38725f5..d772521 100644
> > --- a/kernel/power/Makefile
> > +++ b/kernel/power/Makefile
> > @@ -5,6 +5,6 @@ endif
> >
> > obj-y := main.o process.o console.o
> > obj-$(CONFIG_PM_LEGACY) += pm.o
> > -obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o
> > +obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o extent.o
> >
> > obj-$(CONFIG_MAGIC_SYSRQ) += poweroff.o
> > diff --git a/kernel/power/extent.c b/kernel/power/extent.c
> > new file mode 100644
> > index 0000000..b769956
> > --- /dev/null
> > +++ b/kernel/power/extent.c
> > @@ -0,0 +1,119 @@
> > +/*
> > + * kernel/power/extent.c
> > + *
> > + * Copyright (C) 2003-2006 Nigel Cunningham <[email protected]>
> > + * Copyright (C) 2006 Red Hat, inc.
> > + *
> > + * Distributed under GPLv2.
> > + *
> > + * These functions encapsulate the manipulation of storage metadata.
> > + */
> > +
> > +#include <linux/suspend.h>
> > +#include "extent.h"
> > +
> > +/* suspend_get_extent
> > + *
> > + * Returns a free extent. May fail, returning NULL instead.
> > + */
> > +static struct extent *suspend_get_extent(void)
> > +{
> > + struct extent *result;
>
> I'd call the type 'struct suspend_extent', for clarity.

Done, thanks for the suggestion.

> > +
> > + if (!(result = kmalloc(sizeof(struct extent), GFP_ATOMIC)))
> > + return NULL;
>
> if you use kzalloc() here, the initializations below won't be necessary and
> the function will reduce to a one-liner (in which case it can be inlined).

Done.

> > +
> > + result->minimum = result->maximum = 0;
> > + result->next = NULL;
> > +
> > + return result;
> > +}
> > +
> > +/* suspend_put_extent_chain.
> > + *
> > + * Frees a whole chain of extents.
> > + */
> > +void suspend_put_extent_chain(struct extent_chain *chain)
>
> I'd call it suspend_free_all_extents().

Renamed as suspend_free_extent_chain (I'm thinking of future use of
multiple chains for multiple swap devices).

> > +{
> > + struct extent *this;
> > +
> > + this = chain->first;
> > +
> > + while(this) {
> > + struct extent *next = this->next;
> > + kfree(this);
> > + chain->num_extents--;
> > + this = next;
> > + }
> > +
> > + BUG_ON(chain->num_extents);
>
> I'd drop this BUG_ON(). Once the code has been debugged, it's no longer
> of any use.

Done.

> > + chain->first = chain->last_touched = NULL;
>
> This is against the coding style AFAICT. Please do
>
> chain->first = NULL;
> chain->last_touched = NULL;

Done. Didn't know that, thanks.

> > + chain->size = 0;
> > +}
> > +
> > +/*
> > + * suspend_add_to_extent_chain
> > + *
> > + * Add an extent to an existing chain.
> > + */
> > +int suspend_add_to_extent_chain(struct extent_chain *chain,
> > + unsigned long minimum, unsigned long maximum)
>
> I'd use shorter names for these arguments, like 'start', 'end'.

min & max now used.

> > +{
> > + struct extent *new_extent = NULL, *start_at;
>
> The names are not the best here too, IMHO. I'd use something like
> *new_ext, *cur_ext.
>
> > +
> > + /* Find the right place in the chain */
> > + start_at = (chain->last_touched &&
> > + (chain->last_touched->minimum < minimum)) ?
> > + chain->last_touched : NULL;
> > +
> > + if (!start_at && chain->first && chain->first->minimum < minimum)
> > + start_at = chain->first;
>
> The above two statements can be simplified, like
>
> cur_ext = NULL;
> if (chain->last_touched && chain->last_touched->minimum < start)
> cur_ext = chain->last_touched;
> else if (chain->first && chain->first->minimum < start)
> cur_ext = chain->first;

Done, ta.

> > +
> > + while (start_at && start_at->next && start_at->next->minimum < minimum)
> > + start_at = start_at->next;
>
> You don't need to check if start_at is nonzero in every step. It's sufficient to
> check this once at the beginning.
>
> Moreover, the below is also only executed if start_at is nonzero,
> so you can put it under one if () along with the above loop, like this:

Done, ta.

> if (cur_ext) {
> while (cur_ext->next && cur_ext->next->minimum < start)
> cur_ext = cur_ext->next;
>
> if (cur_ext-> maximum == (start - 1)) {
> struct suspend_extent *next_ext;
>
> cur_ext->maximum = end;
> next_ext = cur_ext->next;
> if (next_ext && cur_ext->maximum == next_ext->minimum - 1) {
> cur_ext->maximum = next_ext->maximum;
> cur_ext->next = next_ext->next;
> kfree(next_ext);
> chain->num_extents--;
> }
>
> etc.
> > +
> > + if (start_at && start_at->maximum == (minimum - 1)) {
> > + start_at->maximum = maximum;
> > +
> > + /* Merge with the following one? */
> > + if (start_at->next &&
> > + start_at->maximum + 1 == start_at->next->minimum) {
> > + struct extent *to_free = start_at->next;
> > + start_at->maximum = start_at->next->maximum;
> > + start_at->next = start_at->next->next;
> > + chain->num_extents--;
> > + kfree(to_free);
> > + }
> > +
> > + chain->last_touched = start_at;
> > + chain->size+= (maximum - minimum + 1);
> > +
> > + return 0;
> > + }
> > +
> > + new_extent = suspend_get_extent();
> > + if (!new_extent) {
> > + printk("Error unable to append a new extent to the chain.\n");
> > + return 2;
>
> return -ENOMEM;

Mmm. Ta. Forgotten defunct debugging.

> > + }
> > +
> > + chain->num_extents++;
> > + chain->size+= (maximum - minimum + 1);
> > + new_extent->minimum = minimum;
> > + new_extent->maximum = maximum;
> > + new_extent->next = NULL;
>
> The last one won't be necessary if you use kzalloc() to allocate extents.

Ack.

> > +
> > + chain->last_touched = new_extent;
> > +
> > + if (start_at) {
> > + struct extent *next = start_at->next;
> > + start_at->next = new_extent;
> > + new_extent->next = next;
>
> *next is unnecessary here. You can do it like this:
>
> new_ext->next = cur_ext->next;
> cur_ext->next = new_ext;
>
> because new_ext->next is initially NULL.

k.

> > + } else {
> > + if (chain->first)
> > + new_extent->next = chain->first;
> > + chain->first = new_extent;
> > + }
> > +
> > + return 0;
> > +}
> > diff --git a/kernel/power/extent.h b/kernel/power/extent.h
> > new file mode 100644
> > index 0000000..062b4c1
> > --- /dev/null
> > +++ b/kernel/power/extent.h
> > @@ -0,0 +1,50 @@
> > +/*
> > + * kernel/power/extent.h
> > + *
> > + * Copyright (C) 2004-2006 Nigel Cunningham <[email protected]>
> > + * Copyright (C) 2006 Red Hat, inc.
> > + *
> > + * This file is released under the GPLv2.
> > + *
> > + * It contains declarations related to extents. Extents are
> > + * suspend's method of storing some of the metadata for the image.
> > + * See extent.c for more info.
> > + *
> > + */
> > +
> > +#ifndef EXTENT_H
> > +#define EXTENT_H
> > +
> > +struct extent {
> > + unsigned long minimum, maximum;
>
> Well, I'd use shorter names, but whatever.
>
> > + struct extent *next;
> > +};
> > +
> > +struct extent_chain {
> > + int size; /* size of the chain ie sum (max-min+1) */
> > + int num_extents;
> > + struct extent *first, *last_touched;
> > +};
> > +
> > +/* Simplify iterating through all the values in an extent chain */
> > +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
> > +if ((extent_chain)->first) \
> > + for ((extentpointer) = (extent_chain)->first, (value) = \
> > + (extentpointer)->minimum; \
> > + ((extentpointer) && ((extentpointer)->next || (value) <= \
> > + (extentpointer)->maximum)); \
> > + (((value) == (extentpointer)->maximum) ? \
> > + ((extentpointer) = (extentpointer)->next, (value) = \
> > + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
> > + (value)++))
>
> This macro doesn't look very nice and is used only once, so I think you
> can drop it and just write the loop where it belongs.

Sort of done - expanded to two loops. 'tis more readable that way
anyway.

> > +
> > +void suspend_put_extent_chain(struct extent_chain *chain);
> > +int suspend_add_to_extent_chain(struct extent_chain *chain,
> > + unsigned long minimum, unsigned long maximum);
> > +
> > +/* swap_entry_to_extent_val & extent_val_to_swap_entry:
> > + * We are putting offset in the low bits so consecutive swap entries
> > + * make consecutive extent values */
> > +#define swap_entry_to_extent_val(swp_entry) (swp_entry.val)
> > +#define extent_val_to_swap_entry(val) (swp_entry_t) { (val) }
>
> These two macros are also used only once each. I'd just use the values
> directly.

ack

> > +#endif
>
> That's all. :-)

Thanks.

> You never change things by fighting the existing reality.
> R. Buckminster Fuller

Hmm. Maybe I shouldn't bother with this then :)

Nigel

2006-11-01 12:36:07

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

> Hi.
>
> On Tue, 2006-10-24 at 21:42 +0100, Christoph Hellwig wrote:
> > On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> > > Switch from bitmaps to using extents to record what swap is allocated;
> > > they make more efficient use of memory, particularly where the allocated
> > > storage is small and the swap space is large.
> > >
> > > This is also part of the ground work for implementing support for
> > > supporting multiple swap devices.
> >
> > In addition to the very useful comments from Rafael there's some observations
> > of my own:
> >
> > - there's an awful lot of opencoded list manipulation, any chance you
> > could use list.h instead?
>
> Further to this, I gave using list.h a go. Unfortunately it doesn't look
> to me like it is a good idea: in adding a range, I'm comparing the new
> range to the maximum of one extent and the minimum of the next, so
> finding the minimum of the next extent becomes a lot uglier than it
> currently is. Currently it's just ->next->minimum, but with list.h, I'd
> need container_of(current->list.next)->minimum. Or am I missing
> something?

That does not look that scary... just do it.
Pavel

--
Thanks, Sharp!

2006-11-01 21:19:56

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [PATCH] Use extents for recording what swap is allocated.

Hi.

On Wed, 2006-11-01 at 13:36 +0100, Pavel Machek wrote:
> > Hi.
> >
> > On Tue, 2006-10-24 at 21:42 +0100, Christoph Hellwig wrote:
> > > On Mon, Oct 23, 2006 at 02:14:17PM +1000, Nigel Cunningham wrote:
> > > > Switch from bitmaps to using extents to record what swap is allocated;
> > > > they make more efficient use of memory, particularly where the allocated
> > > > storage is small and the swap space is large.
> > > >
> > > > This is also part of the ground work for implementing support for
> > > > supporting multiple swap devices.
> > >
> > > In addition to the very useful comments from Rafael there's some observations
> > > of my own:
> > >
> > > - there's an awful lot of opencoded list manipulation, any chance you
> > > could use list.h instead?
> >
> > Further to this, I gave using list.h a go. Unfortunately it doesn't look
> > to me like it is a good idea: in adding a range, I'm comparing the new
> > range to the maximum of one extent and the minimum of the next, so
> > finding the minimum of the next extent becomes a lot uglier than it
> > currently is. Currently it's just ->next->minimum, but with list.h, I'd
> > need container_of(current->list.next)->minimum. Or am I missing
> > something?
>
> That does not look that scary... just do it.

It currently makes more sense to me to stick with what I already have
because it's simpler and more readable.

Regards,

Nigel