2010-01-05 03:20:40

by Nitin Gupta

[permalink] [raw]
Subject: [PATCH 1/2] [mmotm] Add swap slot free callback to block_device_operations

This callback is required when RAM based devices are
used as swap disks. One such device is ramzswap[1] which
is used as compressed in-memory swap disk. For such
devices, we need a callback as soon as a swap slot is no
longer used to allow freeing memory allocated for this
slot. Without this callback, stale data can quickly
accumulate in memory defeating the whole purpose of
such devices.

Another user of this callback will be "preswap" as
introduced by "Transcendent Memory" patches:
http://lwn.net/Articles/367286/
(I intend to integrade preswap with ramzswap).

[1] ramzswap: http://code.google.com/p/compcache/

Signed-off-by: Nitin Gupta <[email protected]>
---
include/linux/blkdev.h | 1 +
mm/swapfile.c | 3 +++
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 784a919..14b95a3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1295,6 +1295,7 @@ struct block_device_operations {
unsigned long long);
int (*revalidate_disk) (struct gendisk *);
int (*getgeo)(struct block_device *, struct hd_geometry *);
+ void (*swap_slot_free_notify) (struct block_device *, unsigned long);
struct module *owner;
};

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6c0585b..0c373bc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -574,6 +574,7 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,

/* free if no reference */
if (!usage) {
+ struct gendisk *disk = p->bdev->bd_disk;
if (offset < p->lowest_bit)
p->lowest_bit = offset;
if (offset > p->highest_bit)
@@ -583,6 +584,8 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,
swap_list.next = p->type;
nr_swap_pages++;
p->inuse_pages--;
+ if (disk->fops->swap_slot_free_notify)
+ disk->fops->swap_slot_free_notify(p->bdev, offset);
}

return usage;
--
1.6.2.5


2010-01-05 03:20:47

by Nitin Gupta

[permalink] [raw]
Subject: [PATCH 2/2] [mmotm] ramzswap: use slot free callback to eliminate stale data

ramzswap driver creates RAM backed block devices which are
used as swap disks. Pages swapped to these disks are compressed
and stored in memory itself. However, when a swap page becomes
stale i.e. it is no longer referenced by any process (say, when
owning process exits), the driver does not get any notification
about this. So, it has to keep such pages in memory until kernel
swaps to the same swap slot again thereby overwriting previous
(stale) page.

Often, a large number of such stale pages accumulate which defeats
the whole purpose of in-memory compressed swapping and it begins
to have a negative impact on system performance.

To overcome this problem, we now register a callback function
which is called as soon as a swap slot is freed which allows
us to free corresponding memory, eliminating any stale data
in (compressed) memory.

Signed-off-by: Nitin Gupta <[email protected]>
---
drivers/staging/ramzswap/ramzswap_drv.c | 15 ++++++++++++++-
drivers/staging/ramzswap/ramzswap_drv.h | 1 +
drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
3 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/ramzswap/ramzswap_drv.c b/drivers/staging/ramzswap/ramzswap_drv.c
index b839f05..bada4ae 100644
--- a/drivers/staging/ramzswap/ramzswap_drv.c
+++ b/drivers/staging/ramzswap/ramzswap_drv.c
@@ -276,6 +276,7 @@ void ramzswap_ioctl_get_stats(struct ramzswap *rzs,
s->failed_reads = rs->failed_reads;
s->failed_writes = rs->failed_writes;
s->invalid_io = rs->invalid_io;
+ s->notify_free = rs->notify_free;
s->pages_zero = rs->pages_zero;

s->good_compress_pct = good_compress_perc;
@@ -1310,9 +1311,21 @@ out:
return ret;
}

+void ramzswap_slot_free_notify(struct block_device *bdev, unsigned long index)
+{
+ struct ramzswap *rzs;
+
+ rzs = bdev->bd_disk->private_data;
+ ramzswap_free_page(rzs, index);
+ stat_inc(rzs->stats.notify_free);
+
+ return;
+}
+
static struct block_device_operations ramzswap_devops = {
.ioctl = ramzswap_ioctl,
- .owner = THIS_MODULE,
+ .swap_slot_free_notify = ramzswap_slot_free_notify,
+ .owner = THIS_MODULE
};

static void create_device(struct ramzswap *rzs, int device_id)
diff --git a/drivers/staging/ramzswap/ramzswap_drv.h b/drivers/staging/ramzswap/ramzswap_drv.h
index a6ea240..adc841a 100644
--- a/drivers/staging/ramzswap/ramzswap_drv.h
+++ b/drivers/staging/ramzswap/ramzswap_drv.h
@@ -124,6 +124,7 @@ struct ramzswap_stats {
u64 failed_reads; /* can happen when memory is too low */
u64 failed_writes; /* should NEVER! happen */
u64 invalid_io; /* non-swap I/O requests */
+ u64 notify_free; /* no. of swap slot free notifications */
u32 pages_zero; /* no. of zero filled pages */
u32 pages_stored; /* no. of pages currently stored */
u32 good_compress; /* % of pages with compression ratio<=50% */
diff --git a/drivers/staging/ramzswap/ramzswap_ioctl.h b/drivers/staging/ramzswap/ramzswap_ioctl.h
index c713a09..ec50416 100644
--- a/drivers/staging/ramzswap/ramzswap_ioctl.h
+++ b/drivers/staging/ramzswap/ramzswap_ioctl.h
@@ -27,6 +27,7 @@ struct ramzswap_ioctl_stats {
u64 failed_reads; /* can happen when memory is too low */
u64 failed_writes; /* should NEVER! happen */
u64 invalid_io; /* non-swap I/O requests */
+ u64 notify_free; /* no. of swap slot free notifications */
u32 pages_zero; /* no. of zero filled pages */
u32 good_compress_pct; /* no. of pages with compression ratio<=50% */
u32 pages_expand_pct; /* no. of incompressible pages */
--
1.6.2.5

2010-01-05 10:58:13

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH 1/2] [mmotm] Add swap slot free callback to block_device_operations

On Tue, 5 Jan 2010, Nitin Gupta wrote:

> This callback is required when RAM based devices are
> used as swap disks. One such device is ramzswap[1] which
> is used as compressed in-memory swap disk. For such
> devices, we need a callback as soon as a swap slot is no
> longer used to allow freeing memory allocated for this
> slot. Without this callback, stale data can quickly
> accumulate in memory defeating the whole purpose of
> such devices.
>
> Another user of this callback will be "preswap" as
> introduced by "Transcendent Memory" patches:
> http://lwn.net/Articles/367286/
> (I intend to integrade preswap with ramzswap).
>
> [1] ramzswap: http://code.google.com/p/compcache/
>
> Signed-off-by: Nitin Gupta <[email protected]>

Yes, thanks for your patience in putting this together:
I much prefer this version to all the other ways considered.

(Though you should later add a comment, pointing out that this
callback is with swap_lock and often page table lock also held.)

We know that's unsatisfactory, but it is the nature of swap:
cutting some corners to get the job done without allocating
more memory. Having tried myself to do proper discard at
swap_free time, I know that is hard: and I believe you've
made the right decision to do it in this way for ramzswap.

Acked-by: Hugh Dickins <[email protected]>

> ---
> include/linux/blkdev.h | 1 +
> mm/swapfile.c | 3 +++
> 2 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 784a919..14b95a3 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1295,6 +1295,7 @@ struct block_device_operations {
> unsigned long long);
> int (*revalidate_disk) (struct gendisk *);
> int (*getgeo)(struct block_device *, struct hd_geometry *);
> + void (*swap_slot_free_notify) (struct block_device *, unsigned long);
> struct module *owner;
> };
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6c0585b..0c373bc 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -574,6 +574,7 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,
>
> /* free if no reference */
> if (!usage) {
> + struct gendisk *disk = p->bdev->bd_disk;
> if (offset < p->lowest_bit)
> p->lowest_bit = offset;
> if (offset > p->highest_bit)
> @@ -583,6 +584,8 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,
> swap_list.next = p->type;
> nr_swap_pages++;
> p->inuse_pages--;
> + if (disk->fops->swap_slot_free_notify)
> + disk->fops->swap_slot_free_notify(p->bdev, offset);
> }
>
> return usage;
> --
> 1.6.2.5

2010-01-05 10:59:09

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH 2/2] [mmotm] ramzswap: use slot free callback to eliminate stale data

On Tue, 5 Jan 2010, Nitin Gupta wrote:

> ramzswap driver creates RAM backed block devices which are
> used as swap disks. Pages swapped to these disks are compressed
> and stored in memory itself. However, when a swap page becomes
> stale i.e. it is no longer referenced by any process (say, when
> owning process exits), the driver does not get any notification
> about this. So, it has to keep such pages in memory until kernel
> swaps to the same swap slot again thereby overwriting previous
> (stale) page.
>
> Often, a large number of such stale pages accumulate which defeats
> the whole purpose of in-memory compressed swapping and it begins
> to have a negative impact on system performance.
>
> To overcome this problem, we now register a callback function
> which is called as soon as a swap slot is freed which allows
> us to free corresponding memory, eliminating any stale data
> in (compressed) memory.
>
> Signed-off-by: Nitin Gupta <[email protected]>

Acked-by: Hugh Dickins <[email protected]>

> ---
> drivers/staging/ramzswap/ramzswap_drv.c | 15 ++++++++++++++-
> drivers/staging/ramzswap/ramzswap_drv.h | 1 +
> drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
> 3 files changed, 16 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/staging/ramzswap/ramzswap_drv.c b/drivers/staging/ramzswap/ramzswap_drv.c
> index b839f05..bada4ae 100644
> --- a/drivers/staging/ramzswap/ramzswap_drv.c
> +++ b/drivers/staging/ramzswap/ramzswap_drv.c
> @@ -276,6 +276,7 @@ void ramzswap_ioctl_get_stats(struct ramzswap *rzs,
> s->failed_reads = rs->failed_reads;
> s->failed_writes = rs->failed_writes;
> s->invalid_io = rs->invalid_io;
> + s->notify_free = rs->notify_free;
> s->pages_zero = rs->pages_zero;
>
> s->good_compress_pct = good_compress_perc;
> @@ -1310,9 +1311,21 @@ out:
> return ret;
> }
>
> +void ramzswap_slot_free_notify(struct block_device *bdev, unsigned long index)
> +{
> + struct ramzswap *rzs;
> +
> + rzs = bdev->bd_disk->private_data;
> + ramzswap_free_page(rzs, index);
> + stat_inc(rzs->stats.notify_free);
> +
> + return;
> +}
> +
> static struct block_device_operations ramzswap_devops = {
> .ioctl = ramzswap_ioctl,
> - .owner = THIS_MODULE,
> + .swap_slot_free_notify = ramzswap_slot_free_notify,
> + .owner = THIS_MODULE
> };
>
> static void create_device(struct ramzswap *rzs, int device_id)
> diff --git a/drivers/staging/ramzswap/ramzswap_drv.h b/drivers/staging/ramzswap/ramzswap_drv.h
> index a6ea240..adc841a 100644
> --- a/drivers/staging/ramzswap/ramzswap_drv.h
> +++ b/drivers/staging/ramzswap/ramzswap_drv.h
> @@ -124,6 +124,7 @@ struct ramzswap_stats {
> u64 failed_reads; /* can happen when memory is too low */
> u64 failed_writes; /* should NEVER! happen */
> u64 invalid_io; /* non-swap I/O requests */
> + u64 notify_free; /* no. of swap slot free notifications */
> u32 pages_zero; /* no. of zero filled pages */
> u32 pages_stored; /* no. of pages currently stored */
> u32 good_compress; /* % of pages with compression ratio<=50% */
> diff --git a/drivers/staging/ramzswap/ramzswap_ioctl.h b/drivers/staging/ramzswap/ramzswap_ioctl.h
> index c713a09..ec50416 100644
> --- a/drivers/staging/ramzswap/ramzswap_ioctl.h
> +++ b/drivers/staging/ramzswap/ramzswap_ioctl.h
> @@ -27,6 +27,7 @@ struct ramzswap_ioctl_stats {
> u64 failed_reads; /* can happen when memory is too low */
> u64 failed_writes; /* should NEVER! happen */
> u64 invalid_io; /* non-swap I/O requests */
> + u64 notify_free; /* no. of swap slot free notifications */
> u32 pages_zero; /* no. of zero filled pages */
> u32 good_compress_pct; /* no. of pages with compression ratio<=50% */
> u32 pages_expand_pct; /* no. of incompressible pages */
> --
> 1.6.2.5

2010-01-06 13:42:13

by Nitin Gupta

[permalink] [raw]
Subject: Re: [PATCH 1/2] [mmotm] Add swap slot free callback to block_device_operations


On 01/05/2010 04:28 PM, Hugh Dickins wrote:
> On Tue, 5 Jan 2010, Nitin Gupta wrote:
>
>> This callback is required when RAM based devices are
>> used as swap disks. One such device is ramzswap[1] which
>> is used as compressed in-memory swap disk. For such
>> devices, we need a callback as soon as a swap slot is no
>> longer used to allow freeing memory allocated for this
>> slot. Without this callback, stale data can quickly
>> accumulate in memory defeating the whole purpose of
>> such devices.
>>
>> Another user of this callback will be "preswap" as
>> introduced by "Transcendent Memory" patches:
>> http://lwn.net/Articles/367286/
>> (I intend to integrade preswap with ramzswap).
>>
>> [1] ramzswap: http://code.google.com/p/compcache/
>>
>> Signed-off-by: Nitin Gupta <[email protected]>
>
> Yes, thanks for your patience in putting this together:
> I much prefer this version to all the other ways considered.
>

> (Though you should later add a comment, pointing out that this
> callback is with swap_lock and often page table lock also held.)
>

I will send another patch for this.

> We know that's unsatisfactory, but it is the nature of swap:
> cutting some corners to get the job done without allocating
> more memory. Having tried myself to do proper discard at
> swap_free time, I know that is hard: and I believe you've
> made the right decision to do it in this way for ramzswap.
>
> Acked-by: Hugh Dickins <[email protected]>
>

Thanks for the review.
So, is this now going in mmotm tree?

Nitin