2014-02-24 05:51:05

by Joonsoo Kim

Subject: [PATCH] zram: support REQ_DISCARD

zram is a RAM-based block device and can be used as the backend of a
filesystem. When a filesystem deletes a file, it normally does nothing to
the file's data blocks; it only updates the file's metadata. This behavior
is harmless on a disk-based block device, but it is a problem on a
RAM-based block device, because the memory used by those data blocks is
never freed. REQ_DISCARD exists to address this. If the block device
supports REQ_DISCARD and the filesystem is mounted with the discard option,
the filesystem sends REQ_DISCARD to the block device whenever data blocks
are discarded. All we have to do is handle this request.

This patch sets QUEUE_FLAG_DISCARD and handles REQ_DISCARD requests. With
it, zram can free memory that is no longer in use.

Signed-off-by: Joonsoo Kim <[email protected]>
---
This patch is based on master branch of linux-next tree.

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 5ec61be..cff2c0e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
 	return ret;
 }

+static void zram_bio_discard(struct zram *zram, struct bio *bio)
+{
+	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
+	size_t n = bio->bi_iter.bi_size;
+
+	while (n >= PAGE_SIZE) {
+		write_lock(&zram->meta->tb_lock);
+		zram_free_page(zram, index);
+		write_unlock(&zram->meta->tb_lock);
+		index++;
+		n -= PAGE_SIZE;
+	}
+}
+
 static void zram_reset_device(struct zram *zram, bool reset_capacity)
 {
 	size_t index;
@@ -618,6 +632,12 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
 	struct bio_vec bvec;
 	struct bvec_iter iter;

+	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
+		zram_bio_discard(zram, bio);
+		bio_endio(bio, 0);
+		return;
+	}
+
 	index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
 	offset = (bio->bi_iter.bi_sector &
 		  (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
@@ -784,6 +804,10 @@ static int create_device(struct zram *zram, int device_id)
 					ZRAM_LOGICAL_BLOCK_SIZE);
 	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
 	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
+	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
+	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
+	zram->disk->queue->limits.discard_zeroes_data = 1;
+	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->disk->queue);

 	add_disk(zram->disk);

--
1.7.9.5


2014-02-24 13:38:15

by Jerome Marchand

Subject: Re: [PATCH] zram: support REQ_DISCARD

On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
> zram is ram based block device and can be used by backend of filesystem.
> When filesystem deletes a file, it normally doesn't do anything on data
> block of that file. It just marks on metadata of that file. This behavior
> has no problem on disk based block device, but has problems on ram based
> block device, since we can't free memory used for data block. To overcome
> this disadvantage, there is REQ_DISCARD functionality. If block device
> support REQ_DISCARD and filesystem is mounted with discard option,
> filesystem sends REQ_DISCARD to block device whenever some data blocks are
> discarded. All we have to do is to handle this request.
>
> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
> used.
>
> Signed-off-by: Joonsoo Kim <[email protected]>
> ---
> This patch is based on master branch of linux-next tree.
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 5ec61be..cff2c0e 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
> return ret;
> }
>
> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
> +{
> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;

Hi Joonsoo,

If bi_sector is not aligned to the page size, we might end up discarding
a page that still contains valid data.
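
To make the concern concrete, here is a minimal user-space sketch (not
zram code) of the index computation. It assumes 512-byte sectors and takes
the page shift as a parameter so that a 64k-page system can be modeled; the
helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Model of `bi_sector >> SECTORS_PER_PAGE_SHIFT`: the shift rounds down. */
static uint32_t sector_to_page_index(uint64_t sector, unsigned int page_shift)
{
	assert(page_shift >= SECTOR_SHIFT);
	return (uint32_t)(sector >> (page_shift - SECTOR_SHIFT));
}

/* True when the sector is the first sector of a page. */
static bool sector_is_page_aligned(uint64_t sector, unsigned int page_shift)
{
	uint64_t sectors_per_page;

	assert(page_shift >= SECTOR_SHIFT);
	sectors_per_page = 1ULL << (page_shift - SECTOR_SHIFT);
	return (sector & (sectors_per_page - 1)) == 0;
}
```

With 64k pages (page shift 16), sector 8 lies 4k into page 0: the index
comes out as 0 and the sector is not page aligned, so a discard starting
there would free page 0 even though its first 4k was not discarded.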


> + size_t n = bio->bi_iter.bi_size;
> +
> + while (n >= PAGE_SIZE) {
> + write_lock(&zram->meta->tb_lock);
> + zram_free_page(zram, index);
> + write_unlock(&zram->meta->tb_lock);
> + index++;
> + n -= PAGE_SIZE;
> + }
> +}
> +
> static void zram_reset_device(struct zram *zram, bool reset_capacity)
> {
> size_t index;
> @@ -618,6 +632,12 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
> struct bio_vec bvec;
> struct bvec_iter iter;
>
> + if (unlikely(bio->bi_rw & REQ_DISCARD)) {
> + zram_bio_discard(zram, bio);
> + bio_endio(bio, 0);
> + return;
> + }
> +
> index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
> offset = (bio->bi_iter.bi_sector &
> (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
> @@ -784,6 +804,10 @@ static int create_device(struct zram *zram, int device_id)
> ZRAM_LOGICAL_BLOCK_SIZE);
> blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
> blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
> + zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
> + zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
> + zram->disk->queue->limits.discard_zeroes_data = 1;
> + queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->disk->queue);
>
> add_disk(zram->disk);
>
>

2014-02-24 15:02:20

by Joonsoo Kim

Subject: Re: [PATCH] zram: support REQ_DISCARD

2014-02-24 22:36 GMT+09:00 Jerome Marchand <[email protected]>:
> On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
>> zram is ram based block device and can be used by backend of filesystem.
>> When filesystem deletes a file, it normally doesn't do anything on data
>> block of that file. It just marks on metadata of that file. This behavior
>> has no problem on disk based block device, but has problems on ram based
>> block device, since we can't free memory used for data block. To overcome
>> this disadvantage, there is REQ_DISCARD functionality. If block device
>> support REQ_DISCARD and filesystem is mounted with discard option,
>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>> discarded. All we have to do is to handle this request.
>>
>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>> used.
>>
>> Signed-off-by: Joonsoo Kim <[email protected]>
>> ---
>> This patch is based on master branch of linux-next tree.
>>
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index 5ec61be..cff2c0e 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>> return ret;
>> }
>>
>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>> +{
>> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>
> Hi Joonsoo,
>
> If bi_sector is not aligned on a page size, we might end up discarding
> a page that still contain valid data.
>
>

Hello, Jerome.

Is it possible for a request to be misaligned with respect to the page
size if the logical/physical block size is PAGE_SIZE? When I tested it,
I didn't see any invalid I/O. Any misaligned request should be filtered
out by valid_io_request(). :)

Thanks.

2014-02-24 15:16:27

by Jerome Marchand

Subject: Re: [PATCH] zram: support REQ_DISCARD

On 02/24/2014 04:02 PM, Joonsoo Kim wrote:
> 2014-02-24 22:36 GMT+09:00 Jerome Marchand <[email protected]>:
>> On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
>>> zram is ram based block device and can be used by backend of filesystem.
>>> When filesystem deletes a file, it normally doesn't do anything on data
>>> block of that file. It just marks on metadata of that file. This behavior
>>> has no problem on disk based block device, but has problems on ram based
>>> block device, since we can't free memory used for data block. To overcome
>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>> discarded. All we have to do is to handle this request.
>>>
>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>> used.
>>>
>>> Signed-off-by: Joonsoo Kim <[email protected]>
>>> ---
>>> This patch is based on master branch of linux-next tree.
>>>
>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>> index 5ec61be..cff2c0e 100644
>>> --- a/drivers/block/zram/zram_drv.c
>>> +++ b/drivers/block/zram/zram_drv.c
>>> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>> return ret;
>>> }
>>>
>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>> +{
>>> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>
>> Hi Joonsoo,
>>
>> If bi_sector is not aligned on a page size, we might end up discarding
>> a page that still contain valid data.
>>
>>
>
> Hello, Jerome.
>
> Is it possible that request isn't aligned on a page size if
> logical/physical block size
> is PAGE_SIZE?

Yes, zram has a logical block size of 4k (ZRAM_LOGICAL_BLOCK_SIZE),
while its physical block size, which is the page size, can be larger.
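
The relationship between the two sizes can be sketched in user space as
follows. The 4k logical block size is the one stated above; the function
name and the page-shift parameter are illustrative assumptions, not zram
identifiers.

```c
#include <assert.h>

#define LOGICAL_BLOCK_SHIFT 12 /* 4k logical block, per ZRAM_LOGICAL_BLOCK_SIZE */

/* Number of logical blocks that share one page of the given size. */
static unsigned int logical_blocks_per_page(unsigned int page_shift)
{
	assert(page_shift >= LOGICAL_BLOCK_SHIFT);
	return 1U << (page_shift - LOGICAL_BLOCK_SHIFT);
}
```

With 4k pages every logical block is a whole page. With 64k pages, sixteen
logical blocks share a page, so a request can start or end partway through
a page.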

> When I tested it, I didn't find any invalid io.
> If we meet any misaligned request, it would be filtered by
> valid_io_request(). :)

zram accepts requests aligned to logical blocks. So valid_io_request()
wouldn't filter out requests that are misaligned with respect to pages,
as long as they are aligned to logical blocks.
If your system uses 4k pages, your tests would never trigger the issue,
but on a system that uses 64k pages, they could.
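
One possible way to avoid the problem, sketched here in user space rather
than as a tested kernel change, is to round a discard inward so that only
pages fully covered by the request are freed. The struct and function
names are hypothetical, and the page size is a parameter so that the 64k
case can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Hypothetical result type: the run of pages a discard fully covers. */
struct page_run {
	uint32_t first_index; /* index of the first fully covered page */
	size_t count;         /* number of fully covered pages */
};

/*
 * Round a discard of `bytes` bytes starting at `sector` inward to whole
 * pages. A partial page at either end is left untouched.
 */
static struct page_run whole_pages_in_discard(uint64_t sector, size_t bytes,
					      unsigned int page_shift)
{
	struct page_run run = { 0, 0 };
	size_t page_size, offset;

	assert(page_shift >= SECTOR_SHIFT);
	page_size = (size_t)1 << page_shift;
	/* Byte offset of the start sector within its page. */
	offset = (size_t)((sector << SECTOR_SHIFT) & (page_size - 1));

	run.first_index = (uint32_t)(sector >> (page_shift - SECTOR_SHIFT));
	if (offset) {
		/* The request ends inside its first, partial page. */
		if (bytes <= page_size - offset)
			return run;
		/* Skip the partial leading page. */
		bytes -= page_size - offset;
		run.first_index++;
	}
	/* The shift rounds down, dropping a partial trailing page. */
	run.count = bytes >> page_shift;
	return run;
}
```

On a 64k-page system, a 64k discard starting at sector 8 covers no whole
page, so this sketch frees nothing, whereas the loop in the patch would
free page 0.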

Jerome

>
> Thanks.
>

2014-02-24 15:56:22

by Joonsoo Kim

Subject: Re: [PATCH] zram: support REQ_DISCARD

2014-02-25 0:15 GMT+09:00 Jerome Marchand <[email protected]>:
> On 02/24/2014 04:02 PM, Joonsoo Kim wrote:
>> 2014-02-24 22:36 GMT+09:00 Jerome Marchand <[email protected]>:
>>> On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
>>>> zram is ram based block device and can be used by backend of filesystem.
>>>> When filesystem deletes a file, it normally doesn't do anything on data
>>>> block of that file. It just marks on metadata of that file. This behavior
>>>> has no problem on disk based block device, but has problems on ram based
>>>> block device, since we can't free memory used for data block. To overcome
>>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>>> discarded. All we have to do is to handle this request.
>>>>
>>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>>> used.
>>>>
>>>> Signed-off-by: Joonsoo Kim <[email protected]>
>>>> ---
>>>> This patch is based on master branch of linux-next tree.
>>>>
>>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>>> index 5ec61be..cff2c0e 100644
>>>> --- a/drivers/block/zram/zram_drv.c
>>>> +++ b/drivers/block/zram/zram_drv.c
>>>> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>> return ret;
>>>> }
>>>>
>>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>>> +{
>>>> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>
>>> Hi Joonsoo,
>>>
>>> If bi_sector is not aligned on a page size, we might end up discarding
>>> a page that still contain valid data.
>>>
>>>
>>
>> Hello, Jerome.
>>
>> Is it possible that request isn't aligned on a page size if
>> logical/physical block size
>> is PAGE_SIZE?
>
> Yes, zram has an logical block size of 4k (ZRAM_LOGICAL_BLOCK_SIZE),
> while its physical block size, which is a page size, can be bigger.
>
>> When I tested it, I didn't find any invalid io.
>> If we meet any misaligned request, it would be filtered by
>> valid_io_request(). :)
>
> zram accepts request aligned on logical blocks. So valid_io_request()
> wouldn't filter misaligned requests out as long as they are aligned
> on logical blocks.
> If your system use 4k pages, your tests would never trigger the issue,
> but on a system which uses 64k pages, it could.

Okay, I see.
So how about using PAGE_SIZE as ZRAM_LOGICAL_BLOCK_SIZE?
Is there any reason to set ZRAM_LOGICAL_BLOCK_SIZE to 4096
instead of PAGE_SIZE?

Thanks.

2014-02-24 16:07:24

by Jerome Marchand

Subject: Re: [PATCH] zram: support REQ_DISCARD

On 02/24/2014 04:56 PM, Joonsoo Kim wrote:
> 2014-02-25 0:15 GMT+09:00 Jerome Marchand <[email protected]>:
>> On 02/24/2014 04:02 PM, Joonsoo Kim wrote:
>>> 2014-02-24 22:36 GMT+09:00 Jerome Marchand <[email protected]>:
>>>> On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
>>>>> zram is ram based block device and can be used by backend of filesystem.
>>>>> When filesystem deletes a file, it normally doesn't do anything on data
>>>>> block of that file. It just marks on metadata of that file. This behavior
>>>>> has no problem on disk based block device, but has problems on ram based
>>>>> block device, since we can't free memory used for data block. To overcome
>>>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>>>> discarded. All we have to do is to handle this request.
>>>>>
>>>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>>>> used.
>>>>>
>>>>> Signed-off-by: Joonsoo Kim <[email protected]>
>>>>> ---
>>>>> This patch is based on master branch of linux-next tree.
>>>>>
>>>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>>>> index 5ec61be..cff2c0e 100644
>>>>> --- a/drivers/block/zram/zram_drv.c
>>>>> +++ b/drivers/block/zram/zram_drv.c
>>>>> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>>> return ret;
>>>>> }
>>>>>
>>>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>>>> +{
>>>>> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>>
>>>> Hi Joonsoo,
>>>>
>>>> If bi_sector is not aligned on a page size, we might end up discarding
>>>> a page that still contain valid data.
>>>>
>>>>
>>>
>>> Hello, Jerome.
>>>
>>> Is it possible that request isn't aligned on a page size if
>>> logical/physical block size
>>> is PAGE_SIZE?
>>
>> Yes, zram has an logical block size of 4k (ZRAM_LOGICAL_BLOCK_SIZE),
>> while its physical block size, which is a page size, can be bigger.
>>
>>> When I tested it, I didn't find any invalid io.
>>> If we meet any misaligned request, it would be filtered by
>>> valid_io_request(). :)
>>
>> zram accepts request aligned on logical blocks. So valid_io_request()
>> wouldn't filter misaligned requests out as long as they are aligned
>> on logical blocks.
>> If your system use 4k pages, your tests would never trigger the issue,
>> but on a system which uses 64k pages, it could.
>
> Okay. I got it.
> So, how about using PAGE_SIZE as ZRAM_LOGICAL_BLOCK_SIZE?
> Is there any reason to set 4096 to ZRAM_LOGICAL_BLOCK_SIZE,
> instead of setting PAGE_SIZE to ZRAM_LOGICAL_BLOCK_SIZE?
>

ZRAM_LOGICAL_BLOCK_SIZE was introduced in commit 7b19b8d because the
block layer couldn't handle 64k logical blocks. Also, some filesystems
(including FAT, IIRC) can't cope with 64k blocks either.


> Thanks.
>

2014-02-24 16:11:33

by Joonsoo Kim

Subject: Re: [PATCH] zram: support REQ_DISCARD

2014-02-25 1:06 GMT+09:00 Jerome Marchand <[email protected]>:
> On 02/24/2014 04:56 PM, Joonsoo Kim wrote:
>> 2014-02-25 0:15 GMT+09:00 Jerome Marchand <[email protected]>:
>>> On 02/24/2014 04:02 PM, Joonsoo Kim wrote:
>>>> 2014-02-24 22:36 GMT+09:00 Jerome Marchand <[email protected]>:
>>>>> On 02/24/2014 06:51 AM, Joonsoo Kim wrote:
>>>>>> zram is ram based block device and can be used by backend of filesystem.
>>>>>> When filesystem deletes a file, it normally doesn't do anything on data
>>>>>> block of that file. It just marks on metadata of that file. This behavior
>>>>>> has no problem on disk based block device, but has problems on ram based
>>>>>> block device, since we can't free memory used for data block. To overcome
>>>>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>>>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>>>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>>>>> discarded. All we have to do is to handle this request.
>>>>>>
>>>>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>>>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>>>>> used.
>>>>>>
>>>>>> Signed-off-by: Joonsoo Kim <[email protected]>
>>>>>> ---
>>>>>> This patch is based on master branch of linux-next tree.
>>>>>>
>>>>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>>>>> index 5ec61be..cff2c0e 100644
>>>>>> --- a/drivers/block/zram/zram_drv.c
>>>>>> +++ b/drivers/block/zram/zram_drv.c
>>>>>> @@ -501,6 +501,20 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>>>> return ret;
>>>>>> }
>>>>>>
>>>>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>>>>> +{
>>>>>> + u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>>>
>>>>> Hi Joonsoo,
>>>>>
>>>>> If bi_sector is not aligned on a page size, we might end up discarding
>>>>> a page that still contain valid data.
>>>>>
>>>>>
>>>>
>>>> Hello, Jerome.
>>>>
>>>> Is it possible that request isn't aligned on a page size if
>>>> logical/physical block size
>>>> is PAGE_SIZE?
>>>
>>> Yes, zram has an logical block size of 4k (ZRAM_LOGICAL_BLOCK_SIZE),
>>> while its physical block size, which is a page size, can be bigger.
>>>
>>>> When I tested it, I didn't find any invalid io.
>>>> If we meet any misaligned request, it would be filtered by
>>>> valid_io_request(). :)
>>>
>>> zram accepts request aligned on logical blocks. So valid_io_request()
>>> wouldn't filter misaligned requests out as long as they are aligned
>>> on logical blocks.
>>> If your system use 4k pages, your tests would never trigger the issue,
>>> but on a system which uses 64k pages, it could.
>>
>> Okay. I got it.
>> So, how about using PAGE_SIZE as ZRAM_LOGICAL_BLOCK_SIZE?
>> Is there any reason to set 4096 to ZRAM_LOGICAL_BLOCK_SIZE,
>> instead of setting PAGE_SIZE to ZRAM_LOGICAL_BLOCK_SIZE?
>>
>
> ZRAM_LOGICAL_BLOCK_SIZE was introduced in commit 7b19b8d because the
> block layer couldn't handle 64k logical block. Also, some filesytems
> (including FAT IRC), can't cope with 64k block either.
>

Okay, I will look into it further.

Thanks for the helpful comments!