2014-01-11 07:44:32

by Cai Liu

Subject: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

zswap can support multiple swapfiles. So we need to check
all zbud pool pages in zswap.

Signed-off-by: Cai Liu <[email protected]>
---
mm/zswap.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index d93afa6..2438344 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -291,7 +291,6 @@ static void zswap_free_entry(struct zswap_tree *tree,
 	zbud_free(tree->pool, entry->handle);
 	zswap_entry_cache_free(entry);
 	atomic_dec(&zswap_stored_pages);
-	zswap_pool_pages = zbud_get_pool_size(tree->pool);
 }
 
 /* caller must hold the tree lock */
@@ -405,10 +404,24 @@ cleanup:
 /*********************************
 * helpers
 **********************************/
+static u64 get_zswap_pool_pages(void)
+{
+	int i;
+	u64 pool_pages = 0;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		if (zswap_trees[i])
+			pool_pages += zbud_get_pool_size(zswap_trees[i]->pool);
+	}
+	zswap_pool_pages = pool_pages;
+
+	return pool_pages;
+}
+
 static bool zswap_is_full(void)
 {
 	return (totalram_pages * zswap_max_pool_percent / 100 <
-		zswap_pool_pages);
+		get_zswap_pool_pages());
 }
 
 /*********************************
@@ -716,7 +729,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 
 	/* update stats */
 	atomic_inc(&zswap_stored_pages);
-	zswap_pool_pages = zbud_get_pool_size(tree->pool);
 
 	return 0;

--
1.7.10.4


2014-01-13 23:34:27

by Minchan Kim

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

Hello,

On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
> zswap can support multiple swapfiles. So we need to check
> all zbud pool pages in zswap.

True, but this patch is rather costly in that we have to iterate over
zswap_trees[MAX_SWAPFILES] to check it. SIGH.

How about defining zswap_trees as a linked list instead of a static
array? Then we could avoid most of the unnecessary iteration.

Other question:
Why do we need to update zswap_pool_pages so frequently?
As I read the code, I think it's okay to update it only when the user
wants to see it via debugfs or when zswap_is_full() is called.
So could we optimize it out?
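
A rough sketch of that idea (names are illustrative only, not the
current zswap code) would be to recompute the value in the read path
instead of on every store/free:

static int zswap_pool_pages_get(void *data, u64 *val)
{
        /* walk the pools only when someone actually asks */
        *val = get_zswap_pool_pages();
        return 0;
}
DEFINE_SIMPLE_ATTRIBUTE(zswap_pool_pages_fops, zswap_pool_pages_get,
                        NULL, "%llu\n");

/* in zswap_debugfs_init(), instead of exporting the raw variable
 * with debugfs_create_u64(): */
debugfs_create_file("pool_pages", S_IRUGO, zswap_debugfs_root,
                    NULL, &zswap_pool_pages_fops);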

>
> Signed-off-by: Cai Liu <[email protected]>
> ---
> mm/zswap.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index d93afa6..2438344 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -291,7 +291,6 @@ static void zswap_free_entry(struct zswap_tree *tree,
> zbud_free(tree->pool, entry->handle);
> zswap_entry_cache_free(entry);
> atomic_dec(&zswap_stored_pages);
> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
> }
>
> /* caller must hold the tree lock */
> @@ -405,10 +404,24 @@ cleanup:
> /*********************************
> * helpers
> **********************************/
> +static u64 get_zswap_pool_pages(void)
> +{
> + int i;
> + u64 pool_pages = 0;
> +
> + for (i = 0; i < MAX_SWAPFILES; i++) {
> + if (zswap_trees[i])
> + pool_pages += zbud_get_pool_size(zswap_trees[i]->pool);
> + }
> + zswap_pool_pages = pool_pages;
> +
> + return pool_pages;
> +}
> +
> static bool zswap_is_full(void)
> {
> return (totalram_pages * zswap_max_pool_percent / 100 <
> - zswap_pool_pages);
> + get_zswap_pool_pages());
> }
>
> /*********************************
> @@ -716,7 +729,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>
> /* update stats */
> atomic_inc(&zswap_stored_pages);
> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
>
> return 0;
>
> --
> 1.7.10.4
>

--
Kind regards,
Minchan Kim

2014-01-14 01:19:43

by Bob Liu

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages


On 01/14/2014 07:35 AM, Minchan Kim wrote:
> Hello,
>
> On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
>> zswap can support multiple swapfiles. So we need to check
>> all zbud pool pages in zswap.
>
> True but this patch is rather costly that we should iterate
> zswap_tree[MAX_SWAPFILES] to check it. SIGH.
>
> How about defining zswap_tress as linked list instead of static
> array? Then, we could reduce unnecessary iteration too much.
>

But if we use a linked list, it might not be easy to access the tree like this:
struct zswap_tree *tree = zswap_trees[type];

BTW: I still prefer to use a dynamic pool size instead of
zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with a
radix tree, which will be more flexible for supporting this feature and
page migration as well.

> Other question:
> Why do we need to update zswap_pool_pages too frequently?
> As I read the code, I think it's okay to update it only when user
> want to see it by debugfs and zswap_is_full is called.
> So could we optimize it out?
>
>>
>> Signed-off-by: Cai Liu <[email protected]>

Reviewed-by: Bob Liu <[email protected]>

>> ---
>> mm/zswap.c | 18 +++++++++++++++---
>> 1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index d93afa6..2438344 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -291,7 +291,6 @@ static void zswap_free_entry(struct zswap_tree *tree,
>> zbud_free(tree->pool, entry->handle);
>> zswap_entry_cache_free(entry);
>> atomic_dec(&zswap_stored_pages);
>> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
>> }
>>
>> /* caller must hold the tree lock */
>> @@ -405,10 +404,24 @@ cleanup:
>> /*********************************
>> * helpers
>> **********************************/
>> +static u64 get_zswap_pool_pages(void)
>> +{
>> + int i;
>> + u64 pool_pages = 0;
>> +
>> + for (i = 0; i < MAX_SWAPFILES; i++) {
>> + if (zswap_trees[i])
>> + pool_pages += zbud_get_pool_size(zswap_trees[i]->pool);
>> + }
>> + zswap_pool_pages = pool_pages;
>> +
>> + return pool_pages;
>> +}
>> +
>> static bool zswap_is_full(void)
>> {
>> return (totalram_pages * zswap_max_pool_percent / 100 <
>> - zswap_pool_pages);
>> + get_zswap_pool_pages());
>> }
>>
>> /*********************************
>> @@ -716,7 +729,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>>
>> /* update stats */
>> atomic_inc(&zswap_stored_pages);
>> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
>>
>> return 0;
>>
>> --
>> 1.7.10.4
--
Regards,
-Bob

2014-01-14 04:49:42

by Minchan Kim

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

Hello Bob,

On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
>
> On 01/14/2014 07:35 AM, Minchan Kim wrote:
> > Hello,
> >
> > On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
> >> zswap can support multiple swapfiles. So we need to check
> >> all zbud pool pages in zswap.
> >
> > True but this patch is rather costly that we should iterate
> > zswap_tree[MAX_SWAPFILES] to check it. SIGH.
> >
> > How about defining zswap_tress as linked list instead of static
> > array? Then, we could reduce unnecessary iteration too much.
> >
>
> But if use linked list, it might not easy to access the tree like this:
> struct zswap_tree *tree = zswap_trees[type];

struct zswap_tree {
        ..
        ..
        struct list_head list;
};

zswap_frontswap_init()
{
        ..
        ..
        zswap_trees[type] = tree;
        list_add(&tree->list, &zswap_list);
}

get_zswap_pool_pages(void)
{
        struct zswap_tree *cur;

        list_for_each_entry(cur, &zswap_list, list) {
                pool_pages += zbud_get_pool_size(cur->pool);
        }
        return pool_pages;
}
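
Filled out a little (still only a sketch; zswap_list and
zswap_list_lock are hypothetical names, and swapoff would need to take
the same lock when it removes a tree):

static LIST_HEAD(zswap_list);
static DEFINE_SPINLOCK(zswap_list_lock);

/* called from zswap_frontswap_init() once the tree is set up */
static void zswap_register_tree(struct zswap_tree *tree)
{
        spin_lock(&zswap_list_lock);
        list_add(&tree->list, &zswap_list);
        spin_unlock(&zswap_list_lock);
}

static u64 get_zswap_pool_pages(void)
{
        struct zswap_tree *cur;
        u64 pool_pages = 0;

        /* only initialised trees are on the list, so the common
         * single-swap case touches exactly one pool */
        spin_lock(&zswap_list_lock);
        list_for_each_entry(cur, &zswap_list, list)
                pool_pages += zbud_get_pool_size(cur->pool);
        spin_unlock(&zswap_list_lock);

        zswap_pool_pages = pool_pages;
        return pool_pages;
}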


>
> BTW: I'm still prefer to use dynamic pool size, instead of use
> zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
> which will be more flexible to support this feature and page migration
> as well.
>
> > Other question:
> > Why do we need to update zswap_pool_pages too frequently?
> > As I read the code, I think it's okay to update it only when user
> > want to see it by debugfs and zswap_is_full is called.
> > So could we optimize it out?
> >
> >>
> >> Signed-off-by: Cai Liu <[email protected]>
>
> Reviewed-by: Bob Liu <[email protected]>

Hmm, I'm really surprised you are okay with this piece of code, where
we pay an unnecessary cost in most cases (i.e., most systems have a
single swap device) in the *mm* part.

Anyway, I don't want this patchset merged as-is.
If Andrew merges it and nobody does the follow-up work, I will send a patch.
Cai, could you redo the patch?
I don't want to take away your credit.

We could even optimize it further to reduce the number of calls, as I
said in my previous reply.

Thanks.

>
> >> ---
> >> mm/zswap.c | 18 +++++++++++++++---
> >> 1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/zswap.c b/mm/zswap.c
> >> index d93afa6..2438344 100644
> >> --- a/mm/zswap.c
> >> +++ b/mm/zswap.c
> >> @@ -291,7 +291,6 @@ static void zswap_free_entry(struct zswap_tree *tree,
> >> zbud_free(tree->pool, entry->handle);
> >> zswap_entry_cache_free(entry);
> >> atomic_dec(&zswap_stored_pages);
> >> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
> >> }
> >>
> >> /* caller must hold the tree lock */
> >> @@ -405,10 +404,24 @@ cleanup:
> >> /*********************************
> >> * helpers
> >> **********************************/
> >> +static u64 get_zswap_pool_pages(void)
> >> +{
> >> + int i;
> >> + u64 pool_pages = 0;
> >> +
> >> + for (i = 0; i < MAX_SWAPFILES; i++) {
> >> + if (zswap_trees[i])
> >> + pool_pages += zbud_get_pool_size(zswap_trees[i]->pool);
> >> + }
> >> + zswap_pool_pages = pool_pages;
> >> +
> >> + return pool_pages;
> >> +}
> >> +
> >> static bool zswap_is_full(void)
> >> {
> >> return (totalram_pages * zswap_max_pool_percent / 100 <
> >> - zswap_pool_pages);
> >> + get_zswap_pool_pages());
> >> }
> >>
> >> /*********************************
> >> @@ -716,7 +729,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
> >>
> >> /* update stats */
> >> atomic_inc(&zswap_stored_pages);
> >> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
> >>
> >> return 0;
> >>
> >> --
> >> 1.7.10.4
> --
> Regards,
> -Bob
>

--
Kind regards,
Minchan Kim

2014-01-14 05:04:55

by Minchan Kim

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote:
> Hello Bob,
>
> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
> >
> > On 01/14/2014 07:35 AM, Minchan Kim wrote:
> > > Hello,
> > >
> > > On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
> > >> zswap can support multiple swapfiles. So we need to check
> > >> all zbud pool pages in zswap.
> > >
> > > True but this patch is rather costly that we should iterate
> > > zswap_tree[MAX_SWAPFILES] to check it. SIGH.
> > >
> > > How about defining zswap_tress as linked list instead of static
> > > array? Then, we could reduce unnecessary iteration too much.
> > >
> >
> > But if use linked list, it might not easy to access the tree like this:
> > struct zswap_tree *tree = zswap_trees[type];
>
> struct zswap_tree {
> ..
> ..
> struct list_head list;
> }
>
> zswap_frontswap_init()
> {
> ..
> ..
> zswap_trees[type] = tree;
> list_add(&tree->list, &zswap_list);
> }
>
> get_zswap_pool_pages(void)
> {
> struct zswap_tree *cur;
> list_for_each_entry(cur, &zswap_list, list) {
> pool_pages += zbud_get_pool_size(cur->pool);
> }
> return pool_pages;
> }
>
>
> >
> > BTW: I'm still prefer to use dynamic pool size, instead of use
> > zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
> > which will be more flexible to support this feature and page migration
> > as well.
> >
> > > Other question:
> > > Why do we need to update zswap_pool_pages too frequently?
> > > As I read the code, I think it's okay to update it only when user
> > > want to see it by debugfs and zswap_is_full is called.
> > > So could we optimize it out?
> > >
> > >>
> > >> Signed-off-by: Cai Liu <[email protected]>
> >
> > Reviewed-by: Bob Liu <[email protected]>
>
> Hmm, I really suprised you are okay in this code piece where we have
> unnecessary cost most of case(ie, most system has a swap device) in
> *mm* part.
>
> Anyway, I don't want to merge this patchset.
> If Andrew merge it and anybody doesn't do right work, I will send a patch.
> Cai, Could you redo a patch?
> I don't want to intercept your credit.
>
> Even, we could optimize to reduce the the number of call as I said in
> previous reply.

You did it already. Please spell that out in the patch description.

>
> Thanks.
--
Kind regards,
Minchan Kim

2014-01-14 05:43:00

by Bob Liu

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages


On 01/14/2014 01:05 PM, Minchan Kim wrote:
> On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote:
>> Hello Bob,
>>
>> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
>>>
>>> On 01/14/2014 07:35 AM, Minchan Kim wrote:
>>>> Hello,
>>>>
>>>> On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
>>>>> zswap can support multiple swapfiles. So we need to check
>>>>> all zbud pool pages in zswap.
>>>>
>>>> True but this patch is rather costly that we should iterate
>>>> zswap_tree[MAX_SWAPFILES] to check it. SIGH.
>>>>
>>>> How about defining zswap_tress as linked list instead of static
>>>> array? Then, we could reduce unnecessary iteration too much.
>>>>
>>>
>>> But if use linked list, it might not easy to access the tree like this:
>>> struct zswap_tree *tree = zswap_trees[type];
>>
>> struct zswap_tree {
>> ..
>> ..
>> struct list_head list;
>> }
>>
>> zswap_frontswap_init()
>> {
>> ..
>> ..
>> zswap_trees[type] = tree;
>> list_add(&tree->list, &zswap_list);
>> }
>>
>> get_zswap_pool_pages(void)
>> {
>> struct zswap_tree *cur;
>> list_for_each_entry(cur, &zswap_list, list) {
>> pool_pages += zbud_get_pool_size(cur->pool);
>> }
>> return pool_pages;
>> }

Okay, I see your point. Yes, it's much better.
Cai, please make a new patch.

Thanks,
-Bob

>>
>>
>>>
>>> BTW: I'm still prefer to use dynamic pool size, instead of use
>>> zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
>>> which will be more flexible to support this feature and page migration
>>> as well.
>>>
>>>> Other question:
>>>> Why do we need to update zswap_pool_pages too frequently?
>>>> As I read the code, I think it's okay to update it only when user
>>>> want to see it by debugfs and zswap_is_full is called.
>>>> So could we optimize it out?
>>>>
>>>>>
>>>>> Signed-off-by: Cai Liu <[email protected]>
>>>
>>> Reviewed-by: Bob Liu <[email protected]>
>>
>> Hmm, I really suprised you are okay in this code piece where we have
>> unnecessary cost most of case(ie, most system has a swap device) in
>> *mm* part.
>>
>> Anyway, I don't want to merge this patchset.
>> If Andrew merge it and anybody doesn't do right work, I will send a patch.
>> Cai, Could you redo a patch?
>> I don't want to intercept your credit.
>>
>> Even, we could optimize to reduce the the number of call as I said in
>> previous reply.
>
> You did it already. Please write it out in description.
>

2014-01-14 06:15:47

by Weijie Yang

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

On Tue, Jan 14, 2014 at 1:42 PM, Bob Liu <[email protected]> wrote:
>
> On 01/14/2014 01:05 PM, Minchan Kim wrote:
>> On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote:
>>> Hello Bob,
>>>
>>> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
>>>>
>>>> On 01/14/2014 07:35 AM, Minchan Kim wrote:
>>>>> Hello,
>>>>>
>>>>> On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
>>>>>> zswap can support multiple swapfiles. So we need to check
>>>>>> all zbud pool pages in zswap.
>>>>>
>>>>> True but this patch is rather costly that we should iterate
>>>>> zswap_tree[MAX_SWAPFILES] to check it. SIGH.
>>>>>
>>>>> How about defining zswap_tress as linked list instead of static
>>>>> array? Then, we could reduce unnecessary iteration too much.
>>>>>
>>>>
>>>> But if use linked list, it might not easy to access the tree like this:
>>>> struct zswap_tree *tree = zswap_trees[type];
>>>
>>> struct zswap_tree {
>>> ..
>>> ..
>>> struct list_head list;
>>> }
>>>
>>> zswap_frontswap_init()
>>> {
>>> ..
>>> ..
>>> zswap_trees[type] = tree;
>>> list_add(&tree->list, &zswap_list);
>>> }
>>>
>>> get_zswap_pool_pages(void)
>>> {
>>> struct zswap_tree *cur;
>>> list_for_each_entry(cur, &zswap_list, list) {
>>> pool_pages += zbud_get_pool_size(cur->pool);
>>> }
>>> return pool_pages;
>>> }
>
> Okay, I see your point. Yes, it's much better.
> Cai, Please make an new patch.

This improved approach would avoid a lot of the unnecessary iteration.

But I still have a question: why do we need so many zbud pools?
How about using only one global zbud pool for all zswap_trees?
I have not tested it, but I think it could improve the store density.
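
A rough, untested sketch of what I mean (zswap_shared_pool is a made-up
name, not something that exists today):

static struct zbud_pool *zswap_shared_pool;

/* in zswap_frontswap_init(), instead of one zbud_create_pool() per type: */
        if (!zswap_shared_pool)
                zswap_shared_pool = zbud_create_pool(GFP_KERNEL, &zswap_zbud_ops);
        tree->pool = zswap_shared_pool;

static bool zswap_is_full(void)
{
        /* one pool, one cheap size query */
        return totalram_pages * zswap_max_pool_percent / 100 <
                        zbud_get_pool_size(zswap_shared_pool);
}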

Just for your reference, Thanks!

> Thanks,
> -Bob
>
>>>
>>>
>>>>
>>>> BTW: I'm still prefer to use dynamic pool size, instead of use
>>>> zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
>>>> which will be more flexible to support this feature and page migration
>>>> as well.
>>>>
>>>>> Other question:
>>>>> Why do we need to update zswap_pool_pages too frequently?
>>>>> As I read the code, I think it's okay to update it only when user
>>>>> want to see it by debugfs and zswap_is_full is called.
>>>>> So could we optimize it out?
>>>>>
>>>>>>
>>>>>> Signed-off-by: Cai Liu <[email protected]>
>>>>
>>>> Reviewed-by: Bob Liu <[email protected]>
>>>
>>> Hmm, I really suprised you are okay in this code piece where we have
>>> unnecessary cost most of case(ie, most system has a swap device) in
>>> *mm* part.
>>>
>>> Anyway, I don't want to merge this patchset.
>>> If Andrew merge it and anybody doesn't do right work, I will send a patch.
>>> Cai, Could you redo a patch?
>>> I don't want to intercept your credit.
>>>
>>> Even, we could optimize to reduce the the number of call as I said in
>>> previous reply.
>>
>> You did it already. Please write it out in description.
>>

2014-01-14 07:10:10

by Cai Liu

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

2014/1/14 Bob Liu <[email protected]>:
>
> On 01/14/2014 01:05 PM, Minchan Kim wrote:
>> On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote:
>>> Hello Bob,
>>>
>>> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
>>>>
>>>> On 01/14/2014 07:35 AM, Minchan Kim wrote:
>>>>> Hello,
>>>>>
>>>>> On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
>>>>>> zswap can support multiple swapfiles. So we need to check
>>>>>> all zbud pool pages in zswap.
>>>>>
>>>>> True but this patch is rather costly that we should iterate
>>>>> zswap_tree[MAX_SWAPFILES] to check it. SIGH.
>>>>>
>>>>> How about defining zswap_tress as linked list instead of static
>>>>> array? Then, we could reduce unnecessary iteration too much.
>>>>>
>>>>
>>>> But if use linked list, it might not easy to access the tree like this:
>>>> struct zswap_tree *tree = zswap_trees[type];
>>>
>>> struct zswap_tree {
>>> ..
>>> ..
>>> struct list_head list;
>>> }
>>>
>>> zswap_frontswap_init()
>>> {
>>> ..
>>> ..
>>> zswap_trees[type] = tree;
>>> list_add(&tree->list, &zswap_list);
>>> }
>>>
>>> get_zswap_pool_pages(void)
>>> {
>>> struct zswap_tree *cur;
>>> list_for_each_entry(cur, &zswap_list, list) {
>>> pool_pages += zbud_get_pool_size(cur->pool);
>>> }
>>> return pool_pages;
>>> }
>
> Okay, I see your point. Yes, it's much better.
> Cai, Please make an new patch.
>

Thanks for your review.
I will re-send a patch.

Also, as Weijie mentioned in another mail, should we add an "all pool
pages" count on the zbud side? Then we could keep the zswap module
unchanged. I think this is reasonable, since zswap only needs to know
the total number of pages, not the size of each individual pool.
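
Something like this on the zbud side (zbud_get_total_pool_size() and
the counter are made up here, they do not exist yet):

/* in zbud.c */
static atomic_long_t zbud_total_pages = ATOMIC_LONG_INIT(0);

/* bumped wherever zbud allocates or frees an underlying page */
static inline void zbud_account_pages(long delta)
{
        atomic_long_add(delta, &zbud_total_pages);
}

u64 zbud_get_total_pool_size(void)
{
        return atomic_long_read(&zbud_total_pages);
}

Then zswap_is_full() could simply compare against
zbud_get_total_pool_size() without knowing about individual pools at all.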

Thanks

> Thanks,
> -Bob
>
>>>
>>>
>>>>
>>>> BTW: I'm still prefer to use dynamic pool size, instead of use
>>>> zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
>>>> which will be more flexible to support this feature and page migration
>>>> as well.
>>>>
>>>>> Other question:
>>>>> Why do we need to update zswap_pool_pages too frequently?
>>>>> As I read the code, I think it's okay to update it only when user
>>>>> want to see it by debugfs and zswap_is_full is called.
>>>>> So could we optimize it out?
>>>>>
>>>>>>
>>>>>> Signed-off-by: Cai Liu <[email protected]>
>>>>
>>>> Reviewed-by: Bob Liu <[email protected]>
>>>
>>> Hmm, I really suprised you are okay in this code piece where we have
>>> unnecessary cost most of case(ie, most system has a swap device) in
>>> *mm* part.
>>>
>>> Anyway, I don't want to merge this patchset.
>>> If Andrew merge it and anybody doesn't do right work, I will send a patch.
>>> Cai, Could you redo a patch?
>>> I don't want to intercept your credit.
>>>
>>> Even, we could optimize to reduce the the number of call as I said in
>>> previous reply.
>>
>> You did it already. Please write it out in description.
>>

2014-01-14 07:26:29

by Cai Liu

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

Hello, Kim

2014/1/14 Minchan Kim <[email protected]>:
> Hello Bob,
>
> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
>>
>> On 01/14/2014 07:35 AM, Minchan Kim wrote:
>> > Hello,
>> >
>> > On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
>> >> zswap can support multiple swapfiles. So we need to check
>> >> all zbud pool pages in zswap.
>> >
>> > True but this patch is rather costly that we should iterate
>> > zswap_tree[MAX_SWAPFILES] to check it. SIGH.
>> >
>> > How about defining zswap_tress as linked list instead of static
>> > array? Then, we could reduce unnecessary iteration too much.
>> >
>>
>> But if use linked list, it might not easy to access the tree like this:
>> struct zswap_tree *tree = zswap_trees[type];
>
> struct zswap_tree {
> ..
> ..
> struct list_head list;
> }
>
> zswap_frontswap_init()
> {
> ..
> ..
> zswap_trees[type] = tree;
> list_add(&tree->list, &zswap_list);
> }
>
> get_zswap_pool_pages(void)
> {
> struct zswap_tree *cur;
> list_for_each_entry(cur, &zswap_list, list) {
> pool_pages += zbud_get_pool_size(cur->pool);
> }
> return pool_pages;
> }
>
>
>>
>> BTW: I'm still prefer to use dynamic pool size, instead of use
>> zswap_is_full(). AFAIR, Seth has a plan to replace the rbtree with radix
>> which will be more flexible to support this feature and page migration
>> as well.
>>
>> > Other question:
>> > Why do we need to update zswap_pool_pages too frequently?
>> > As I read the code, I think it's okay to update it only when user
>> > want to see it by debugfs and zswap_is_full is called.
>> > So could we optimize it out?
>> >
>> >>
>> >> Signed-off-by: Cai Liu <[email protected]>
>>
>> Reviewed-by: Bob Liu <[email protected]>
>
> Hmm, I really suprised you are okay in this code piece where we have
> unnecessary cost most of case(ie, most system has a swap device) in
> *mm* part.
>
> Anyway, I don't want to merge this patchset.
> If Andrew merge it and anybody doesn't do right work, I will send a patch.
> Cai, Could you redo a patch?

Yes, unnecessary iteration is not good design.
I will redo this patch.

Thanks!

> I don't want to intercept your credit.
>
> Even, we could optimize to reduce the the number of call as I said in
> previous reply.
>
> Thanks.
>
>>
>> >> ---
>> >> mm/zswap.c | 18 +++++++++++++++---
>> >> 1 file changed, 15 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/mm/zswap.c b/mm/zswap.c
>> >> index d93afa6..2438344 100644
>> >> --- a/mm/zswap.c
>> >> +++ b/mm/zswap.c
>> >> @@ -291,7 +291,6 @@ static void zswap_free_entry(struct zswap_tree *tree,
>> >> zbud_free(tree->pool, entry->handle);
>> >> zswap_entry_cache_free(entry);
>> >> atomic_dec(&zswap_stored_pages);
>> >> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
>> >> }
>> >>
>> >> /* caller must hold the tree lock */
>> >> @@ -405,10 +404,24 @@ cleanup:
>> >> /*********************************
>> >> * helpers
>> >> **********************************/
>> >> +static u64 get_zswap_pool_pages(void)
>> >> +{
>> >> + int i;
>> >> + u64 pool_pages = 0;
>> >> +
>> >> + for (i = 0; i < MAX_SWAPFILES; i++) {
>> >> + if (zswap_trees[i])
>> >> + pool_pages += zbud_get_pool_size(zswap_trees[i]->pool);
>> >> + }
>> >> + zswap_pool_pages = pool_pages;
>> >> +
>> >> + return pool_pages;
>> >> +}
>> >> +
>> >> static bool zswap_is_full(void)
>> >> {
>> >> return (totalram_pages * zswap_max_pool_percent / 100 <
>> >> - zswap_pool_pages);
>> >> + get_zswap_pool_pages());
>> >> }
>> >>
>> >> /*********************************
>> >> @@ -716,7 +729,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>> >>
>> >> /* update stats */
>> >> atomic_inc(&zswap_stored_pages);
>> >> - zswap_pool_pages = zbud_get_pool_size(tree->pool);
>> >>
>> >> return 0;
>> >>
>> >> --
>> >> 1.7.10.4
>> --
>> Regards,
>> -Bob
>>
>
> --
> Kind regards,
> Minchan Kim

2014-01-15 05:17:14

by Minchan Kim

Subject: Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages

On Tue, Jan 14, 2014 at 02:15:44PM +0800, Weijie Yang wrote:
> On Tue, Jan 14, 2014 at 1:42 PM, Bob Liu <[email protected]> wrote:
> >
> > On 01/14/2014 01:05 PM, Minchan Kim wrote:
> >> On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote:
> >>> Hello Bob,
> >>>
> >>> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote:
> >>>>
> >>>> On 01/14/2014 07:35 AM, Minchan Kim wrote:
> >>>>> Hello,
> >>>>>
> >>>>> On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote:
> >>>>>> zswap can support multiple swapfiles. So we need to check
> >>>>>> all zbud pool pages in zswap.
> >>>>>
> >>>>> True but this patch is rather costly that we should iterate
> >>>>> zswap_tree[MAX_SWAPFILES] to check it. SIGH.
> >>>>>
> >>>>> How about defining zswap_tress as linked list instead of static
> >>>>> array? Then, we could reduce unnecessary iteration too much.
> >>>>>
> >>>>
> >>>> But if use linked list, it might not easy to access the tree like this:
> >>>> struct zswap_tree *tree = zswap_trees[type];
> >>>
> >>> struct zswap_tree {
> >>> ..
> >>> ..
> >>> struct list_head list;
> >>> }
> >>>
> >>> zswap_frontswap_init()
> >>> {
> >>> ..
> >>> ..
> >>> zswap_trees[type] = tree;
> >>> list_add(&tree->list, &zswap_list);
> >>> }
> >>>
> >>> get_zswap_pool_pages(void)
> >>> {
> >>> struct zswap_tree *cur;
> >>> list_for_each_entry(cur, &zswap_list, list) {
> >>> pool_pages += zbud_get_pool_size(cur->pool);
> >>> }
> >>> return pool_pages;
> >>> }
> >
> > Okay, I see your point. Yes, it's much better.
> > Cai, Please make an new patch.
>
> This improved patch could reduce unnecessary iteration too much.
>
> But I still have a question: why do we need so many zbud pools?
> How about use only one global zbud pool for all zswap_tree?
> I do not test it, but I think it can improve the strore density.

Just a quick glance,

I don't know how popular a multiple-swap configuration really is.
With your approach, what kinds of changes would we need in frontswap_invalidate_area()?
Would you encode the *type* into the offset of each entry?
Then we would always have to decode it whenever we need a search operation.
We would lose speed but gain density (though I'm not sure, since it depends on the workload)
for a rare configuration (i.e., multiple swap devices) and a rare event (i.e., swapoff).
It's just a question that popped up, not a strong objection.
Anyway, the point is that you can try it if you want and then report the numbers. :)
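
Purely for illustration, the encode/decode I have in mind would look
roughly like this (helper names and the bit split are made up; the type
only needs a handful of bits):

#define ZSWAP_TYPE_SHIFT        58

static inline u64 zswap_make_key(unsigned type, pgoff_t offset)
{
        return ((u64)type << ZSWAP_TYPE_SHIFT) | (u64)offset;
}

static inline unsigned zswap_key_type(u64 key)
{
        return key >> ZSWAP_TYPE_SHIFT;
}

static inline pgoff_t zswap_key_offset(u64 key)
{
        return key & ((1ULL << ZSWAP_TYPE_SHIFT) - 1);
}

Every lookup, and every frontswap_invalidate_area() walk, would then
have to go through helpers like these, which is the extra cost I mean.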

Thanks.

>
> Just for your reference, Thanks!
>
--
Kind regards,
Minchan Kim