2012-06-10 10:50:04

by Sasha Levin

Subject: [PATCH v3 00/10] minor frontswap cleanups and tracing support

Most of these patches are minor cleanups to the mm/frontswap.c code; the big
chunk of new code can be attributed to the new tracing support.

Changes in v3:
- Fix merge error
- Comment about the new spinlock assertions

Changes in v2:
- Rebase to current version
- Address Konrad's comments

Sasha Levin (10):
mm: frontswap: remove casting from function calls through ops
structure
mm: frontswap: trivial coding convention issues
mm: frontswap: split out __frontswap_curr_pages
mm: frontswap: split out __frontswap_unuse_pages
mm: frontswap: split frontswap_shrink further to simplify locking
mm: frontswap: make all branches of if statement in put page
consistent
mm: frontswap: remove unnecessary check during initialization
mm: frontswap: add tracing support
mm: frontswap: split out function to clear a page out
mm: frontswap: remove unneeded headers

include/trace/events/frontswap.h | 167 ++++++++++++++++++++++++++++++++++++++
mm/frontswap.c | 162 +++++++++++++++++++++++-------------
2 files changed, 270 insertions(+), 59 deletions(-)
create mode 100644 include/trace/events/frontswap.h

--
1.7.8.6


2012-06-10 10:50:08

by Sasha Levin

Subject: [PATCH v3 02/10] mm: frontswap: trivial coding convention issues

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 557e8af4..7ec53d5 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -148,8 +148,9 @@ int __frontswap_store(struct page *page)
frontswap_clear(sis, offset);
atomic_dec(&sis->frontswap_pages);
inc_frontswap_failed_stores();
- } else
+ } else {
inc_frontswap_failed_stores();
+ }
if (frontswap_writethrough_enabled)
/* report failure so swap also writes to swap device */
ret = -1;
@@ -250,9 +251,9 @@ void frontswap_shrink(unsigned long target_pages)
for (type = swap_list.head; type >= 0; type = si->next) {
si = swap_info[type];
si_frontswap_pages = atomic_read(&si->frontswap_pages);
- if (total_pages_to_unuse < si_frontswap_pages)
+ if (total_pages_to_unuse < si_frontswap_pages) {
pages = pages_to_unuse = total_pages_to_unuse;
- else {
+ } else {
pages = si_frontswap_pages;
pages_to_unuse = 0; /* unuse all */
}
--
1.7.8.6

2012-06-10 10:50:18

by Sasha Levin

Subject: [PATCH v3 08/10] mm: frontswap: add tracing support

Add tracepoints to frontswap API.

Signed-off-by: Sasha Levin <[email protected]>
---
include/trace/events/frontswap.h | 167 ++++++++++++++++++++++++++++++++++++++
mm/frontswap.c | 14 +++
2 files changed, 181 insertions(+), 0 deletions(-)
create mode 100644 include/trace/events/frontswap.h

diff --git a/include/trace/events/frontswap.h b/include/trace/events/frontswap.h
new file mode 100644
index 0000000..2e5efab
--- /dev/null
+++ b/include/trace/events/frontswap.h
@@ -0,0 +1,167 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM frontswap
+
+#if !defined(_TRACE_FRONTSWAP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_FRONTSWAP_H
+
+#include <linux/tracepoint.h>
+
+struct frontswap_ops;
+
+TRACE_EVENT(frontswap_init,
+ TP_PROTO(unsigned int type, void *sis, void *frontswap_map),
+ TP_ARGS(type, sis, frontswap_map),
+
+ TP_STRUCT__entry(
+ __field( unsigned int, type )
+ __field( void *, sis )
+ __field( void *, frontswap_map )
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->sis = sis;
+ __entry->frontswap_map = frontswap_map;
+ ),
+
+ TP_printk("type: %u sis: %p frontswap_map: %p",
+ __entry->type, __entry->sis, __entry->frontswap_map)
+);
+
+TRACE_EVENT(frontswap_register_ops,
+ TP_PROTO(struct frontswap_ops *old, struct frontswap_ops *new),
+ TP_ARGS(old, new),
+
+ TP_STRUCT__entry(
+ __field(struct frontswap_ops *, old )
+ __field(struct frontswap_ops *, new )
+ ),
+
+ TP_fast_assign(
+ __entry->old = old;
+ __entry->new = new;
+ ),
+
+ TP_printk("old: {init=%p store=%p load=%p invalidate_page=%p invalidate_area=%p}"
+ " new: {init=%p store=%p load=%p invalidate_page=%p invalidate_area=%p}",
+ __entry->old->init,__entry->old->store,__entry->old->load,
+ __entry->old->invalidate_page,__entry->old->invalidate_area,__entry->new->init,
+ __entry->new->store,__entry->new->load,__entry->new->invalidate_page,
+ __entry->new->invalidate_area)
+);
+
+TRACE_EVENT(frontswap_store,
+ TP_PROTO(void *page, int dup, int ret),
+ TP_ARGS(page, dup, ret),
+
+ TP_STRUCT__entry(
+ __field( int, dup )
+ __field( int, ret )
+ __field( void *, page )
+ ),
+
+ TP_fast_assign(
+ __entry->dup = dup;
+ __entry->ret = ret;
+ __entry->page = page;
+ ),
+
+ TP_printk("page: %p dup: %d ret: %d",
+ __entry->page, __entry->dup, __entry->ret)
+);
+
+TRACE_EVENT(frontswap_load,
+ TP_PROTO(void *page, int ret),
+ TP_ARGS(page, ret),
+
+ TP_STRUCT__entry(
+ __field( int, ret )
+ __field( void *, page )
+ ),
+
+ TP_fast_assign(
+ __entry->ret = ret;
+ __entry->page = page;
+ ),
+
+ TP_printk("page: %p ret: %d",
+ __entry->page, __entry->ret)
+);
+
+TRACE_EVENT(frontswap_invalidate_page,
+ TP_PROTO(int type, unsigned long offset, void *sis, int test),
+ TP_ARGS(type, offset, sis, test),
+
+ TP_STRUCT__entry(
+ __field( int, type )
+ __field( unsigned long, offset )
+ __field( void *, sis )
+ __field( int, test )
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->offset = offset;
+ __entry->sis = sis;
+ __entry->test = test;
+ ),
+
+ TP_printk("type: %d offset: %lu sys: %p frontswap_test: %d",
+ __entry->type, __entry->offset, __entry->sis, __entry->test)
+);
+
+TRACE_EVENT(frontswap_invalidate_area,
+ TP_PROTO(int type, void *sis, void *map),
+ TP_ARGS(type, sis, map),
+
+ TP_STRUCT__entry(
+ __field( int, type )
+ __field( void *, map )
+ __field( void *, sis )
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->sis = sis;
+ __entry->map = map;
+ ),
+
+ TP_printk("type: %d sys: %p map: %p",
+ __entry->type, __entry->sis, __entry->map)
+);
+
+TRACE_EVENT(frontswap_curr_pages,
+ TP_PROTO(unsigned long totalpages),
+ TP_ARGS(totalpages),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, totalpages )
+ ),
+
+ TP_fast_assign(
+ __entry->totalpages = totalpages;
+ ),
+
+ TP_printk("total pages: %lu",
+ __entry->totalpages)
+);
+
+TRACE_EVENT(frontswap_shrink,
+ TP_PROTO(unsigned long target_pages),
+ TP_ARGS(target_pages),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, target_pages )
+ ),
+
+ TP_fast_assign(
+ __entry->target_pages = target_pages;
+ ),
+
+ TP_printk("target pages: %lu",
+ __entry->target_pages)
+);
+
+#endif /* _TRACE_FRONTSWAP_H */
+
+#include <trace/define_trace.h>
diff --git a/mm/frontswap.c b/mm/frontswap.c
index 7c26e89..7da55a3 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -11,6 +11,7 @@
* This work is licensed under the terms of the GNU GPL, version 2.
*/

+#define CREATE_TRACE_POINTS
#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/swap.h>
@@ -23,6 +24,7 @@
#include <linux/debugfs.h>
#include <linux/frontswap.h>
#include <linux/swapfile.h>
+#include <trace/events/frontswap.h>

/*
* frontswap_ops is set by frontswap_register_ops to contain the pointers
@@ -85,6 +87,7 @@ struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
{
struct frontswap_ops old = frontswap_ops;

+ trace_frontswap_register_ops(&old, ops);
frontswap_ops = *ops;
frontswap_enabled = true;
return old;
@@ -108,6 +111,9 @@ void __frontswap_init(unsigned type)
struct swap_info_struct *sis = swap_info[type];

BUG_ON(sis == NULL);
+
+ trace_frontswap_init(type, sis, sis->frontswap_map);
+
if (sis->frontswap_map == NULL)
return;
frontswap_ops.init(type);
@@ -134,6 +140,7 @@ int __frontswap_store(struct page *page)
if (frontswap_test(sis, offset))
dup = 1;
ret = frontswap_ops.store(type, offset, page);
+ trace_frontswap_store(page, dup, ret);
if (ret == 0) {
frontswap_set(sis, offset);
inc_frontswap_succ_stores();
@@ -174,6 +181,7 @@ int __frontswap_load(struct page *page)
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
ret = frontswap_ops.load(type, offset, page);
+ trace_frontswap_load(page, ret);
if (ret == 0)
inc_frontswap_loads();
return ret;
@@ -189,6 +197,7 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
struct swap_info_struct *sis = swap_info[type];

BUG_ON(sis == NULL);
+ trace_frontswap_invalidate_page(type, offset, sis, frontswap_test(sis, offset));
if (frontswap_test(sis, offset)) {
frontswap_ops.invalidate_page(type, offset);
atomic_dec(&sis->frontswap_pages);
@@ -207,6 +216,7 @@ void __frontswap_invalidate_area(unsigned type)
struct swap_info_struct *sis = swap_info[type];

BUG_ON(sis == NULL);
+ trace_frontswap_invalidate_area(type, sis, sis->frontswap_map);
if (sis->frontswap_map == NULL)
return;
frontswap_ops.invalidate_area(type);
@@ -295,6 +305,8 @@ void frontswap_shrink(unsigned long target_pages)
unsigned long pages_to_unuse = 0;
int type, ret;

+ trace_frontswap_shrink(target_pages);
+
/*
* we don't want to hold swap_lock while doing a very
* lengthy try_to_unuse, but swap_list may change
@@ -322,6 +334,8 @@ unsigned long frontswap_curr_pages(void)
totalpages = __frontswap_curr_pages();
spin_unlock(&swap_lock);

+ trace_frontswap_curr_pages(totalpages);
+
return totalpages;
}
EXPORT_SYMBOL(frontswap_curr_pages);
--
1.7.8.6

2012-06-10 10:50:23

by Sasha Levin

Subject: [PATCH v3 09/10] mm: frontswap: split out function to clear a page out

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 7da55a3..c056f6e 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -120,6 +120,12 @@ void __frontswap_init(unsigned type)
}
EXPORT_SYMBOL(__frontswap_init);

+static inline void __frontswap_clear(struct swap_info_struct *sis, pgoff_t offset)
+{
+ frontswap_clear(sis, offset);
+ atomic_dec(&sis->frontswap_pages);
+}
+
/*
* "Store" data from a page to frontswap and associate it with the page's
* swaptype and offset. Page must be locked and in the swap cache.
@@ -152,10 +158,8 @@ int __frontswap_store(struct page *page)
the (older) page from frontswap
*/
inc_frontswap_failed_stores();
- if (dup) {
- frontswap_clear(sis, offset);
- atomic_dec(&sis->frontswap_pages);
- }
+ if (dup)
+ __frontswap_clear(sis, offset);
}
if (frontswap_writethrough_enabled)
/* report failure so swap also writes to swap device */
@@ -200,8 +204,7 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
trace_frontswap_invalidate_page(type, offset, sis, frontswap_test(sis, offset));
if (frontswap_test(sis, offset)) {
frontswap_ops.invalidate_page(type, offset);
- atomic_dec(&sis->frontswap_pages);
- frontswap_clear(sis, offset);
+ __frontswap_clear(sis, offset);
inc_frontswap_invalidates();
}
}
--
1.7.8.6

2012-06-10 10:50:30

by Sasha Levin

Subject: [PATCH v3 10/10] mm: frontswap: remove unneeded headers

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index c056f6e..9b667a4 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -12,15 +12,11 @@
*/

#define CREATE_TRACE_POINTS
-#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/swap.h>
#include <linux/swapops.h>
-#include <linux/proc_fs.h>
#include <linux/security.h>
-#include <linux/capability.h>
#include <linux/module.h>
-#include <linux/uaccess.h>
#include <linux/debugfs.h>
#include <linux/frontswap.h>
#include <linux/swapfile.h>
--
1.7.8.6

2012-06-10 10:50:15

by Sasha Levin

Subject: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

An attempt at making frontswap_shrink shorter and more readable. This patch
splits out walking through the swap list to find an entry with enough
pages to unuse.

Also, assert that the internal __frontswap_unuse_pages is called under swap
lock, since that part of the code previously ran directly inside the lock.

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 59 +++++++++++++++++++++++++++++++++++++-------------------
1 files changed, 39 insertions(+), 20 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 5faf840..faa43b7 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -230,6 +230,41 @@ static unsigned long __frontswap_curr_pages(void)
return totalpages;
}

+static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
+ int *swapid)
+{
+ int ret = -EINVAL;
+ struct swap_info_struct *si = NULL;
+ int si_frontswap_pages;
+ unsigned long total_pages_to_unuse = total;
+ unsigned long pages = 0, pages_to_unuse = 0;
+ int type;
+
+ assert_spin_locked(&swap_lock);
+ for (type = swap_list.head; type >= 0; type = si->next) {
+ si = swap_info[type];
+ si_frontswap_pages = atomic_read(&si->frontswap_pages);
+ if (total_pages_to_unuse < si_frontswap_pages) {
+ pages = pages_to_unuse = total_pages_to_unuse;
+ } else {
+ pages = si_frontswap_pages;
+ pages_to_unuse = 0; /* unuse all */
+ }
+ /* ensure there is enough RAM to fetch pages from frontswap */
+ if (security_vm_enough_memory_mm(current->mm, pages)) {
+ ret = -ENOMEM;
+ continue;
+ }
+ vm_unacct_memory(pages);
+ *unused = pages_to_unuse;
+ *swapid = type;
+ ret = 0;
+ break;
+ }
+
+ return ret;
+}
+
/*
* Frontswap, like a true swap device, may unnecessarily retain pages
* under certain circumstances; "shrink" frontswap is essentially a
@@ -240,11 +275,9 @@ static unsigned long __frontswap_curr_pages(void)
*/
void frontswap_shrink(unsigned long target_pages)
{
- struct swap_info_struct *si = NULL;
- int si_frontswap_pages;
unsigned long total_pages = 0, total_pages_to_unuse;
- unsigned long pages = 0, pages_to_unuse = 0;
- int type;
+ unsigned long pages_to_unuse = 0;
+ int type, ret;
bool locked = false;

/*
@@ -258,22 +291,8 @@ void frontswap_shrink(unsigned long target_pages)
if (total_pages <= target_pages)
goto out;
total_pages_to_unuse = total_pages - target_pages;
- for (type = swap_list.head; type >= 0; type = si->next) {
- si = swap_info[type];
- si_frontswap_pages = atomic_read(&si->frontswap_pages);
- if (total_pages_to_unuse < si_frontswap_pages) {
- pages = pages_to_unuse = total_pages_to_unuse;
- } else {
- pages = si_frontswap_pages;
- pages_to_unuse = 0; /* unuse all */
- }
- /* ensure there is enough RAM to fetch pages from frontswap */
- if (security_vm_enough_memory_mm(current->mm, pages))
- continue;
- vm_unacct_memory(pages);
- break;
- }
- if (type < 0)
+ ret = __frontswap_unuse_pages(total_pages_to_unuse, &pages_to_unuse, &type);
+ if (ret < 0)
goto out;
locked = false;
spin_unlock(&swap_lock);
--
1.7.8.6

2012-06-10 10:51:18

by Sasha Levin

Subject: [PATCH v3 07/10] mm: frontswap: remove unnecessary check during initialization

Whether frontswap is enabled or not is checked in the API functions in
the frontswap header, before the calls are passed on to the internal
double-underscored frontswap functions.

Remove the check from __frontswap_init for consistency.
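
For reference, the wrapper in the frontswap header looks roughly like the
sketch below (include/linux/frontswap.h; the exact guard may differ slightly
in the current tree), which is why the double-underscored function no longer
needs its own check:

	static inline void frontswap_init(unsigned type)
	{
		if (frontswap_enabled)
			__frontswap_init(type);
	}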

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index d8dc986..7c26e89 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -110,8 +110,7 @@ void __frontswap_init(unsigned type)
BUG_ON(sis == NULL);
if (sis->frontswap_map == NULL)
return;
- if (frontswap_enabled)
- frontswap_ops.init(type);
+ frontswap_ops.init(type);
}
EXPORT_SYMBOL(__frontswap_init);

--
1.7.8.6

2012-06-10 10:50:13

by Sasha Levin

Subject: [PATCH v3 05/10] mm: frontswap: split frontswap_shrink further to simplify locking

Split frontswap_shrink to simplify the locking in the original code.

Also, assert that the function that was split still runs under the
swap spinlock.

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 36 +++++++++++++++++++++---------------
1 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index faa43b7..e6353d9 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -265,6 +265,24 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
return ret;
}

+static int __frontswap_shrink(unsigned long target_pages,
+ unsigned long *pages_to_unuse,
+ int *type)
+{
+ unsigned long total_pages = 0, total_pages_to_unuse;
+
+ assert_spin_locked(&swap_lock);
+
+ total_pages = __frontswap_curr_pages();
+ if (total_pages <= target_pages) {
+ /* Nothing to do */
+ *pages_to_unuse = 0;
+ return 0;
+ }
+ total_pages_to_unuse = total_pages - target_pages;
+ return __frontswap_unuse_pages(total_pages_to_unuse, pages_to_unuse, type);
+}
+
/*
* Frontswap, like a true swap device, may unnecessarily retain pages
* under certain circumstances; "shrink" frontswap is essentially a
@@ -275,10 +293,8 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
*/
void frontswap_shrink(unsigned long target_pages)
{
- unsigned long total_pages = 0, total_pages_to_unuse;
unsigned long pages_to_unuse = 0;
int type, ret;
- bool locked = false;

/*
* we don't want to hold swap_lock while doing a very
@@ -286,20 +302,10 @@ void frontswap_shrink(unsigned long target_pages)
* so restart scan from swap_list.head each time
*/
spin_lock(&swap_lock);
- locked = true;
- total_pages = __frontswap_curr_pages();
- if (total_pages <= target_pages)
- goto out;
- total_pages_to_unuse = total_pages - target_pages;
- ret = __frontswap_unuse_pages(total_pages_to_unuse, &pages_to_unuse, &type);
- if (ret < 0)
- goto out;
- locked = false;
+ ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
spin_unlock(&swap_lock);
- try_to_unuse(type, true, pages_to_unuse);
-out:
- if (locked)
- spin_unlock(&swap_lock);
+ if (ret == 0 && pages_to_unuse)
+ try_to_unuse(type, true, pages_to_unuse);
return;
}
EXPORT_SYMBOL(frontswap_shrink);
--
1.7.8.6

2012-06-10 10:51:48

by Sasha Levin

Subject: [PATCH v3 06/10] mm: frontswap: make all branches of if statement in put page consistent

Currently it has a complex structure where different things are compared
at each branch. Simplify that and make both branches look similar.

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index e6353d9..d8dc986 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -140,16 +140,16 @@ int __frontswap_store(struct page *page)
inc_frontswap_succ_stores();
if (!dup)
atomic_inc(&sis->frontswap_pages);
- } else if (dup) {
+ } else {
/*
failed dup always results in automatic invalidate of
the (older) page from frontswap
*/
- frontswap_clear(sis, offset);
- atomic_dec(&sis->frontswap_pages);
- inc_frontswap_failed_stores();
- } else {
inc_frontswap_failed_stores();
+ if (dup) {
+ frontswap_clear(sis, offset);
+ atomic_dec(&sis->frontswap_pages);
+ }
}
if (frontswap_writethrough_enabled)
/* report failure so swap also writes to swap device */
--
1.7.8.6

2012-06-10 10:52:08

by Sasha Levin

Subject: [PATCH v3 03/10] mm: frontswap: split out __frontswap_curr_pages

Code was duplicated in two functions, clean it up.

Also, assert that the deduplicated code runs under the swap spinlock.

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 28 +++++++++++++++++-----------
1 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 7ec53d5..5faf840 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -216,6 +216,20 @@ void __frontswap_invalidate_area(unsigned type)
}
EXPORT_SYMBOL(__frontswap_invalidate_area);

+static unsigned long __frontswap_curr_pages(void)
+{
+ int type;
+ unsigned long totalpages = 0;
+ struct swap_info_struct *si = NULL;
+
+ assert_spin_locked(&swap_lock);
+ for (type = swap_list.head; type >= 0; type = si->next) {
+ si = swap_info[type];
+ totalpages += atomic_read(&si->frontswap_pages);
+ }
+ return totalpages;
+}
+
/*
* Frontswap, like a true swap device, may unnecessarily retain pages
* under certain circumstances; "shrink" frontswap is essentially a
@@ -240,11 +254,7 @@ void frontswap_shrink(unsigned long target_pages)
*/
spin_lock(&swap_lock);
locked = true;
- total_pages = 0;
- for (type = swap_list.head; type >= 0; type = si->next) {
- si = swap_info[type];
- total_pages += atomic_read(&si->frontswap_pages);
- }
+ total_pages = __frontswap_curr_pages();
if (total_pages <= target_pages)
goto out;
total_pages_to_unuse = total_pages - target_pages;
@@ -282,16 +292,12 @@ EXPORT_SYMBOL(frontswap_shrink);
*/
unsigned long frontswap_curr_pages(void)
{
- int type;
unsigned long totalpages = 0;
- struct swap_info_struct *si = NULL;

spin_lock(&swap_lock);
- for (type = swap_list.head; type >= 0; type = si->next) {
- si = swap_info[type];
- totalpages += atomic_read(&si->frontswap_pages);
- }
+ totalpages = __frontswap_curr_pages();
spin_unlock(&swap_lock);
+
return totalpages;
}
EXPORT_SYMBOL(frontswap_curr_pages);
--
1.7.8.6

2012-06-10 10:52:32

by Sasha Levin

Subject: [PATCH v3 01/10] mm: frontswap: remove casting from function calls through ops structure

Removes unneeded casts.

Signed-off-by: Sasha Levin <[email protected]>
---
mm/frontswap.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index e250255..557e8af4 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -111,7 +111,7 @@ void __frontswap_init(unsigned type)
if (sis->frontswap_map == NULL)
return;
if (frontswap_enabled)
- (*frontswap_ops.init)(type);
+ frontswap_ops.init(type);
}
EXPORT_SYMBOL(__frontswap_init);

@@ -134,7 +134,7 @@ int __frontswap_store(struct page *page)
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
dup = 1;
- ret = (*frontswap_ops.store)(type, offset, page);
+ ret = frontswap_ops.store(type, offset, page);
if (ret == 0) {
frontswap_set(sis, offset);
inc_frontswap_succ_stores();
@@ -173,7 +173,7 @@ int __frontswap_load(struct page *page)
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
- ret = (*frontswap_ops.load)(type, offset, page);
+ ret = frontswap_ops.load(type, offset, page);
if (ret == 0)
inc_frontswap_loads();
return ret;
@@ -190,7 +190,7 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)

BUG_ON(sis == NULL);
if (frontswap_test(sis, offset)) {
- (*frontswap_ops.invalidate_page)(type, offset);
+ frontswap_ops.invalidate_page(type, offset);
atomic_dec(&sis->frontswap_pages);
frontswap_clear(sis, offset);
inc_frontswap_invalidates();
@@ -209,7 +209,7 @@ void __frontswap_invalidate_area(unsigned type)
BUG_ON(sis == NULL);
if (sis->frontswap_map == NULL)
return;
- (*frontswap_ops.invalidate_area)(type);
+ frontswap_ops.invalidate_area(type);
atomic_set(&sis->frontswap_pages, 0);
memset(sis->frontswap_map, 0, sis->max / sizeof(long));
}
--
1.7.8.6

2012-06-11 05:21:03

by Minchan Kim

Subject: Re: [PATCH v3 01/10] mm: frontswap: remove casting from function calls through ops structure

On 06/10/2012 07:50 PM, Sasha Levin wrote:

> Removes unneeded casts.
>
> Signed-off-by: Sasha Levin <[email protected]>

Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2012-06-11 05:24:45

by Minchan Kim

Subject: Re: [PATCH v3 02/10] mm: frontswap: trivial coding convention issues

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Signed-off-by: Sasha Levin <[email protected]>

Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2012-06-11 05:28:55

by Minchan Kim

Subject: Re: [PATCH v3 03/10] mm: frontswap: split out __frontswap_curr_pages

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Code was duplicated in two functions, clean it up.
>
> Also, assert that the deduplicated code runs under the swap spinlock.
>
> Signed-off-by: Sasha Levin <[email protected]>

Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2012-06-11 05:43:11

by Minchan Kim

Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> An attempt at making frontswap_shrink shorter and more readable. This patch
> splits out walking through the swap list to find an entry with enough
> pages to unuse.
>
> Also, assert that the internal __frontswap_unuse_pages is called under swap
> lock, since that part of the code previously ran directly inside the lock.
>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> mm/frontswap.c | 59 +++++++++++++++++++++++++++++++++++++-------------------
> 1 files changed, 39 insertions(+), 20 deletions(-)
>
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 5faf840..faa43b7 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -230,6 +230,41 @@ static unsigned long __frontswap_curr_pages(void)
> return totalpages;
> }
>
> +static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> + int *swapid)


Normally, we use "unsigned int type" instead of swapid.
I admit the naming is rather awkward but that should be another patch.
So let's keep consistency with swap subsystem.

> +{
> + int ret = -EINVAL;
> + struct swap_info_struct *si = NULL;
> + int si_frontswap_pages;
> + unsigned long total_pages_to_unuse = total;
> + unsigned long pages = 0, pages_to_unuse = 0;
> + int type;
> +
> + assert_spin_locked(&swap_lock);


Normally, we should use this assertion when it's hard to tell whether swap_lock
is held, e.g. because of a complicated call depth or an unexpected use-case like
a general-purpose function. But I expect this function's callers to be very
limited, not complicated. Isn't just writing down a comment enough?


> + for (type = swap_list.head; type >= 0; type = si->next) {
> + si = swap_info[type];
> + si_frontswap_pages = atomic_read(&si->frontswap_pages);
> + if (total_pages_to_unuse < si_frontswap_pages) {
> + pages = pages_to_unuse = total_pages_to_unuse;
> + } else {
> + pages = si_frontswap_pages;
> + pages_to_unuse = 0; /* unuse all */
> + }
> + /* ensure there is enough RAM to fetch pages from frontswap */
> + if (security_vm_enough_memory_mm(current->mm, pages)) {
> + ret = -ENOMEM;


Nitpick:
I am not sure detailed error returning would be good.
The caller doesn't care about it now, but it might consider it in the future.
Hmm,

> + continue;
> + }
> + vm_unacct_memory(pages);
> + *unused = pages_to_unuse;
> + *swapid = type;
> + ret = 0;
> + break;
> + }
> +
> + return ret;
> +}
> +
> /*
> * Frontswap, like a true swap device, may unnecessarily retain pages
> * under certain circumstances; "shrink" frontswap is essentially a
> @@ -240,11 +275,9 @@ static unsigned long __frontswap_curr_pages(void)
> */
> void frontswap_shrink(unsigned long target_pages)
> {
> - struct swap_info_struct *si = NULL;
> - int si_frontswap_pages;
> unsigned long total_pages = 0, total_pages_to_unuse;
> - unsigned long pages = 0, pages_to_unuse = 0;
> - int type;
> + unsigned long pages_to_unuse = 0;
> + int type, ret;
> bool locked = false;
>
> /*
> @@ -258,22 +291,8 @@ void frontswap_shrink(unsigned long target_pages)
> if (total_pages <= target_pages)
> goto out;
> total_pages_to_unuse = total_pages - target_pages;
> - for (type = swap_list.head; type >= 0; type = si->next) {
> - si = swap_info[type];
> - si_frontswap_pages = atomic_read(&si->frontswap_pages);
> - if (total_pages_to_unuse < si_frontswap_pages) {
> - pages = pages_to_unuse = total_pages_to_unuse;
> - } else {
> - pages = si_frontswap_pages;
> - pages_to_unuse = 0; /* unuse all */
> - }
> - /* ensure there is enough RAM to fetch pages from frontswap */
> - if (security_vm_enough_memory_mm(current->mm, pages))
> - continue;
> - vm_unacct_memory(pages);
> - break;
> - }
> - if (type < 0)
> + ret = __frontswap_unuse_pages(total_pages_to_unuse, &pages_to_unuse, &type);
> + if (ret < 0)
> goto out;
> locked = false;
> spin_unlock(&swap_lock);



--
Kind regards,
Minchan Kim

2012-06-11 05:49:10

by Minchan Kim

Subject: Re: [PATCH v3 05/10] mm: frontswap: split frontswap_shrink further to simplify locking

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Split frontswap_shrink to simplify the locking in the original code.
>
> Also, assert that the function that was split still runs under the
> swap spinlock.
>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> mm/frontswap.c | 36 +++++++++++++++++++++---------------
> 1 files changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index faa43b7..e6353d9 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -265,6 +265,24 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> return ret;
> }
>
> +static int __frontswap_shrink(unsigned long target_pages,
> + unsigned long *pages_to_unuse,
> + int *type)


__frontswap_shrink isn't a good name.
This function doesn't shrink anything at all.

How about __frontswap_shrink_pages with a description of the function?


> +{
> + unsigned long total_pages = 0, total_pages_to_unuse;
> +
> + assert_spin_locked(&swap_lock);


About assertion, it's ditto with my previous reply.

> +
> + total_pages = __frontswap_curr_pages();
> + if (total_pages <= target_pages) {
> + /* Nothing to do */
> + *pages_to_unuse = 0;
> + return 0;
> + }
> + total_pages_to_unuse = total_pages - target_pages;
> + return __frontswap_unuse_pages(total_pages_to_unuse, pages_to_unuse, type);
> +}
> +
> /*
> * Frontswap, like a true swap device, may unnecessarily retain pages
> * under certain circumstances; "shrink" frontswap is essentially a
> @@ -275,10 +293,8 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> */
> void frontswap_shrink(unsigned long target_pages)
> {
> - unsigned long total_pages = 0, total_pages_to_unuse;
> unsigned long pages_to_unuse = 0;
> int type, ret;
> - bool locked = false;
>
> /*
> * we don't want to hold swap_lock while doing a very
> @@ -286,20 +302,10 @@ void frontswap_shrink(unsigned long target_pages)
> * so restart scan from swap_list.head each time
> */
> spin_lock(&swap_lock);
> - locked = true;
> - total_pages = __frontswap_curr_pages();
> - if (total_pages <= target_pages)
> - goto out;
> - total_pages_to_unuse = total_pages - target_pages;
> - ret = __frontswap_unuse_pages(total_pages_to_unuse, &pages_to_unuse, &type);
> - if (ret < 0)
> - goto out;
> - locked = false;
> + ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> spin_unlock(&swap_lock);
> - try_to_unuse(type, true, pages_to_unuse);
> -out:
> - if (locked)
> - spin_unlock(&swap_lock);
> + if (ret == 0 && pages_to_unuse)
> + try_to_unuse(type, true, pages_to_unuse);
> return;
> }
> EXPORT_SYMBOL(frontswap_shrink);



--
Kind regards,
Minchan Kim

2012-06-11 05:52:13

by Minchan Kim

Subject: Re: [PATCH v3 06/10] mm: frontswap: make all branches of if statement in put page consistent

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Currently it has a complex structure where different things are compared
> at each branch. Simplify that and make both branches look similar.
>
> Signed-off-by: Sasha Levin <[email protected]>

Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2012-06-11 05:54:26

by Minchan Kim

Subject: Re: [PATCH v3 07/10] mm: frontswap: remove unnecessary check during initialization

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Whether frontswap is enabled or not is checked in the API functions in
> the frontswap header, before the calls are passed on to the internal
> double-underscored frontswap functions.
>
> Remove the check from __frontswap_init for consistency.

>

> Signed-off-by: Sasha Levin <[email protected]>

Reviewed-by: Minchan Kim <[email protected]>

--
Kind regards,
Minchan Kim

2012-06-11 06:12:41

by Minchan Kim

Subject: Re: [PATCH v3 08/10] mm: frontswap: add tracing support

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Add tracepoints to frontswap API.
>
> Signed-off-by: Sasha Levin <[email protected]>


Normally, adding a new tracepoint isn't easy without a special reason.
I'm not sure all of this frontswap function tracing would be valuable.
Sasha, why do you want to add tracing?
What's the scenario where you want to use tracing?

> ---
> include/trace/events/frontswap.h | 167 ++++++++++++++++++++++++++++++++++++++
> mm/frontswap.c | 14 +++
> 2 files changed, 181 insertions(+), 0 deletions(-)
> create mode 100644 include/trace/events/frontswap.h
>
> diff --git a/include/trace/events/frontswap.h b/include/trace/events/frontswap.h
> new file mode 100644
> index 0000000..2e5efab
> --- /dev/null
> +++ b/include/trace/events/frontswap.h
> @@ -0,0 +1,167 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM frontswap
> +
> +#if !defined(_TRACE_FRONTSWAP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_FRONTSWAP_H
> +
> +#include <linux/tracepoint.h>
> +
> +struct frontswap_ops;
> +
> +TRACE_EVENT(frontswap_init,
> + TP_PROTO(unsigned int type, void *sis, void *frontswap_map),
> + TP_ARGS(type, sis, frontswap_map),
> +
> + TP_STRUCT__entry(
> + __field( unsigned int, type )
> + __field( void *, sis )
> + __field( void *, frontswap_map )
> + ),
> +
> + TP_fast_assign(
> + __entry->type = type;
> + __entry->sis = sis;
> + __entry->frontswap_map = frontswap_map;
> + ),
> +
> + TP_printk("type: %u sis: %p frontswap_map: %p",
> + __entry->type, __entry->sis, __entry->frontswap_map)
> +);
> +
> +TRACE_EVENT(frontswap_register_ops,
> + TP_PROTO(struct frontswap_ops *old, struct frontswap_ops *new),
> + TP_ARGS(old, new),
> +
> + TP_STRUCT__entry(
> + __field(struct frontswap_ops *, old )
> + __field(struct frontswap_ops *, new )
> + ),
> +
> + TP_fast_assign(
> + __entry->old = old;
> + __entry->new = new;
> + ),
> +
> + TP_printk("old: {init=%p store=%p load=%p invalidate_page=%p invalidate_area=%p}"
> + " new: {init=%p store=%p load=%p invalidate_page=%p invalidate_area=%p}",
> + __entry->old->init,__entry->old->store,__entry->old->load,
> + __entry->old->invalidate_page,__entry->old->invalidate_area,__entry->new->init,
> + __entry->new->store,__entry->new->load,__entry->new->invalidate_page,
> + __entry->new->invalidate_area)
> +);
> +
> +TRACE_EVENT(frontswap_store,
> + TP_PROTO(void *page, int dup, int ret),
> + TP_ARGS(page, dup, ret),
> +
> + TP_STRUCT__entry(
> + __field( int, dup )
> + __field( int, ret )
> + __field( void *, page )
> + ),
> +
> + TP_fast_assign(
> + __entry->dup = dup;
> + __entry->ret = ret;
> + __entry->page = page;
> + ),
> +
> + TP_printk("page: %p dup: %d ret: %d",
> + __entry->page, __entry->dup, __entry->ret)
> +);
> +
> +TRACE_EVENT(frontswap_load,
> + TP_PROTO(void *page, int ret),
> + TP_ARGS(page, ret),
> +
> + TP_STRUCT__entry(
> + __field( int, ret )
> + __field( void *, page )
> + ),
> +
> + TP_fast_assign(
> + __entry->ret = ret;
> + __entry->page = page;
> + ),
> +
> + TP_printk("page: %p ret: %d",
> + __entry->page, __entry->ret)
> +);
> +
> +TRACE_EVENT(frontswap_invalidate_page,
> + TP_PROTO(int type, unsigned long offset, void *sis, int test),
> + TP_ARGS(type, offset, sis, test),
> +
> + TP_STRUCT__entry(
> + __field( int, type )
> + __field( unsigned long, offset )
> + __field( void *, sis )
> + __field( int, test )
> + ),
> +
> + TP_fast_assign(
> + __entry->type = type;
> + __entry->offset = offset;
> + __entry->sis = sis;
> + __entry->test = test;
> + ),
> +
> + TP_printk("type: %d offset: %lu sys: %p frontswap_test: %d",
> + __entry->type, __entry->offset, __entry->sis, __entry->test)
> +);
> +
> +TRACE_EVENT(frontswap_invalidate_area,
> + TP_PROTO(int type, void *sis, void *map),
> + TP_ARGS(type, sis, map),
> +
> + TP_STRUCT__entry(
> + __field( int, type )
> + __field( void *, map )
> + __field( void *, sis )
> + ),
> +
> + TP_fast_assign(
> + __entry->type = type;
> + __entry->sis = sis;
> + __entry->map = map;
> + ),
> +
> + TP_printk("type: %d sys: %p map: %p",
> + __entry->type, __entry->sis, __entry->map)
> +);
> +
> +TRACE_EVENT(frontswap_curr_pages,
> + TP_PROTO(unsigned long totalpages),
> + TP_ARGS(totalpages),
> +
> + TP_STRUCT__entry(
> + __field(unsigned long, totalpages )
> + ),
> +
> + TP_fast_assign(
> + __entry->totalpages = totalpages;
> + ),
> +
> + TP_printk("total pages: %lu",
> + __entry->totalpages)
> +);
> +
> +TRACE_EVENT(frontswap_shrink,
> + TP_PROTO(unsigned long target_pages),
> + TP_ARGS(target_pages),
> +
> + TP_STRUCT__entry(
> + __field(unsigned long, target_pages )
> + ),
> +
> + TP_fast_assign(
> + __entry->target_pages = target_pages;
> + ),
> +
> + TP_printk("target pages: %lu",
> + __entry->target_pages)
> +);
> +
> +#endif /* _TRACE_FRONTSWAP_H */
> +
> +#include <trace/define_trace.h>
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 7c26e89..7da55a3 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -11,6 +11,7 @@
> * This work is licensed under the terms of the GNU GPL, version 2.
> */
>
> +#define CREATE_TRACE_POINTS
> #include <linux/mm.h>
> #include <linux/mman.h>
> #include <linux/swap.h>
> @@ -23,6 +24,7 @@
> #include <linux/debugfs.h>
> #include <linux/frontswap.h>
> #include <linux/swapfile.h>
> +#include <trace/events/frontswap.h>
>
> /*
> * frontswap_ops is set by frontswap_register_ops to contain the pointers
> @@ -85,6 +87,7 @@ struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
> {
> struct frontswap_ops old = frontswap_ops;
>
> + trace_frontswap_register_ops(&old, ops);
> frontswap_ops = *ops;
> frontswap_enabled = true;
> return old;
> @@ -108,6 +111,9 @@ void __frontswap_init(unsigned type)
> struct swap_info_struct *sis = swap_info[type];
>
> BUG_ON(sis == NULL);
> +
> + trace_frontswap_init(type, sis, sis->frontswap_map);
> +
> if (sis->frontswap_map == NULL)
> return;
> frontswap_ops.init(type);
> @@ -134,6 +140,7 @@ int __frontswap_store(struct page *page)
> if (frontswap_test(sis, offset))
> dup = 1;
> ret = frontswap_ops.store(type, offset, page);
> + trace_frontswap_store(page, dup, ret);
> if (ret == 0) {
> frontswap_set(sis, offset);
> inc_frontswap_succ_stores();
> @@ -174,6 +181,7 @@ int __frontswap_load(struct page *page)
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset))
> ret = frontswap_ops.load(type, offset, page);
> + trace_frontswap_load(page, ret);
> if (ret == 0)
> inc_frontswap_loads();
> return ret;
> @@ -189,6 +197,7 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> struct swap_info_struct *sis = swap_info[type];
>
> BUG_ON(sis == NULL);
> + trace_frontswap_invalidate_page(type, offset, sis, frontswap_test(sis, offset));
> if (frontswap_test(sis, offset)) {
> frontswap_ops.invalidate_page(type, offset);
> atomic_dec(&sis->frontswap_pages);
> @@ -207,6 +216,7 @@ void __frontswap_invalidate_area(unsigned type)
> struct swap_info_struct *sis = swap_info[type];
>
> BUG_ON(sis == NULL);
> + trace_frontswap_invalidate_area(type, sis, sis->frontswap_map);
> if (sis->frontswap_map == NULL)
> return;
> frontswap_ops.invalidate_area(type);
> @@ -295,6 +305,8 @@ void frontswap_shrink(unsigned long target_pages)
> unsigned long pages_to_unuse = 0;
> int type, ret;
>
> + trace_frontswap_shrink(target_pages);
> +
> /*
> * we don't want to hold swap_lock while doing a very
> * lengthy try_to_unuse, but swap_list may change
> @@ -322,6 +334,8 @@ unsigned long frontswap_curr_pages(void)
> totalpages = __frontswap_curr_pages();
> spin_unlock(&swap_lock);
>
> + trace_frontswap_curr_pages(totalpages);
> +
> return totalpages;
> }
> EXPORT_SYMBOL(frontswap_curr_pages);



--
Kind regards,
Minchan Kim

2012-06-11 06:16:11

by Minchan Kim

Subject: Re: [PATCH v3 09/10] mm: frontswap: split out function to clear a page out

On 06/10/2012 07:51 PM, Sasha Levin wrote:

> Signed-off-by: Sasha Levin <[email protected]>
> ---
> mm/frontswap.c | 15 +++++++++------
> 1 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 7da55a3..c056f6e 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -120,6 +120,12 @@ void __frontswap_init(unsigned type)
> }
> EXPORT_SYMBOL(__frontswap_init);
>
> +static inline void __frontswap_clear(struct swap_info_struct *sis, pgoff_t offset)
> +{
> + frontswap_clear(sis, offset);
> + atomic_dec(&sis->frontswap_pages);
> +}


Nitpick:
Strange. Normally, the non-underscore function calls the underscore function,
but here it's the opposite. :(

--
Kind regards,
Minchan Kim

2012-06-11 08:33:12

by Pekka Enberg

Subject: Re: [PATCH v3 08/10] mm: frontswap: add tracing support

On 06/10/2012 07:51 PM, Sasha Levin wrote:
>> Add tracepoints to frontswap API.
>>
>> Signed-off-by: Sasha Levin <[email protected]>

On Mon, Jun 11, 2012 at 9:12 AM, Minchan Kim <[email protected]> wrote:
> Normally, adding a new tracepoint isn't easy without a special reason.
> I'm not sure all of this frontswap function tracing would be valuable.
> Sasha, why do you want to add tracing?
> What's the scenario where you want to use tracing?

Yup, the added tracepoints look more like function tracing. Shouldn't
you use something like kprobes or ftrace/perf for this?

2012-06-11 10:29:41

by Sasha Levin

Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

On Mon, 2012-06-11 at 14:43 +0900, Minchan Kim wrote:
> On 06/10/2012 07:51 PM, Sasha Levin wrote:
>
> > An attempt at making frontswap_shrink shorter and more readable. This patch
> > splits out walking through the swap list to find an entry with enough
> > pages to unuse.
> >
> > Also, assert that the internal __frontswap_unuse_pages is called under swap
> > lock, since that part of the code previously ran directly inside the lock.
> >
> > Signed-off-by: Sasha Levin <[email protected]>
> > ---
> > mm/frontswap.c | 59 +++++++++++++++++++++++++++++++++++++-------------------
> > 1 files changed, 39 insertions(+), 20 deletions(-)
> >
> > diff --git a/mm/frontswap.c b/mm/frontswap.c
> > index 5faf840..faa43b7 100644
> > --- a/mm/frontswap.c
> > +++ b/mm/frontswap.c
> > @@ -230,6 +230,41 @@ static unsigned long __frontswap_curr_pages(void)
> > return totalpages;
> > }
> >
> > +static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > + int *swapid)
>
>
> Normally, we use "unsigned int type" instead of swapid.
> I admit the naming is rather awkward but that should be another patch.
> So let's keep consistency with swap subsystem.

I was staying consistent with the naming in mm/frontswap.c. I'll add an
extra patch to modify it to be similar to what's being used in the rest
of the swap subsystem.

> > +{
> > + int ret = -EINVAL;
> > + struct swap_info_struct *si = NULL;
> > + int si_frontswap_pages;
> > + unsigned long total_pages_to_unuse = total;
> > + unsigned long pages = 0, pages_to_unuse = 0;
> > + int type;
> > +
> > + assert_spin_locked(&swap_lock);
>
>
> Normally, we should use this assertion when it's hard to tell whether swap_lock
> is held, e.g. because of a complicated call depth or an unexpected use-case like
> a general-purpose function. But I expect this function's callers to be very
> limited, not complicated. Isn't just writing down a comment enough?

Is there a reason not to do it though? Debugging a case where this
function is called without a swaplock and causes corruption won't be
easy.

> > + for (type = swap_list.head; type >= 0; type = si->next) {
> > + si = swap_info[type];
> > + si_frontswap_pages = atomic_read(&si->frontswap_pages);
> > + if (total_pages_to_unuse < si_frontswap_pages) {
> > + pages = pages_to_unuse = total_pages_to_unuse;
> > + } else {
> > + pages = si_frontswap_pages;
> > + pages_to_unuse = 0; /* unuse all */
> > + }
> > + /* ensure there is enough RAM to fetch pages from frontswap */
> > + if (security_vm_enough_memory_mm(current->mm, pages)) {
> > + ret = -ENOMEM;
>
>
> Nitpick:
> I am not sure detailed error returning would be good.
> The caller doesn't care about it now, but it might consider it in the future.
> Hmm,

Is there a reason to avoid returning a meaningful error when it's pretty
easy?

2012-06-11 10:38:04

by Sasha Levin

Subject: Re: [PATCH v3 08/10] mm: frontswap: add tracing support

On Mon, 2012-06-11 at 11:33 +0300, Pekka Enberg wrote:
> On 06/10/2012 07:51 PM, Sasha Levin wrote:
> >> Add tracepoints to frontswap API.
> >>
> >> Signed-off-by: Sasha Levin <[email protected]>
>
> On Mon, Jun 11, 2012 at 9:12 AM, Minchan Kim <[email protected]> wrote:
> > Normally, adding a new tracepoint isn't easy without a special reason.
> > I'm not sure all of this frontswap function tracing would be valuable.
> > Sasha, why do you want to add tracing?
> > What's the scenario where you want to use tracing?

I added tracing while working on code to integrate KVM with
frontswap/cleancache, where I needed to see that the flow of code between
host-side KVM and zcache, and guest-side cleancache, frontswap and KVM, was
correct.

> Yup, the added tracepoints look more like function tracing. Shouldn't
> you use something like kprobes or ftrace/perf for this?

I'm not sure, really; there are quite a few options provided by the
kernel...

I used tracepoints because I was working on code that integrates with
KVM, and saw that KVM was working with tracepoints in a very similar way
to what I needed, so I assumed tracepoints were the right choice for me.

2012-06-11 14:28:17

by Dan Magenheimer

Subject: RE: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

> From: Sasha Levin [mailto:[email protected]]
> Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages
>
> > > + assert_spin_locked(&swap_lock);
> >
> > Normally, we should use this assertion when it's hard to tell whether swap_lock
> > is held, e.g. because of a complicated call depth or an unexpected use-case like
> > a general-purpose function. But I expect this function's callers to be very
> > limited, not complicated. Isn't just writing down a comment enough?
>
> Is there a reason not to do it though? Debugging a case where this
> function is called without a swaplock and causes corruption won't be
> easy.

I'm not sure of the correct kernel style but I like the fact
that assert_spin_locked both documents the lock requirement and tests
it at runtime.

I don't know the correct kernel syntax but is it possible
to make this code functional when the kernel "debug"
option is on, but a no-op when "debug" is disabled?
IMHO, that would be the ideal solution.
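
One way to get roughly that behaviour (just a sketch on my part, not tested)
might be lockdep_assert_held(), which warns at runtime when CONFIG_LOCKDEP is
enabled and compiles down to a no-op otherwise, e.g.:

	/* Documents the locking requirement; only checked in lockdep builds. */
	lockdep_assert_held(&swap_lock);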

> > > + for (type = swap_list.head; type >= 0; type = si->next) {
> > > + si = swap_info[type];
> > > + si_frontswap_pages = atomic_read(&si->frontswap_pages);
> > > + if (total_pages_to_unuse < si_frontswap_pages) {
> > > + pages = pages_to_unuse = total_pages_to_unuse;
> > > + } else {
> > > + pages = si_frontswap_pages;
> > > + pages_to_unuse = 0; /* unuse all */
> > > + }
> > > + /* ensure there is enough RAM to fetch pages from frontswap */
> > > + if (security_vm_enough_memory_mm(current->mm, pages)) {
> > > + ret = -ENOMEM;
> >
> >
> > Nitpick:
> > I am not sure detailed error returning would be good.
> > The caller doesn't care about it now, but it might consider it in the future.
> > Hmm,
>
> Is there a reason to avoid returning a meaningful error when it's pretty
> easy?

I'm certainly not an expert on kernel style (as this whole series
of patches demonstrates :-) but I think setting a meaningful
error code is useful documentation and plans for future users
that might use the error code.

2012-06-11 14:31:31

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

On Mon, Jun 11, 2012 at 10:27 AM, Dan Magenheimer
<[email protected]> wrote:
>> From: Sasha Levin [mailto:[email protected]]
>> Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages
>>
>> > > + assert_spin_locked(&swap_lock);
>> >
>> > Normally, we should use this assertion when it's hard to tell whether swap_lock
>> > is held, e.g. because of a complicated call depth or an unexpected use-case like
>> > a general-purpose function. But I expect this function's callers to be very
>> > limited, not complicated. Isn't just writing down a comment enough?
>>
>> Is there a reason not to do it though? Debugging a case where this
>> function is called without a swaplock and causes corruption won't be
>> easy.
>
> I'm not sure of the correct kernel style but I like the fact
> that assert_spin_locked both documents the lock requirement and tests
> it at runtime.

The kernel style is to do "
3) Separate your changes.

Separate _logical changes_ into a single patch file.
"

So it is fine, but it should be in its own patch.
>
> I don't know the correct kernel syntax but is it possible
> to make this code be functional when the kernel "debug"
> option is on, but a no-op when "debug" is disabled?
> IMHO, that would be the ideal solution.
>
>> > > + for (type = swap_list.head; type >= 0; type = si->next) {
>> > > + si = swap_info[type];
>> > > + si_frontswap_pages = atomic_read(&si->frontswap_pages);
>> > > + if (total_pages_to_unuse < si_frontswap_pages) {
>> > > + pages = pages_to_unuse = total_pages_to_unuse;
>> > > + } else {
>> > > + pages = si_frontswap_pages;
>> > > + pages_to_unuse = 0; /* unuse all */
>> > > + }
>> > > + /* ensure there is enough RAM to fetch pages from frontswap */
>> > > + if (security_vm_enough_memory_mm(current->mm, pages)) {
>> > > + ret = -ENOMEM;
>> >
>> >
>> > Nitpick:
>> > I am not sure detailed error returning would be good.
>> > The caller doesn't care about it now, but it might consider it in the future.
>> > Hmm,
>>
>> Is there a reason to avoid returning a meaningful error when it's pretty
>> easy?
>
> I'm certainly not an expert on kernel style (as this whole series
> of patches demonstrates :-) but I think setting a meaningful
> error code is useful documentation and plans for future users
> that might use the error code.

Aye.

2012-06-11 14:37:35

by Sasha Levin

Subject: Re: [PATCH v3 04/10] mm: frontswap: split out __frontswap_unuse_pages

On Mon, 2012-06-11 at 10:31 -0400, Konrad Rzeszutek Wilk wrote:
> > I'm not sure of the correct kernel style but I like the fact
> > that assert_spin_locked both documents the lock requirement and tests
> > it at runtime.
>
> The kernel style is to do "
> 3) Separate your changes.
>
> Separate _logical changes_ into a single patch file.
> "
>
> So it is fine, but it should be in its own patch.

It is one logical change: I've moved a block of code that has to run
under the swap lock into its own function. Adding the spinlock assertion
isn't new code, nor does it relate to any new code. It's there to assert
that what happened before still happens now.