From: Gioh Kim <[email protected]>
Hello,
This series tries to enable migration of non-LRU pages, such as driver pages.
My ARM-based platform suffered a severe fragmentation problem after long-term
(several days) tests. Sometimes even order-3 page allocations failed. The platform
has 512MB ~ 1024MB of memory; 30% ~ 40% of memory is consumed for graphics
processing and 20% ~ 30% is reserved for zram.
I found that many pages of the GPU driver and zram are non-movable pages. So I
reported this to Minchan Kim, the maintainer of zram, and he implemented internal
compaction logic for zram. And I implemented internal compaction for the GPU driver.
They reduced some fragmentation, but they are not effective enough.
They are activated via their own /sys interfaces, so they cannot cooperate
with kernel compaction. If there is too much fragmentation and the kernel starts
compaction, zram and the GPU driver cannot take part in it.
This patch set consists of 5 patches.
1. patch 1/5: get inode from anon_inodes
This patch adds a new interface to create an inode from anon_inodes.
2. patch 2/5: framework to isolate/migrate/putback pages
Add isolatepage and putbackpage to address_space_operations,
and wrapper functions to call them.
3. patch 3/5: apply the framework to the balloon driver
The balloon driver is ported to the framework. It gets an inode
from anon_inodes and registers operations in the inode.
Any other driver can register operations via an inode like this
to migrate its pages (see the sketch after this list).
4. patch 4/5: compaction/migration call the generic interfaces
Compaction and page migration call the generic interfaces of the
framework instead of calling balloon migration directly.
5. patch 5/5: remove direct calling of migration of driver pages
Non-LRU pages are migrated together with LRU pages by move_to_new_page().
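For illustration, a driver-side registration could look roughly like this
once the framework is in place (a minimal sketch; "mydrv" and its callbacks
are hypothetical and not part of this series):

	static int mydrv_migratepage(struct address_space *mapping,
				     struct page *newpage, struct page *page,
				     enum migrate_mode mode)
	{
		/* copy driver data into newpage and update driver references */
		return MIGRATEPAGE_SUCCESS;
	}

	static bool mydrv_isolatepage(struct page *page, isolate_mode_t mode)
	{
		/* detach the page from driver-internal lists; keep it pinned */
		return true;
	}

	static void mydrv_putbackpage(struct page *page)
	{
		/* reinsert the page into driver-internal lists */
	}

	static const struct address_space_operations mydrv_aops = {
		.migratepage	= mydrv_migratepage,
		.isolatepage	= mydrv_isolatepage,
		.putbackpage	= mydrv_putbackpage,
	};

	/* at init time: take an anonymous inode and attach the callbacks */
	inode = anon_inode_new();
	inode->i_mapping->a_ops = &mydrv_aops;

	/* for each page the driver wants to make migratable */
	page->mapping = inode->i_mapping;
	__SetPageMobile(page);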
This patch set is tested as follows:
- boot Ubuntu 14.04 with 1G of memory on qemu
- do a kernel build
- after several seconds, check with the free command that more than 512MB is used
- run the command "balloon 512" in the qemu monitor
- check that hundreds of MB of pages are migrated
My thanks to Konstantin Khlebnikov for his reviews of the v2 patch set.
Most of the changes were based on his feedback.
Changes since v2:
- change the page type name from migratable page to mobile page
- get and lock the page to isolate it
- add wrapper interfaces for page->mapping->a_ops->isolate/putback
- leave balloon pages marked as balloon
This patch-set is based on v4.1
Gioh Kim (5):
fs/anon_inodes: new interface to create new inode
mm/compaction: enable mobile-page migration
mm/balloon: apply mobile page migratable into balloon
mm/compaction: call generic migration callbacks
mm: remove direct calling of migration
drivers/virtio/virtio_balloon.c | 3 ++
fs/anon_inodes.c | 6 +++
fs/proc/page.c | 3 ++
include/linux/anon_inodes.h | 1 +
include/linux/balloon_compaction.h | 15 +++++--
include/linux/compaction.h | 76 ++++++++++++++++++++++++++++++++++
include/linux/fs.h | 2 +
include/linux/page-flags.h | 19 +++++++++
include/uapi/linux/kernel-page-flags.h | 1 +
mm/balloon_compaction.c | 71 ++++++++++---------------------
mm/compaction.c | 8 ++--
mm/migrate.c | 24 +++--------
12 files changed, 154 insertions(+), 75 deletions(-)
--
2.1.4
From: Gioh Kim <[email protected]>
anon_inodes already has complete interfaces to create and manage
many anonymous inodes, but it doesn't have an interface to get a
new inode. With this new interface, other subsystems can create an
anonymous inode without creating and mounting their own pseudo filesystem.
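
A caller could use it like this (a minimal sketch; mydrv_aops is a
hypothetical set of driver callbacks, not part of this patch):

	struct inode *inode;

	inode = anon_inode_new();
	if (IS_ERR(inode))	/* alloc_anon_inode() returns ERR_PTR */
		return PTR_ERR(inode);
	inode->i_mapping->a_ops = &mydrv_aops;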
Signed-off-by: Gioh Kim <[email protected]>
---
fs/anon_inodes.c | 6 ++++++
include/linux/anon_inodes.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 80ef38c..1d51f96 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -162,6 +162,12 @@ err_put_unused_fd:
}
EXPORT_SYMBOL_GPL(anon_inode_getfd);
+struct inode *anon_inode_new(void)
+{
+ return alloc_anon_inode(anon_inode_mnt->mnt_sb);
+}
+EXPORT_SYMBOL_GPL(anon_inode_new);
+
static int __init anon_inode_init(void)
{
anon_inode_mnt = kern_mount(&anon_inode_fs_type);
diff --git a/include/linux/anon_inodes.h b/include/linux/anon_inodes.h
index 8013a45..ddbd67f 100644
--- a/include/linux/anon_inodes.h
+++ b/include/linux/anon_inodes.h
@@ -15,6 +15,7 @@ struct file *anon_inode_getfile(const char *name,
void *priv, int flags);
int anon_inode_getfd(const char *name, const struct file_operations *fops,
void *priv, int flags);
+struct inode *anon_inode_new(void);
#endif /* _LINUX_ANON_INODES_H */
--
2.1.4
From: Gioh Kim <[email protected]>
Add a framework to register callback functions and to check page
mobility. There are several modes for page isolation, so the isolate
interface takes the page and an isolation mode as arguments, while
the putback interface takes only the page.
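
For reference, compaction and migration are expected to use these wrappers
roughly in the following order (a simplified sketch; locking and list
handling are omitted here):

	if (isolate_mobilepage(page, isolate_mode)) {
		/* the page is now pinned and detached from the driver */
		rc = page->mapping->a_ops->migratepage(page->mapping,
						       newpage, page, mode);
		if (rc != MIGRATEPAGE_SUCCESS)
			putback_mobilepage(page);
	}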
Signed-off-by: Gioh Kim <[email protected]>
---
fs/proc/page.c | 3 ++
include/linux/compaction.h | 76 ++++++++++++++++++++++++++++++++++
include/linux/fs.h | 2 +
include/linux/page-flags.h | 19 +++++++++
include/uapi/linux/kernel-page-flags.h | 1 +
5 files changed, 101 insertions(+)
diff --git a/fs/proc/page.c b/fs/proc/page.c
index 7eee2d8..a4f5a00 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -146,6 +146,9 @@ u64 stable_page_flags(struct page *page)
if (PageBalloon(page))
u |= 1 << KPF_BALLOON;
+ if (PageMobile(page))
+ u |= 1 << KPF_MOBILE;
+
u |= kpf_copy_bit(k, KPF_LOCKED, PG_locked);
u |= kpf_copy_bit(k, KPF_SLAB, PG_slab);
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index aa8f61c..c375a89 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -1,6 +1,9 @@
#ifndef _LINUX_COMPACTION_H
#define _LINUX_COMPACTION_H
+#include <linux/page-flags.h>
+#include <linux/pagemap.h>
+
/* Return values for compact_zone() and try_to_compact_pages() */
/* compaction didn't start as it was deferred due to past failures */
#define COMPACT_DEFERRED 0
@@ -51,6 +54,66 @@ extern void compaction_defer_reset(struct zone *zone, int order,
bool alloc_success);
extern bool compaction_restarting(struct zone *zone, int order);
+static inline bool mobile_page(struct page *page)
+{
+ return page->mapping && page->mapping->a_ops &&
+ (PageMobile(page) || PageBalloon(page));
+}
+
+static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
+{
+ bool ret;
+
+ /*
+ * Avoid burning cycles with pages that are yet under __free_pages(),
+ * or just got freed under us.
+ *
+ * In case we 'win' a race for a mobile page being freed under us and
+ * raise its refcount preventing __free_pages() from doing its job
+ * the put_page() at the end of this block will take care of
+ * release this page, thus avoiding a nasty leakage.
+ */
+ if (likely(get_page_unless_zero(page))) {
+ /*
+ * As mobile pages are not isolated from LRU lists, concurrent
+ * compaction threads can race against page migration functions
+ * as well as race against the releasing a page.
+ *
+ * In order to avoid having an already isolated mobile page
+ * being (wrongly) re-isolated while it is under migration,
+ * or to avoid attempting to isolate pages being released,
+ * lets be sure we have the page lock
+ * before proceeding with the mobile page isolation steps.
+ */
+ if (likely(trylock_page(page))) {
+ if (mobile_page(page) &&
+ page->mapping->a_ops->isolatepage) {
+ ret = page->mapping->a_ops->isolatepage(page,
+ mode);
+ unlock_page(page);
+ return ret;
+ }
+ unlock_page(page);
+ }
+ put_page(page);
+ }
+ return false;
+}
+
+static inline void putback_mobilepage(struct page *page)
+{
+ /*
+ * 'lock_page()' stabilizes the page and prevents races against
+ * concurrent isolation threads attempting to re-isolate it.
+ */
+ lock_page(page);
+ if (mobile_page(page) && page->mapping->a_ops->putbackpage) {
+ page->mapping->a_ops->putbackpage(page);
+ /* drop the extra ref count taken for mobile page isolation */
+ put_page(page);
+ }
+ unlock_page(page);
+}
#else
static inline unsigned long try_to_compact_pages(gfp_t gfp_mask,
unsigned int order, int alloc_flags,
@@ -83,6 +146,19 @@ static inline bool compaction_deferred(struct zone *zone, int order)
return true;
}
+static inline bool mobile_page(struct page *page)
+{
+ return false;
+}
+
+static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
+{
+ return false;
+}
+
+static inline void putback_mobilepage(struct page *page)
+{
+}
#endif /* CONFIG_COMPACTION */
#if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 35ec87e..33c9aa5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -395,6 +395,8 @@ struct address_space_operations {
*/
int (*migratepage) (struct address_space *,
struct page *, struct page *, enum migrate_mode);
+ bool (*isolatepage) (struct page *, isolate_mode_t);
+ void (*putbackpage) (struct page *);
int (*launder_page) (struct page *);
int (*is_partially_uptodate) (struct page *, unsigned long,
unsigned long);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f34e040..abef145 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -582,6 +582,25 @@ static inline void __ClearPageBalloon(struct page *page)
atomic_set(&page->_mapcount, -1);
}
+#define PAGE_MOBILE_MAPCOUNT_VALUE (-255)
+
+static inline int PageMobile(struct page *page)
+{
+ return atomic_read(&page->_mapcount) == PAGE_MOBILE_MAPCOUNT_VALUE;
+}
+
+static inline void __SetPageMobile(struct page *page)
+{
+ VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
+ atomic_set(&page->_mapcount, PAGE_MOBILE_MAPCOUNT_VALUE);
+}
+
+static inline void __ClearPageMobile(struct page *page)
+{
+ VM_BUG_ON_PAGE(!PageMobile(page), page);
+ atomic_set(&page->_mapcount, -1);
+}
+
/*
* If network-based swap is enabled, sl*b must keep track of whether pages
* were allocated from pfmemalloc reserves.
diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index a6c4962..d50d9e8 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -33,6 +33,7 @@
#define KPF_THP 22
#define KPF_BALLOON 23
#define KPF_ZERO_PAGE 24
+#define KPF_MOBILE 25
#endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
--
2.1.4
From: Gioh Kim <[email protected]>
Apply mobile page migration to the balloon driver.
The balloon driver gets an anonymous inode that manages
the address_space_operations for page migration.
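
Note that the probe path below assigns the new inode without checking for
failure; a more defensive variant (just a sketch, not part of this patch,
and the error label is hypothetical) could be:

	vb->vb_dev_info.inode = anon_inode_new();
	if (IS_ERR(vb->vb_dev_info.inode)) {
		err = PTR_ERR(vb->vb_dev_info.inode);
		goto out_free_vb;
	}
	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;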
Signed-off-by: Gioh Kim <[email protected]>
---
drivers/virtio/virtio_balloon.c | 3 ++
include/linux/balloon_compaction.h | 15 +++++++--
mm/balloon_compaction.c | 65 +++++++++++++-------------------------
mm/compaction.c | 2 +-
mm/migrate.c | 2 +-
5 files changed, 39 insertions(+), 48 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 82e80e0..ef5b9b5 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -30,6 +30,7 @@
#include <linux/balloon_compaction.h>
#include <linux/oom.h>
#include <linux/wait.h>
+#include <linux/anon_inodes.h>
/*
* Balloon device works in 4K page units. So each page is pointed to by
@@ -505,6 +506,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
balloon_devinfo_init(&vb->vb_dev_info);
#ifdef CONFIG_BALLOON_COMPACTION
vb->vb_dev_info.migratepage = virtballoon_migratepage;
+ vb->vb_dev_info.inode = anon_inode_new();
+ vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
#endif
err = init_vqs(vb);
diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index 9b0a15d..a9e0bde 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -48,6 +48,7 @@
#include <linux/migrate.h>
#include <linux/gfp.h>
#include <linux/err.h>
+#include <linux/fs.h>
/*
* Balloon device information descriptor.
@@ -62,6 +63,7 @@ struct balloon_dev_info {
struct list_head pages; /* Pages enqueued & handled to Host */
int (*migratepage)(struct balloon_dev_info *, struct page *newpage,
struct page *page, enum migrate_mode mode);
+ struct inode *inode;
};
extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
@@ -73,12 +75,16 @@ static inline void balloon_devinfo_init(struct balloon_dev_info *balloon)
spin_lock_init(&balloon->pages_lock);
INIT_LIST_HEAD(&balloon->pages);
balloon->migratepage = NULL;
+ balloon->inode = NULL;
}
#ifdef CONFIG_BALLOON_COMPACTION
-extern bool balloon_page_isolate(struct page *page);
+extern const struct address_space_operations balloon_aops;
+extern bool balloon_page_isolate(struct page *page,
+ isolate_mode_t mode);
extern void balloon_page_putback(struct page *page);
-extern int balloon_page_migrate(struct page *newpage,
+extern int balloon_page_migrate(struct address_space *mapping,
+ struct page *newpage,
struct page *page, enum migrate_mode mode);
/*
@@ -124,6 +130,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
struct page *page)
{
__SetPageBalloon(page);
+ page->mapping = balloon->inode->i_mapping;
SetPagePrivate(page);
set_page_private(page, (unsigned long)balloon);
list_add(&page->lru, &balloon->pages);
@@ -140,6 +147,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
static inline void balloon_page_delete(struct page *page)
{
__ClearPageBalloon(page);
+ page->mapping = NULL;
set_page_private(page, 0);
if (PagePrivate(page)) {
ClearPagePrivate(page);
@@ -191,7 +199,8 @@ static inline bool isolated_balloon_page(struct page *page)
return false;
}
-static inline bool balloon_page_isolate(struct page *page)
+static inline bool balloon_page_isolate(struct page *page,
+ isolate_mode_t mode)
{
return false;
}
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index fcad832..0dd0b0d 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -131,43 +131,16 @@ static inline void __putback_balloon_page(struct page *page)
}
/* __isolate_lru_page() counterpart for a ballooned page */
-bool balloon_page_isolate(struct page *page)
+bool balloon_page_isolate(struct page *page, isolate_mode_t mode)
{
/*
- * Avoid burning cycles with pages that are yet under __free_pages(),
- * or just got freed under us.
- *
- * In case we 'win' a race for a balloon page being freed under us and
- * raise its refcount preventing __free_pages() from doing its job
- * the put_page() at the end of this block will take care of
- * release this page, thus avoiding a nasty leakage.
+ * A ballooned page, by default, has PagePrivate set.
+ * Prevent concurrent compaction threads from isolating
+ * an already isolated balloon page by clearing it.
*/
- if (likely(get_page_unless_zero(page))) {
- /*
- * As balloon pages are not isolated from LRU lists, concurrent
- * compaction threads can race against page migration functions
- * as well as race against the balloon driver releasing a page.
- *
- * In order to avoid having an already isolated balloon page
- * being (wrongly) re-isolated while it is under migration,
- * or to avoid attempting to isolate pages being released by
- * the balloon driver, lets be sure we have the page lock
- * before proceeding with the balloon page isolation steps.
- */
- if (likely(trylock_page(page))) {
- /*
- * A ballooned page, by default, has PagePrivate set.
- * Prevent concurrent compaction threads from isolating
- * an already isolated balloon page by clearing it.
- */
- if (balloon_page_movable(page)) {
- __isolate_balloon_page(page);
- unlock_page(page);
- return true;
- }
- unlock_page(page);
- }
- put_page(page);
+ if (balloon_page_movable(page)) {
+ __isolate_balloon_page(page);
+ return true;
}
return false;
}
@@ -175,30 +148,28 @@ bool balloon_page_isolate(struct page *page)
/* putback_lru_page() counterpart for a ballooned page */
void balloon_page_putback(struct page *page)
{
- /*
- * 'lock_page()' stabilizes the page and prevents races against
- * concurrent isolation threads attempting to re-isolate it.
- */
- lock_page(page);
+ if (!isolated_balloon_page(page))
+ return;
if (__is_movable_balloon_page(page)) {
__putback_balloon_page(page);
- /* drop the extra ref count taken for page isolation */
- put_page(page);
} else {
WARN_ON(1);
dump_page(page, "not movable balloon page");
}
- unlock_page(page);
}
/* move_to_new_page() counterpart for a ballooned page */
-int balloon_page_migrate(struct page *newpage,
+int balloon_page_migrate(struct address_space *mapping,
+ struct page *newpage,
struct page *page, enum migrate_mode mode)
{
struct balloon_dev_info *balloon = balloon_page_device(page);
int rc = -EAGAIN;
+ if (!isolated_balloon_page(page))
+ return rc;
+
/*
* Block others from accessing the 'newpage' when we get around to
* establishing additional references. We should be the only one
@@ -218,4 +189,12 @@ int balloon_page_migrate(struct page *newpage,
unlock_page(newpage);
return rc;
}
+
+/* define the balloon_mapping->a_ops callback to allow balloon page migration */
+const struct address_space_operations balloon_aops = {
+ .migratepage = balloon_page_migrate,
+ .isolatepage = balloon_page_isolate,
+ .putbackpage = balloon_page_putback,
+};
+EXPORT_SYMBOL_GPL(balloon_aops);
#endif /* CONFIG_BALLOON_COMPACTION */
diff --git a/mm/compaction.c b/mm/compaction.c
index 018f08d..81bafaf 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -719,7 +719,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
*/
if (!PageLRU(page)) {
if (unlikely(balloon_page_movable(page))) {
- if (balloon_page_isolate(page)) {
+ if (balloon_page_isolate(page, isolate_mode)) {
/* Successfully isolated */
goto isolate_success;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index f53838f..c94038e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -852,7 +852,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
* in order to avoid burning cycles at rmap level, and perform
* the page migration right away (proteced by page lock).
*/
- rc = balloon_page_migrate(newpage, page, mode);
+ rc = balloon_page_migrate(page->mapping, newpage, page, mode);
goto out_unlock;
}
--
2.1.4
From: Gioh Kim <[email protected]>
Compaction now calls the generic mobile page migration interfaces
instead of calling balloon migration directly.
Signed-off-by: Gioh Kim <[email protected]>
---
mm/compaction.c | 8 ++++----
mm/migrate.c | 19 ++++++++++---------
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 81bafaf..60e4cbb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,7 +14,7 @@
#include <linux/backing-dev.h>
#include <linux/sysctl.h>
#include <linux/sysfs.h>
-#include <linux/balloon_compaction.h>
+#include <linux/compaction.h>
#include <linux/page-isolation.h>
#include <linux/kasan.h>
#include "internal.h"
@@ -714,12 +714,12 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/*
* Check may be lockless but that's ok as we recheck later.
- * It's possible to migrate LRU pages and balloon pages
+ * It's possible to migrate LRU pages and mobile pages
* Skip any other type of page
*/
if (!PageLRU(page)) {
- if (unlikely(balloon_page_movable(page))) {
- if (balloon_page_isolate(page, isolate_mode)) {
+ if (unlikely(mobile_page(page))) {
+ if (isolate_mobilepage(page, isolate_mode)) {
/* Successfully isolated */
goto isolate_success;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index c94038e..e22be67 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -35,7 +35,7 @@
#include <linux/hugetlb.h>
#include <linux/hugetlb_cgroup.h>
#include <linux/gfp.h>
-#include <linux/balloon_compaction.h>
+#include <linux/compaction.h>
#include <linux/mmu_notifier.h>
#include <asm/tlbflush.h>
@@ -76,7 +76,7 @@ int migrate_prep_local(void)
* from where they were once taken off for compaction/migration.
*
* This function shall be used whenever the isolated pageset has been
- * built from lru, balloon, hugetlbfs page. See isolate_migratepages_range()
+ * built from lru, mobile, hugetlbfs page. See isolate_migratepages_range()
* and isolate_huge_page().
*/
void putback_movable_pages(struct list_head *l)
@@ -92,8 +92,8 @@ void putback_movable_pages(struct list_head *l)
list_del(&page->lru);
dec_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
- if (unlikely(isolated_balloon_page(page)))
- balloon_page_putback(page);
+ if (unlikely(mobile_page(page)))
+ putback_mobilepage(page);
else
putback_lru_page(page);
}
@@ -844,15 +844,16 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
}
}
- if (unlikely(isolated_balloon_page(page))) {
+ if (unlikely(mobile_page(page))) {
/*
- * A ballooned page does not need any special attention from
+ * A mobile page does not need any special attention from
* physical to virtual reverse mapping procedures.
* Skip any attempt to unmap PTEs or to remap swap cache,
* in order to avoid burning cycles at rmap level, and perform
* the page migration right away (proteced by page lock).
*/
- rc = balloon_page_migrate(page->mapping, newpage, page, mode);
+ rc = page->mapping->a_ops->migratepage(page->mapping,
+ newpage, page, mode);
goto out_unlock;
}
@@ -960,8 +961,8 @@ out:
if (rc != MIGRATEPAGE_SUCCESS && put_new_page) {
ClearPageSwapBacked(newpage);
put_new_page(newpage, private);
- } else if (unlikely(__is_movable_balloon_page(newpage))) {
- /* drop our reference, page already in the balloon */
+ } else if (unlikely(mobile_page(newpage))) {
+ /* drop our reference */
put_page(newpage);
} else
putback_lru_page(newpage);
--
2.1.4
From: Gioh Kim <[email protected]>
Migration is now completely generalized, so mobile pages are
processed together with LRU pages in move_to_new_page().
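
Conceptually, the dispatch in move_to_new_page() now covers mobile pages as
well (a rough sketch of the existing logic, not a verbatim excerpt):

	if (!mapping)
		rc = migrate_page(mapping, newpage, page, mode);
	else if (mapping->a_ops->migratepage)
		/* balloon and other mobile pages take this branch now */
		rc = mapping->a_ops->migratepage(mapping, newpage,
						 page, mode);
	else
		rc = fallback_migrate_page(mapping, newpage, page, mode);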
Signed-off-by: Gioh Kim <[email protected]>
---
mm/balloon_compaction.c | 8 --------
mm/migrate.c | 13 -------------
2 files changed, 21 deletions(-)
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 0dd0b0d..9d07ed9 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -170,13 +170,6 @@ int balloon_page_migrate(struct address_space *mapping,
if (!isolated_balloon_page(page))
return rc;
- /*
- * Block others from accessing the 'newpage' when we get around to
- * establishing additional references. We should be the only one
- * holding a reference to the 'newpage' at this point.
- */
- BUG_ON(!trylock_page(newpage));
-
if (WARN_ON(!__is_movable_balloon_page(page))) {
dump_page(page, "not movable balloon page");
unlock_page(newpage);
@@ -186,7 +179,6 @@ int balloon_page_migrate(struct address_space *mapping,
if (balloon && balloon->migratepage)
rc = balloon->migratepage(balloon, newpage, page, mode);
- unlock_page(newpage);
return rc;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index e22be67..b82539b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -844,19 +844,6 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
}
}
- if (unlikely(mobile_page(page))) {
- /*
- * A mobile page does not need any special attention from
- * physical to virtual reverse mapping procedures.
- * Skip any attempt to unmap PTEs or to remap swap cache,
- * in order to avoid burning cycles at rmap level, and perform
- * the page migration right away (proteced by page lock).
- */
- rc = page->mapping->a_ops->migratepage(page->mapping,
- newpage, page, mode);
- goto out_unlock;
- }
-
/*
* Corner case handling:
* 1. When a new swap-cache page is read into, it is added to the LRU
--
2.1.4
On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim <[email protected]> wrote:
> From: Gioh Kim <[email protected]>
>
> Hello,
>
> This series tries to enable migration of non-LRU pages, such as driver pages.
>
> My ARM-based platform suffered a severe fragmentation problem after long-term
> (several days) tests. Sometimes even order-3 page allocations failed. The platform
> has 512MB ~ 1024MB of memory; 30% ~ 40% of memory is consumed for graphics
> processing and 20% ~ 30% is reserved for zram.
>
> I found that many pages of the GPU driver and zram are non-movable pages. So I
> reported this to Minchan Kim, the maintainer of zram, and he implemented internal
> compaction logic for zram. And I implemented internal compaction for the GPU driver.
>
> They reduced some fragmentation, but they are not effective enough.
> They are activated via their own /sys interfaces, so they cannot cooperate
> with kernel compaction. If there is too much fragmentation and the kernel starts
> compaction, zram and the GPU driver cannot take part in it.
>
> ...
>
> This patch set is tested as follows:
> - boot Ubuntu 14.04 with 1G of memory on qemu
> - do a kernel build
> - after several seconds, check with the free command that more than 512MB is used
> - run the command "balloon 512" in the qemu monitor
> - check that hundreds of MB of pages are migrated
OK, but what happens if the balloon driver is not used to force
compaction? Does your test machine successfully compact pages on
demand, so those order-3 allocations now succeed?
Why are your changes to the GPU driver not included in this patch series?
On 2015-07-08 7:37 AM, Andrew Morton wrote:
> On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim <[email protected]> wrote:
>
>> From: Gioh Kim <[email protected]>
>>
>> Hello,
>>
>> ...
>>
>> This patch set is tested as follows:
>> - boot Ubuntu 14.04 with 1G of memory on qemu
>> - do a kernel build
>> - after several seconds, check with the free command that more than 512MB is used
>> - run the command "balloon 512" in the qemu monitor
>> - check that hundreds of MB of pages are migrated
>
> OK, but what happens if the balloon driver is not used to force
> compaction? Does your test machine successfully compact pages on
> demand, so those order-3 allocations now succeed?
If any driver that holds many pages, like the balloon driver, is forced to compact,
the system can get free high-order pages.
I have to show how this patch works with a driver that already exists in the kernel
source, for kernel developers' understanding. So I selected the balloon driver
because it already has compaction and works with kernel compaction.
I can show how driver pages are compacted together with LRU pages.
Actually, the balloon driver is not the best example to show how this patch compacts pages.
Balloon driver compaction decreases page consumption, for instance from 1024MB to 512MB.
I think it is not precisely compaction; it frees pages.
Of course there will be many high-order pages after 512MB is freed.
>
> Why are your changes to the GPU driver not included in this patch series?
My platform is ARM-based and the GPU is ARM Mali. The driver is not open source.
It's too bad that I cannot show the effect of this patch with the GPU driver.
>
>
>
On Wed, 08 Jul 2015 09:02:59 +0900 Gioh Kim <[email protected]> wrote:
>
>
> On 2015-07-08 7:37 AM, Andrew Morton wrote:
> > On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim <[email protected]> wrote:
> >
> >> ...
> >
> > OK, but what happens if the balloon driver is not used to force
> > compaction? Does your test machine successfully compact pages on
> > demand, so those order-3 allocations now succeed?
>
> If any driver that holds many pages, like the balloon driver, is forced to compact,
> the system can get free high-order pages.
>
> I have to show how this patch works with a driver that already exists in the kernel
> source, for kernel developers' understanding. So I selected the balloon driver
> because it already has compaction and works with kernel compaction.
> I can show how driver pages are compacted together with LRU pages.
>
> Actually, the balloon driver is not the best example to show how this patch compacts pages.
> Balloon driver compaction decreases page consumption, for instance from 1024MB to 512MB.
> I think it is not precisely compaction; it frees pages.
> Of course there will be many high-order pages after 512MB is freed.
Can the various in-kernel GPU drivers benefit from this? If so, wiring
up one or more of those would be helpful?
On 2015-07-08 9:07 AM, Andrew Morton wrote:
> On Wed, 08 Jul 2015 09:02:59 +0900 Gioh Kim <[email protected]> wrote:
>
>>
>>
>> On 2015-07-08 7:37 AM, Andrew Morton wrote:
>>> On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim <[email protected]> wrote:
>>>
>>>> ...
>>>
>>> OK, but what happens if the balloon driver is not used to force
>>> compaction? Does your test machine successfully compact pages on
>>> demand, so those order-3 allocations now succeed?
>>
>> ...
>
> Can the various in-kernel GPU drivers benefit from this? If so, wiring
> up one or more of those would be helpful?
I'm sure that other in-kernel GPU drivers can benefit.
It would certainly be helpful.
If I were familiar with the other in-kernel GPU driver code, I would have tried to patch them.
It's too bad that I'm not.
Minchan Kim said he has a plan to apply this patch to zram compaction.
Many embedded machines use several hundred MB for zram.
zram can benefit from this patch as much as the GPU drivers can.
On Wed, Jul 08, 2015 at 09:19:50AM +0900, Gioh Kim wrote:
>
>
> On 2015-07-08 9:07 AM, Andrew Morton wrote:
> >On Wed, 08 Jul 2015 09:02:59 +0900 Gioh Kim <[email protected]> wrote:
> >
> >>
> >>
> >>On 2015-07-08 7:37 AM, Andrew Morton wrote:
> >>>On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim <[email protected]> wrote:
> >>>
> >>>>From: Gioh Kim <[email protected]>
> >>>>
> >>>>Hello,
> >>>>
> >>>> ...
> >>>
> >>>OK, but what happens if the balloon driver is not used to force
> >>>compaction? Does your test machine successfully compact pages on
> >>>demand, so those order-3 allocations now succeed?
> >>
> >> ...
> >
> >Can the various in-kernel GPU drivers benefit from this? If so, wiring
> >up one or more of those would be helpful?
>
> I'm sure that other in-kernel GPU drivers can benefit.
> It would certainly be helpful.
>
> If I were familiar with the other in-kernel GPU driver code, I would have tried to patch them.
> It's too bad that I'm not.
>
> Minchan Kim said he has a plan to apply this patch to zram compaction.
> Many embedded machines use several hundred MB for zram.
> zram can benefit from this patch as much as the GPU drivers can.
>
Hello Gioh,
It would be helpful for fork latency and zram+CMA on small-memory systems.
I will implement zsmalloc migratepage support after I finish my current ongoing work.
Thanks for the nice work!
--
Kind regards,
Minchan Kim
>>
>>
>> Can the various in-kernel GPU drivers benefit from this? If so, wiring
>> up one or more of those would be helpful?
>
>
> I'm sure that other in-kernel GPU drivers can benefit.
> It would certainly be helpful.
>
> If I were familiar with the other in-kernel GPU driver code, I would have
> tried to patch them.
> It's too bad that I'm not.
I'll bring dri-devel into the loop here.
ARM GPU developers, please take a look at this stuff: Laurent, Rob,
and Eric, I suppose.
Daniel Vetter, you might have some opinions as well.
Dave.
On Tue, Jul 07, 2015 at 01:36:20PM +0900, Gioh Kim wrote:
> From: Gioh Kim <[email protected]>
>
> Hello,
>
> This series tries to enable migration of non-LRU pages, such as driver pages.
>
> ...
>
Acked-by: Rafael Aquini <[email protected]>
On Tue, Jul 07, 2015 at 01:36:23PM +0900, Gioh Kim wrote:
> From: Gioh Kim <[email protected]>
>
> Apply mobile page migration to the balloon driver.
> The balloon driver gets an anonymous inode that manages
> the address_space_operations for page migration.
>
> Signed-off-by: Gioh Kim <[email protected]>
> ---
> drivers/virtio/virtio_balloon.c | 3 ++
> include/linux/balloon_compaction.h | 15 +++++++--
> mm/balloon_compaction.c | 65 +++++++++++++-------------------------
> mm/compaction.c | 2 +-
> mm/migrate.c | 2 +-
> 5 files changed, 39 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 82e80e0..ef5b9b5 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -30,6 +30,7 @@
> #include <linux/balloon_compaction.h>
> #include <linux/oom.h>
> #include <linux/wait.h>
> +#include <linux/anon_inodes.h>
>
> /*
> * Balloon device works in 4K page units. So each page is pointed to by
> @@ -505,6 +506,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
> balloon_devinfo_init(&vb->vb_dev_info);
> #ifdef CONFIG_BALLOON_COMPACTION
> vb->vb_dev_info.migratepage = virtballoon_migratepage;
> + vb->vb_dev_info.inode = anon_inode_new();
> + vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
> #endif
>
> err = init_vqs(vb);
> diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
> index 9b0a15d..a9e0bde 100644
> --- a/include/linux/balloon_compaction.h
> +++ b/include/linux/balloon_compaction.h
> @@ -48,6 +48,7 @@
> #include <linux/migrate.h>
> #include <linux/gfp.h>
> #include <linux/err.h>
> +#include <linux/fs.h>
>
> /*
> * Balloon device information descriptor.
> @@ -62,6 +63,7 @@ struct balloon_dev_info {
> struct list_head pages; /* Pages enqueued & handled to Host */
> int (*migratepage)(struct balloon_dev_info *, struct page *newpage,
> struct page *page, enum migrate_mode mode);
> + struct inode *inode;
> };
>
> extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
> @@ -73,12 +75,16 @@ static inline void balloon_devinfo_init(struct balloon_dev_info *balloon)
> spin_lock_init(&balloon->pages_lock);
> INIT_LIST_HEAD(&balloon->pages);
> balloon->migratepage = NULL;
> + balloon->inode = NULL;
> }
>
> #ifdef CONFIG_BALLOON_COMPACTION
> -extern bool balloon_page_isolate(struct page *page);
> +extern const struct address_space_operations balloon_aops;
> +extern bool balloon_page_isolate(struct page *page,
> + isolate_mode_t mode);
> extern void balloon_page_putback(struct page *page);
> -extern int balloon_page_migrate(struct page *newpage,
> +extern int balloon_page_migrate(struct address_space *mapping,
> + struct page *newpage,
> struct page *page, enum migrate_mode mode);
>
> /*
> @@ -124,6 +130,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
> struct page *page)
> {
> __SetPageBalloon(page);
> + page->mapping = balloon->inode->i_mapping;
> SetPagePrivate(page);
> set_page_private(page, (unsigned long)balloon);
> list_add(&page->lru, &balloon->pages);
> @@ -140,6 +147,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
> static inline void balloon_page_delete(struct page *page)
> {
> __ClearPageBalloon(page);
> + page->mapping = NULL;
> set_page_private(page, 0);
> if (PagePrivate(page)) {
> ClearPagePrivate(page);
Order of cleanup here is not the reverse of the order of initialization.
Better make it exactly the reverse.
Also, I have a question: is it enough to lock the page to make changing
the mapping safe? Do all users lock the page too?
> ...
On 2015-07-09 10:08 PM, Daniel Vetter wrote:
> Also there's a bit of a lack of GPU drivers from the ARM world in upstream,
> which is probably why this patch series doesn't come with a user. Might be
> better to first upstream the driver before talking about additional
> infrastructure that it needs.
> -Daniel
I'm not from ARM, but I got the idea of driver page migration
while I was working with an ARM GPU driver.
I'm sure this patch is good for zram and balloon,
and I hope it can be applied to drivers that consume many pages and generate
fragmentation, such as GPU or gfx drivers.
>> @@ -124,6 +130,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
>> struct page *page)
>> {
>> __SetPageBalloon(page);
>> + page->mapping = balloon->inode->i_mapping;
>> SetPagePrivate(page);
>> set_page_private(page, (unsigned long)balloon);
>> list_add(&page->lru, &balloon->pages);
>> @@ -140,6 +147,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
>> static inline void balloon_page_delete(struct page *page)
>> {
>> __ClearPageBalloon(page);
>> + page->mapping = NULL;
>> set_page_private(page, 0);
>> if (PagePrivate(page)) {
>> ClearPagePrivate(page);
>
> Order of cleanup here is not the reverse of the order of initialization.
> Better make it exactly the reverse.
>
>
> Also, I have a question: is it enough to lock the page to make changing
> the mapping safe? Do all users lock the page too?
>
>
>
>
I think the balloon developers can answer that precisely.
I've just followed this comment:
http://lxr.free-electrons.com/source/include/linux/balloon_compaction.h#L16
On Tue, Jul 7, 2015 at 7:36 AM, Gioh Kim <[email protected]> wrote:
> From: Gioh Kim <[email protected]>
>
> Add a framework to register callback functions and to check page
> mobility. There are several modes for page isolation, so the isolate
> interface takes the page and an isolation mode as arguments, while
> the putback interface takes only the page.
>
> Signed-off-by: Gioh Kim <[email protected]>
> ---
> fs/proc/page.c | 3 ++
> include/linux/compaction.h | 76 ++++++++++++++++++++++++++++++++++
> include/linux/fs.h | 2 +
> include/linux/page-flags.h | 19 +++++++++
> include/uapi/linux/kernel-page-flags.h | 1 +
> 5 files changed, 101 insertions(+)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 7eee2d8..a4f5a00 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -146,6 +146,9 @@ u64 stable_page_flags(struct page *page)
> if (PageBalloon(page))
> u |= 1 << KPF_BALLOON;
>
> + if (PageMobile(page))
> + u |= 1 << KPF_MOBILE;
> +
> u |= kpf_copy_bit(k, KPF_LOCKED, PG_locked);
>
> u |= kpf_copy_bit(k, KPF_SLAB, PG_slab);
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index aa8f61c..c375a89 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -1,6 +1,9 @@
> #ifndef _LINUX_COMPACTION_H
> #define _LINUX_COMPACTION_H
>
> +#include <linux/page-flags.h>
> +#include <linux/pagemap.h>
> +
> /* Return values for compact_zone() and try_to_compact_pages() */
> /* compaction didn't start as it was deferred due to past failures */
> #define COMPACT_DEFERRED 0
> @@ -51,6 +54,66 @@ extern void compaction_defer_reset(struct zone *zone, int order,
> bool alloc_success);
> extern bool compaction_restarting(struct zone *zone, int order);
>
> +static inline bool mobile_page(struct page *page)
> +{
> + return page->mapping && page->mapping->a_ops &&
Dereferencing mapping->a_ops isn't safe without the page lock, and it isn't
required: all mappings always have ->a_ops.
> + (PageMobile(page) || PageBalloon(page));
> +}
> +
> +static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
> +{
> + bool ret;
> +
> + /*
> + * Avoid burning cycles with pages that are yet under __free_pages(),
> + * or just got freed under us.
> + *
> + * In case we 'win' a race for a mobile page being freed under us and
> + * raise its refcount preventing __free_pages() from doing its job
> + * the put_page() at the end of this block will take care of
> + * release this page, thus avoiding a nasty leakage.
> + */
> + if (likely(get_page_unless_zero(page))) {
> + /*
> + * As mobile pages are not isolated from LRU lists, concurrent
> + * compaction threads can race against page migration functions
> + * as well as race against the releasing a page.
> + *
> + * In order to avoid having an already isolated mobile page
> + * being (wrongly) re-isolated while it is under migration,
> + * or to avoid attempting to isolate pages being released,
> + * lets be sure we have the page lock
> + * before proceeding with the mobile page isolation steps.
> + */
> + if (likely(trylock_page(page))) {
> + if (mobile_page(page) &&
> + page->mapping->a_ops->isolatepage) {
> + ret = page->mapping->a_ops->isolatepage(page,
> + mode);
> + unlock_page(page);
> + return ret;
> + }
> + unlock_page(page);
> + }
> + put_page(page);
> + }
> + return false;
> +}
> +
> +static inline void putback_mobilepage(struct page *page)
> +{
> + /*
> + * 'lock_page()' stabilizes the page and prevents races against
> + * concurrent isolation threads attempting to re-isolate it.
> + */
> + lock_page(page);
> + if (mobile_page(page) && page->mapping->a_ops->putbackpage) {
It seems "if (page->mapping && page->mapping->a_ops->putbackpage)"
should be enough: we have already seen that page as mobile.
> + page->mapping->a_ops->putbackpage(page);
> + /* drop the extra ref count taken for mobile page isolation */
> + put_page(page);
> + }
> + unlock_page(page);
Call put_page() after the unlock, and do that unconditionally: putback must
drop the page reference taken by the caller.

	lock_page(page);
	if (page->mapping && page->mapping->a_ops->putbackpage)
		page->mapping->a_ops->putbackpage(page);
	unlock_page(page);
	put_page(page);
> ...
On Tue, Jul 7, 2015 at 7:36 AM, Gioh Kim <[email protected]> wrote:
> From: Gioh Kim <[email protected]>
>
> Apply mobile page migration to the balloon driver.
> The balloon driver has an anonymous inode that manages
> the address_space_operations for page migration.
>
> Signed-off-by: Gioh Kim <[email protected]>
> ---
> drivers/virtio/virtio_balloon.c | 3 ++
> include/linux/balloon_compaction.h | 15 +++++++--
> mm/balloon_compaction.c | 65 +++++++++++++-------------------------
> mm/compaction.c | 2 +-
> mm/migrate.c | 2 +-
> 5 files changed, 39 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 82e80e0..ef5b9b5 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -30,6 +30,7 @@
> #include <linux/balloon_compaction.h>
> #include <linux/oom.h>
> #include <linux/wait.h>
> +#include <linux/anon_inodes.h>
>
> /*
> * Balloon device works in 4K page units. So each page is pointed to by
> @@ -505,6 +506,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
> balloon_devinfo_init(&vb->vb_dev_info);
> #ifdef CONFIG_BALLOON_COMPACTION
> vb->vb_dev_info.migratepage = virtballoon_migratepage;
> + vb->vb_dev_info.inode = anon_inode_new();
> + vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
> #endif
>
> err = init_vqs(vb);
> diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
> index 9b0a15d..a9e0bde 100644
> --- a/include/linux/balloon_compaction.h
> +++ b/include/linux/balloon_compaction.h
> @@ -48,6 +48,7 @@
> #include <linux/migrate.h>
> #include <linux/gfp.h>
> #include <linux/err.h>
> +#include <linux/fs.h>
>
> /*
> * Balloon device information descriptor.
> @@ -62,6 +63,7 @@ struct balloon_dev_info {
> struct list_head pages; /* Pages enqueued & handled to Host */
> int (*migratepage)(struct balloon_dev_info *, struct page *newpage,
> struct page *page, enum migrate_mode mode);
> + struct inode *inode;
> };
>
> extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
> @@ -73,12 +75,16 @@ static inline void balloon_devinfo_init(struct balloon_dev_info *balloon)
> spin_lock_init(&balloon->pages_lock);
> INIT_LIST_HEAD(&balloon->pages);
> balloon->migratepage = NULL;
> + balloon->inode = NULL;
> }
>
> #ifdef CONFIG_BALLOON_COMPACTION
> -extern bool balloon_page_isolate(struct page *page);
> +extern const struct address_space_operations balloon_aops;
> +extern bool balloon_page_isolate(struct page *page,
> + isolate_mode_t mode);
> extern void balloon_page_putback(struct page *page);
> -extern int balloon_page_migrate(struct page *newpage,
> +extern int balloon_page_migrate(struct address_space *mapping,
> + struct page *newpage,
> struct page *page, enum migrate_mode mode);
>
> /*
> @@ -124,6 +130,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
> struct page *page)
> {
> __SetPageBalloon(page);
> + page->mapping = balloon->inode->i_mapping;
> SetPagePrivate(page);
> set_page_private(page, (unsigned long)balloon);
> list_add(&page->lru, &balloon->pages);
> @@ -140,6 +147,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
> static inline void balloon_page_delete(struct page *page)
> {
> __ClearPageBalloon(page);
> + page->mapping = NULL;
> set_page_private(page, 0);
> if (PagePrivate(page)) {
> ClearPagePrivate(page);
> @@ -191,7 +199,8 @@ static inline bool isolated_balloon_page(struct page *page)
> return false;
> }
>
> -static inline bool balloon_page_isolate(struct page *page)
> +static inline bool balloon_page_isolate(struct page *page,
> + isolate_mode_t mode)
> {
> return false;
> }
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index fcad832..0dd0b0d 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -131,43 +131,16 @@ static inline void __putback_balloon_page(struct page *page)
> }
>
> /* __isolate_lru_page() counterpart for a ballooned page */
> -bool balloon_page_isolate(struct page *page)
> +bool balloon_page_isolate(struct page *page, isolate_mode_t mode)
> {
> /*
> - * Avoid burning cycles with pages that are yet under __free_pages(),
> - * or just got freed under us.
> - *
> - * In case we 'win' a race for a balloon page being freed under us and
> - * raise its refcount preventing __free_pages() from doing its job
> - * the put_page() at the end of this block will take care of
> - * release this page, thus avoiding a nasty leakage.
> + * A ballooned page, by default, has PagePrivate set.
> + * Prevent concurrent compaction threads from isolating
> + * an already isolated balloon page by clearing it.
> */
> - if (likely(get_page_unless_zero(page))) {
> - /*
> - * As balloon pages are not isolated from LRU lists, concurrent
> - * compaction threads can race against page migration functions
> - * as well as race against the balloon driver releasing a page.
> - *
> - * In order to avoid having an already isolated balloon page
> - * being (wrongly) re-isolated while it is under migration,
> - * or to avoid attempting to isolate pages being released by
> - * the balloon driver, lets be sure we have the page lock
> - * before proceeding with the balloon page isolation steps.
> - */
> - if (likely(trylock_page(page))) {
> - /*
> - * A ballooned page, by default, has PagePrivate set.
> - * Prevent concurrent compaction threads from isolating
> - * an already isolated balloon page by clearing it.
> - */
> - if (balloon_page_movable(page)) {
> - __isolate_balloon_page(page);
> - unlock_page(page);
> - return true;
> - }
> - unlock_page(page);
> - }
> - put_page(page);
> + if (balloon_page_movable(page)) {
> + __isolate_balloon_page(page);
> + return true;
> }
> return false;
> }
> @@ -175,30 +148,28 @@ bool balloon_page_isolate(struct page *page)
> /* putback_lru_page() counterpart for a ballooned page */
> void balloon_page_putback(struct page *page)
> {
> - /*
> - * 'lock_page()' stabilizes the page and prevents races against
> - * concurrent isolation threads attempting to re-isolate it.
> - */
> - lock_page(page);
> + if (!isolated_balloon_page(page))
> + return;
>
> if (__is_movable_balloon_page(page)) {
> __putback_balloon_page(page);
> - /* drop the extra ref count taken for page isolation */
> - put_page(page);
> } else {
> WARN_ON(1);
> dump_page(page, "not movable balloon page");
> }
> - unlock_page(page);
> }
>
> /* move_to_new_page() counterpart for a ballooned page */
> -int balloon_page_migrate(struct page *newpage,
> +int balloon_page_migrate(struct address_space *mapping,
> + struct page *newpage,
> struct page *page, enum migrate_mode mode)
> {
> struct balloon_dev_info *balloon = balloon_page_device(page);
> int rc = -EAGAIN;
>
> + if (!isolated_balloon_page(page))
> + return rc;
> +
> /*
> * Block others from accessing the 'newpage' when we get around to
> * establishing additional references. We should be the only one
> @@ -218,4 +189,12 @@ int balloon_page_migrate(struct page *newpage,
> unlock_page(newpage);
Both pages passed as arguments to ->migratepage() are locked.
So please remove the lock/unlock from this function here and add the
lock/unlock of newpage in __unmap_and_move() below (see the inline change),
right in this patch.
> return rc;
> }
> +
> +/* define the balloon_mapping->a_ops callback to allow balloon page migration */
> +const struct address_space_operations balloon_aops = {
> + .migratepage = balloon_page_migrate,
> + .isolatepage = balloon_page_isolate,
> + .putbackpage = balloon_page_putback,
> +};
> +EXPORT_SYMBOL_GPL(balloon_aops);
> #endif /* CONFIG_BALLOON_COMPACTION */
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 018f08d..81bafaf 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -719,7 +719,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> */
> if (!PageLRU(page)) {
> if (unlikely(balloon_page_movable(page))) {
> - if (balloon_page_isolate(page)) {
> + if (balloon_page_isolate(page, isolate_mode)) {
> /* Successfully isolated */
> goto isolate_success;
> }
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f53838f..c94038e 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -852,7 +852,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> * in order to avoid burning cycles at rmap level, and perform
> * the page migration right away (proteced by page lock).
> */
> - rc = balloon_page_migrate(newpage, page, mode);
Here:
lock_page(newpage);
> + rc = balloon_page_migrate(page->mapping, newpage, page, mode);
unlock_page(newpage);
> goto out_unlock;
> }
>
> --
> 2.1.4
>
One more note below.
On Tue, Jul 7, 2015 at 7:36 AM, Gioh Kim <[email protected]> wrote:
> From: Gioh Kim <[email protected]>
>
> Add a framework to register callback functions and check page mobility.
> There are several page isolation modes, so the isolate interface
> takes the page address and the isolation mode as arguments, while the
> putback interface takes only the page address.
>
> Signed-off-by: Gioh Kim <[email protected]>
> ---
> fs/proc/page.c | 3 ++
> include/linux/compaction.h | 76 ++++++++++++++++++++++++++++++++++
> include/linux/fs.h | 2 +
> include/linux/page-flags.h | 19 +++++++++
> include/uapi/linux/kernel-page-flags.h | 1 +
> 5 files changed, 101 insertions(+)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 7eee2d8..a4f5a00 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -146,6 +146,9 @@ u64 stable_page_flags(struct page *page)
> if (PageBalloon(page))
> u |= 1 << KPF_BALLOON;
>
> + if (PageMobile(page))
> + u |= 1 << KPF_MOBILE;
> +
> u |= kpf_copy_bit(k, KPF_LOCKED, PG_locked);
>
> u |= kpf_copy_bit(k, KPF_SLAB, PG_slab);
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index aa8f61c..c375a89 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -1,6 +1,9 @@
> #ifndef _LINUX_COMPACTION_H
> #define _LINUX_COMPACTION_H
>
> +#include <linux/page-flags.h>
> +#include <linux/pagemap.h>
> +
> /* Return values for compact_zone() and try_to_compact_pages() */
> /* compaction didn't start as it was deferred due to past failures */
> #define COMPACT_DEFERRED 0
> @@ -51,6 +54,66 @@ extern void compaction_defer_reset(struct zone *zone, int order,
> bool alloc_success);
> extern bool compaction_restarting(struct zone *zone, int order);
>
> +static inline bool mobile_page(struct page *page)
> +{
> + return page->mapping && page->mapping->a_ops &&
> + (PageMobile(page) || PageBalloon(page));
> +}
> +
> +static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
> +{
> + bool ret;
> +
> + /*
> + * Avoid burning cycles with pages that are yet under __free_pages(),
> + * or just got freed under us.
> + *
> + * In case we 'win' a race for a mobile page being freed under us and
> + * raise its refcount, preventing __free_pages() from doing its job,
> + * the put_page() at the end of this block will take care of
> + * releasing this page, thus avoiding a nasty leak.
> + */
> + if (likely(get_page_unless_zero(page))) {
> + /*
> + * As mobile pages are not isolated from LRU lists, concurrent
> + * compaction threads can race against page migration functions
> + * as well as race against the page being released.
> + *
> + * In order to avoid having an already isolated mobile page
> + * being (wrongly) re-isolated while it is under migration,
> + * or to avoid attempting to isolate pages being released,
> + * let's be sure we have the page lock
> + * before proceeding with the mobile page isolation steps.
> + */
> + if (likely(trylock_page(page))) {
> + if (mobile_page(page) &&
> + page->mapping->a_ops->isolatepage) {
> + ret = page->mapping->a_ops->isolatepage(page,
> + mode);
> + unlock_page(page);
Here you leak the page reference if isolatepage() fails.
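A minimal fix, untested sketch: drop the extra reference taken above
whenever the driver refuses the isolation, e.g.:

			ret = page->mapping->a_ops->isolatepage(page,
								mode);
			unlock_page(page);
			if (!ret)
				/* driver refused: drop the ref we took */
				put_page(page);
			return ret;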
> + return ret;
> + }
> + unlock_page(page);
> + }
> + put_page(page);
> + }
> + return false;
> +}
> +
> +static inline void putback_mobilepage(struct page *page)
> +{
> + /*
> + * 'lock_page()' stabilizes the page and prevents races against
> + * concurrent isolation threads attempting to re-isolate it.
> + */
> + lock_page(page);
> + if (mobile_page(page) && page->mapping->a_ops->putbackpage) {
> + page->mapping->a_ops->putbackpage(page);
> + /* drop the extra ref count taken for mobile page isolation */
> + put_page(page);
> + }
> + unlock_page(page);
> +}
> #else
> static inline unsigned long try_to_compact_pages(gfp_t gfp_mask,
> unsigned int order, int alloc_flags,
> @@ -83,6 +146,19 @@ static inline bool compaction_deferred(struct zone *zone, int order)
> return true;
> }
>
> +static inline bool mobile_page(struct page *page)
> +{
> + return false;
> +}
> +
> +static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
> +{
> + return false;
> +}
> +
> +static inline void putback_mobilepage(struct page *page)
> +{
> +}
> #endif /* CONFIG_COMPACTION */
>
> #if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 35ec87e..33c9aa5 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -395,6 +395,8 @@ struct address_space_operations {
> */
> int (*migratepage) (struct address_space *,
> struct page *, struct page *, enum migrate_mode);
> + bool (*isolatepage) (struct page *, isolate_mode_t);
> + void (*putbackpage) (struct page *);
> int (*launder_page) (struct page *);
> int (*is_partially_uptodate) (struct page *, unsigned long,
> unsigned long);
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index f34e040..abef145 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -582,6 +582,25 @@ static inline void __ClearPageBalloon(struct page *page)
> atomic_set(&page->_mapcount, -1);
> }
>
> +#define PAGE_MOBILE_MAPCOUNT_VALUE (-255)
> +
> +static inline int PageMobile(struct page *page)
> +{
> + return atomic_read(&page->_mapcount) == PAGE_MOBILE_MAPCOUNT_VALUE;
> +}
> +
> +static inline void __SetPageMobile(struct page *page)
> +{
> + VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
> + atomic_set(&page->_mapcount, PAGE_MOBILE_MAPCOUNT_VALUE);
> +}
> +
> +static inline void __ClearPageMobile(struct page *page)
> +{
> + VM_BUG_ON_PAGE(!PageMobile(page), page);
> + atomic_set(&page->_mapcount, -1);
> +}
> +
> /*
> * If network-based swap is enabled, sl*b must keep track of whether pages
> * were allocated from pfmemalloc reserves.
> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> index a6c4962..d50d9e8 100644
> --- a/include/uapi/linux/kernel-page-flags.h
> +++ b/include/uapi/linux/kernel-page-flags.h
> @@ -33,6 +33,7 @@
> #define KPF_THP 22
> #define KPF_BALLOON 23
> #define KPF_ZERO_PAGE 24
> +#define KPF_MOBILE 25
>
>
> #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> --
> 2.1.4
>
>> @@ -51,6 +54,66 @@ extern void compaction_defer_reset(struct zone *zone, int order,
>> bool alloc_success);
>> extern bool compaction_restarting(struct zone *zone, int order);
>>
>> +static inline bool mobile_page(struct page *page)
>> +{
>> + return page->mapping && page->mapping->a_ops &&
>
> Dereferencing mapping->a_ops isn't safe without the page lock and isn't required:
> all mappings always have ->a_ops.
>
I got it.
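Then the check can shrink to something like this (sketch, with the ->a_ops
test dropped as you suggest):

static inline bool mobile_page(struct page *page)
{
	return page->mapping && (PageMobile(page) || PageBalloon(page));
}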
>> +static inline void putback_mobilepage(struct page *page)
>> +{
>> + /*
>> + * 'lock_page()' stabilizes the page and prevents races against
>> + * concurrent isolation threads attempting to re-isolate it.
>> + */
>> + lock_page(page);
>> + if (mobile_page(page) && page->mapping->a_ops->putbackpage) {
>
> It seems "if (page->mapping && page->mapping->a_ops->putbackpage)"
> should be enough: we have already seen that the page is mobile.
Ditto.
>
>> + page->mapping->a_ops->putbackpage(page);
>> + /* drop the extra ref count taken for mobile page isolation */
>> + put_page(page);
>> + }
>> + unlock_page(page);
>
> Call put_page() after unlock, and do that unconditionally -- putback must
> drop the page reference taken by the caller.
>
> lock_page(page);
> if (page->mapping && page->mapping->a_ops->putbackpage)
> page->mapping->a_ops->putbackpage(page);
> unlock_page(page);
> put_page(page);
>
Ditto.
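With both changes folded in, the helper would look roughly like this
(sketch):

static inline void putback_mobilepage(struct page *page)
{
	/* lock_page() prevents races against concurrent re-isolation */
	lock_page(page);
	if (page->mapping && page->mapping->a_ops->putbackpage)
		page->mapping->a_ops->putbackpage(page);
	unlock_page(page);
	/* always drop the reference taken at isolation time */
	put_page(page);
}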
>> +}
>> #else
>> static inline unsigned long try_to_compact_pages(gfp_t gfp_mask,
>> unsigned int order, int alloc_flags,
>> @@ -83,6 +146,19 @@ static inline bool compaction_deferred(struct zone *zone, int order)
>> return true;
>> }
>>
>> +static inline bool mobile_page(struct page *page)
>> +{
>> + return false;
>> +}
>> +
>> +static inline bool isolate_mobilepage(struct page *page, isolate_mode_t mode)
>> +{
>> + return false;
>> +}
>> +
>> +static inline void putback_mobilepage(struct page *page)
>> +{
>> +}
>> #endif /* CONFIG_COMPACTION */
>>
>> #if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 35ec87e..33c9aa5 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -395,6 +395,8 @@ struct address_space_operations {
>> */
>> int (*migratepage) (struct address_space *,
>> struct page *, struct page *, enum migrate_mode);
>> + bool (*isolatepage) (struct page *, isolate_mode_t);
>> + void (*putbackpage) (struct page *);
>> int (*launder_page) (struct page *);
>> int (*is_partially_uptodate) (struct page *, unsigned long,
>> unsigned long);
>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>> index f34e040..abef145 100644
>> --- a/include/linux/page-flags.h
>> +++ b/include/linux/page-flags.h
>> @@ -582,6 +582,25 @@ static inline void __ClearPageBalloon(struct page *page)
>> atomic_set(&page->_mapcount, -1);
>> }
>>
>> +#define PAGE_MOBILE_MAPCOUNT_VALUE (-255)
>> +
>> +static inline int PageMobile(struct page *page)
>> +{
>> + return atomic_read(&page->_mapcount) == PAGE_MOBILE_MAPCOUNT_VALUE;
>> +}
>> +
>> +static inline void __SetPageMobile(struct page *page)
>> +{
>> + VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
>> + atomic_set(&page->_mapcount, PAGE_MOBILE_MAPCOUNT_VALUE);
>> +}
>> +
>> +static inline void __ClearPageMobile(struct page *page)
>> +{
>> + VM_BUG_ON_PAGE(!PageMobile(page), page);
>> + atomic_set(&page->_mapcount, -1);
>> +}
>> +
>> /*
>> * If network-based swap is enabled, sl*b must keep track of whether pages
>> * were allocated from pfmemalloc reserves.
>> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
>> index a6c4962..d50d9e8 100644
>> --- a/include/uapi/linux/kernel-page-flags.h
>> +++ b/include/uapi/linux/kernel-page-flags.h
>> @@ -33,6 +33,7 @@
>> #define KPF_THP 22
>> #define KPF_BALLOON 23
>> #define KPF_ZERO_PAGE 24
>> +#define KPF_MOBILE 25
>>
>>
>> #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
>> --
>> 2.1.4
>>
>
I fixed the code according to your comments, and I found that patches 3/5
and 4/5 cannot be applied separately. So I merged them and will post a new
[PATCH]. I appreciate your reviews.