2005-11-08 21:05:51

by Christoph Lameter

Subject: [PATCH 0/8] Direct Migration V2: Overview

Changes V1->V2:
- Call node_remap with the right parameters in do_migrate_pages().
- Take radix tree lock while examining page count to avoid races with
find_get_page() and various *_get_pages based on it.
- Convert direct ptes to swap ptes before radix tree update to avoid
more races.
- Fix problem if CONFIG_MIGRATION is off for buffer_migrate_page
- Add documentation about page migration
- Change migrate_pages() api so that the caller can decide what
to do about the migrated pages (badmem handling and hotplug
have to remove those pages for good).
- Drop config patch (already in mm)
- Add try_to_unmap patch
- Patchset now against 2.6.14-mm1 without requiring additional patches.

Note that the page migration here is different from that of the memory
hotplug project: pages are migrated in order to improve performance.
A best effort is made to migrate all pages that are in use by user space
and that are swappable. If some pages cannot be moved then the
performance of a process will not improve as much as desired, but the
application will continue to function properly.

Many of the ideas for this code were originally developed in the memory
hotplug project, and we hope that the hotplug project can in turn build
on this patch in order to reach its goals. At SGI we would also like to
be able to migrate away from bad memory, which is likely to be based on
this patchset as well.

I am very thankful for the support of the hotplug developers in bringing
this patchset about. The migration of kernel pages, slab pages and
other unswappable pages, which is also needed by the hotplug project
and for the remapping of bad memory, is likely to require a significant
amount of additional changes to the Linux kernel beyond the scope of
this page migration endeavor.

Page migration can be triggered via:

A. Specifying MPOL_MF_MOVE(_ALL) when setting a new policy
for a range of addresses of a process.

B. Calling sys_migrate_pages() to control the location of the pages of
another process. Pages may migrate back through swapping if memory
policies, cpuset nodes and the node on which the process is executing
are not changed by other means.
sys_migrate_pages() may be particularly useful to move the pages of
a process if the scheduler has shifted the execution of a process
to a different node.

C. Changing the cpuset of a task (moving tasks to another cpuset or modifying
its set of allowed nodes) if a special option is set in the cpuset. The
cpuset code will call into the page migration layer in order to move the
process to its new environment. This is the preferred and easiest method
to use page migration. Thanks to Paul Jackson for realizing this
functionality [The additional cpuset functions are not in Andrew's tree yet].
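For illustration, here is a minimal userspace sketch of methods A and B. It
assumes the MPOL_MF_MOVE definitions from the swap migration patches are
visible through <numaif.h> and that sys_migrate_pages has been assigned a
syscall number on the architecture at hand; the helper names are made up
for this example:

#include <numaif.h>             /* mbind(), MPOL_* */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* A: rebind a range of our own address space to 'node' and move its pages */
static int move_range_to_node(void *addr, size_t len, int node)
{
        unsigned long nodemask = 1UL << node;

        return mbind(addr, len, MPOL_BIND, &nodemask,
                     sizeof(nodemask) * 8, MPOL_MF_MOVE);
}

/* B: move all of another process' pages from node 0 to node 1 */
static long migrate_task_pages(pid_t pid)
{
        unsigned long from = 1UL << 0;
        unsigned long to = 1UL << 1;

        return syscall(__NR_migrate_pages, pid,
                       sizeof(unsigned long) * 8, &from, &to);
}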

The patchset consists of eight patches (only the first three are necessary to
have basic direct migration support):

1. Swap migration V5 fixes.

Some small fixes that may already be in Andrew's tree.

2. SwapCache patch

SwapCache pages may have changed their type after lock_page().
Check for this and retry lookup if the page is no longer a SwapCache
page.

3. migrate_pages()

Basic direct migration with fallback to swap if all other attempts
fail.

4. remove_from_swap()

Page migration installs swap ptes for anonymous pages in order to
preserve the information contained in the page tables. This patch
removes the swap ptes and replaces them with real ones after migration.

5. upgrade of MPOL_MF_MOVE and sys_migrate_pages()

Add logic to mm/mempolicy.c to allow the policy layer to control
direct page migration. Thanks to Paul Jackson for the iterative
logic to move between sets of nodes.


6. buffer_migrate_pages() patch

Allow migration without writing back dirty pages. Add filesystem dependent
migration support for ext2/ext3 and xfs. Use swapper space to define a special
method to migrate anonymous pages without writeback.

7. add_to_swap with gfp mask

The default of add_to_swap is to use GFP_ATOMIC for necessary allocations.
This may cause out of memory situations during page migration. This patch
adds an additional parameter to add_to_swap to allow GFP_KERNEL allocations.

8. try_to_unmap patch

Allows distinguishing between permanent failure conditions and transient
conditions that may go away after a retry.
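
Taken together, these patches give kernel callers the following calling
convention. A condensed sketch, based on migrate_pages_to() from patch 5
(error handling omitted, not a complete function):

        LIST_HEAD(pagelist);    /* sources, isolated via isolate_lru_page() */
        LIST_HEAD(newlist);     /* preallocated target pages */
        LIST_HEAD(moved);       /* filled in: successfully migrated pages */
        LIST_HEAD(failed);      /* filled in: permanent failures */
        int nr_left;

        /* ... fill pagelist from the LRU and newlist with e.g.
           alloc_pages_node(dest, GFP_HIGHUSER, 0) ... */

        nr_left = migrate_pages(&pagelist, &newlist, &moved, &failed);

        /* The caller decides what happens to the pages afterwards: */
        putback_lru_pages(&moved);      /* performance migration: back to LRU */
        putback_lru_pages(&failed);     /* hotplug/badmem would remove these */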

Credits (also in mm/vmscan.c):

The idea for this scheme of page migration was first developed in the context
of the memory hotplug project. The main authors of the migration code from
the memory hotplug project are:

IWAMOTO Toshihiro <[email protected]>
Hirokazu Takahashi <[email protected]>
Dave Hansen <[email protected]>


2005-11-08 21:03:51

by Christoph Lameter

Subject: [PATCH 1/8] Direct Migration V2: Swap migration patchset fixes

Fixes to swap migration patch V5 (may already be in mm)

- Fix comment for isolate_lru_page() and the check for the result
of __isolate_lru_page in isolate_lru_pages()

- migrate_page_add: check for mapping == NULL

- check_range: It's okay if the first vma has the VM_RESERVED flag set when the
MPOL_MF_DISCONTIG_OK flag was specified by the caller.

- Change the permission check to use comparisons instead of XORs.
Revise the comments.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-07 11:48:47.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-08 11:17:13.000000000 -0800
@@ -755,7 +755,7 @@ static int isolate_lru_pages(struct zone
/* Succeeded to isolate page */
list_add(&page->lru, dst);
break;
- case -1:
+ case -ENOENT:
/* Not possible to isolate */
list_move(&page->lru, src);
break;
@@ -782,7 +782,7 @@ static void lru_add_drain_per_cpu(void *
* Result:
* 0 = page not on LRU list
* 1 = page removed from LRU list and added to the specified list.
- * -1 = page is being freed elsewhere.
+ * -ENOENT = page is being freed elsewhere.
*/
int isolate_lru_page(struct page *page)
{
Index: linux-2.6.14-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/mempolicy.c 2005-11-07 11:48:26.000000000 -0800
+++ linux-2.6.14-mm1/mm/mempolicy.c 2005-11-08 11:16:42.000000000 -0800
@@ -217,6 +217,7 @@ static void migrate_page_add(struct vm_a
* Avoid migrating a page that is shared by others and not writable.
*/
if ((flags & MPOL_MF_MOVE_ALL) ||
+ !page->mapping ||
PageAnon(page) ||
mapping_writably_mapped(page->mapping) ||
single_mm_mapping(vma->vm_mm, page->mapping)
@@ -359,7 +360,8 @@ check_range(struct mm_struct *mm, unsign
first = find_vma(mm, start);
if (!first)
return ERR_PTR(-EFAULT);
- if (first->vm_flags & VM_RESERVED)
+ if (first->vm_flags & VM_RESERVED &&
+ !(flags & MPOL_MF_DISCONTIG_OK))
return ERR_PTR(-EACCES);
prev = NULL;
for (vma = first; vma && vma->vm_start < end; vma = vma->vm_next) {
@@ -790,18 +792,13 @@ asmlinkage long sys_migrate_pages(pid_t
return -EINVAL;

/*
- * We only allow a process to move the pages of another
- * if the process issuing sys_migrate has the right to send a kill
- * signal to the process to be moved. Moving another processes
- * memory may impact the performance of that process. If the
- * process issuing sys_migrate_pages has the right to kill the
- * target process then obviously that process has the right to
- * impact the performance of the target process.
- *
- * The permission check was taken from check_kill_permission()
+ * Check if this process has the right to modify the specified
+ * process. The right exists if the process has administrative
+ * capabilities, superuser privileges or the same
+ * userid as the target process.
*/
- if ((current->euid ^ task->suid) && (current->euid ^ task->uid) &&
- (current->uid ^ task->suid) && (current->uid ^ task->uid) &&
+ if ((current->euid != task->suid) && (current->euid != task->uid) &&
+ (current->uid != task->suid) && (current->uid != task->uid) &&
!capable(CAP_SYS_ADMIN)) {
err = -EPERM;
goto out;

2005-11-08 21:03:50

by Christoph Lameter

Subject: [PATCH 2/8] Direct Migration V2: PageSwapCache checks

Check for PageSwapCache after looking up and locking a swap page.

The page migration code may change a swap pte to point to a different page
under lock_page().

If that happens then the vm must retry the lookup operation in the swap
space to find the correct page number. There are a couple of locations
in the VM where a lock_page() is done on a swap page. In these locations
we need to check afterwards if the page was migrated. If the page was migrated
then the old page that was looked up before was freed and no longer has the
PageSwapCache bit set.

Signed-off-by: Hirokazu Takahashi <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/mm/memory.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/memory.c 2005-11-07 11:48:19.000000000 -0800
+++ linux-2.6.14-mm1/mm/memory.c 2005-11-07 11:55:08.000000000 -0800
@@ -1720,6 +1720,7 @@ static int do_swap_page(struct mm_struct
goto out;

entry = pte_to_swp_entry(orig_pte);
+again:
page = lookup_swap_cache(entry);
if (!page) {
swapin_readahead(entry, address, vma);
@@ -1743,6 +1744,12 @@ static int do_swap_page(struct mm_struct

mark_page_accessed(page);
lock_page(page);
+ if (!PageSwapCache(page)) {
+ /* Page migration has occurred */
+ unlock_page(page);
+ page_cache_release(page);
+ goto again;
+ }

/*
* Back out if somebody else already faulted in this pte.
Index: linux-2.6.14-mm1/mm/shmem.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/shmem.c 2005-11-07 11:48:08.000000000 -0800
+++ linux-2.6.14-mm1/mm/shmem.c 2005-11-07 11:55:08.000000000 -0800
@@ -1013,6 +1013,14 @@ repeat:
page_cache_release(swappage);
goto repeat;
}
+ if (!PageSwapCache(swappage)) {
+ /* Page migration has occurred */
+ shmem_swp_unmap(entry);
+ spin_unlock(&info->lock);
+ unlock_page(swappage);
+ page_cache_release(swappage);
+ goto repeat;
+ }
if (PageWriteback(swappage)) {
shmem_swp_unmap(entry);
spin_unlock(&info->lock);
Index: linux-2.6.14-mm1/mm/swapfile.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/swapfile.c 2005-11-07 11:48:49.000000000 -0800
+++ linux-2.6.14-mm1/mm/swapfile.c 2005-11-07 11:55:08.000000000 -0800
@@ -624,6 +624,7 @@ static int try_to_unuse(unsigned int typ
*/
swap_map = &si->swap_map[i];
entry = swp_entry(type, i);
+again:
page = read_swap_cache_async(entry, NULL, 0);
if (!page) {
/*
@@ -658,6 +659,12 @@ static int try_to_unuse(unsigned int typ
wait_on_page_locked(page);
wait_on_page_writeback(page);
lock_page(page);
+ if (!PageSwapCache(page)) {
+ /* Page migration has occurred */
+ unlock_page(page);
+ page_cache_release(page);
+ goto again;
+ }
wait_on_page_writeback(page);

/*

2005-11-08 21:04:15

by Christoph Lameter

Subject: [PATCH 4/8] Direct Migration V2: remove_from_swap() to remove swap ptes

Add remove_from_swap

remove_from_swap() allows the restoration of the pte entries that existed
before page migration occurred for anonymous pages by walking the reverse
maps. This reduces swap use and establishes regular pte's without the need
for page faults.

It may also fix a leak of swap entries that could occur if a page
is freed without locking it first (zap_pte_range?) while migration occurs.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/swap.h 2005-11-08 10:15:02.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/swap.h 2005-11-08 10:16:58.000000000 -0800
@@ -266,6 +266,7 @@ extern int remove_exclusive_swap_page(st
struct backing_dev_info;

extern spinlock_t swap_lock;
+extern int unuse_vma(struct vm_area_struct *vma, swp_entry_t entry, struct page *page);

/* linux/mm/thrash.c */
extern struct mm_struct * swap_token_mm;
Index: linux-2.6.14-mm1/mm/swapfile.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/swapfile.c 2005-11-08 10:15:00.000000000 -0800
+++ linux-2.6.14-mm1/mm/swapfile.c 2005-11-08 10:16:58.000000000 -0800
@@ -477,7 +477,7 @@ static inline int unuse_pud_range(struct
return 0;
}

-static int unuse_vma(struct vm_area_struct *vma,
+int unuse_vma(struct vm_area_struct *vma,
swp_entry_t entry, struct page *page)
{
pgd_t *pgd;
Index: linux-2.6.14-mm1/mm/rmap.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/rmap.c 2005-11-07 11:48:08.000000000 -0800
+++ linux-2.6.14-mm1/mm/rmap.c 2005-11-08 10:16:58.000000000 -0800
@@ -206,6 +206,25 @@ out:
}

/*
+ * Remove an anonymous page from swap replacing the swap pte's
+ * through real pte's pointing to valid pages.
+ */
+void remove_from_swap(struct page *page)
+{
+ struct anon_vma *anon_vma;
+ struct vm_area_struct *vma;
+
+ anon_vma = page_lock_anon_vma(page);
+ if (!anon_vma)
+ return;
+ list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
+ swp_entry_t entry = { .val = page_private(page) };
+
+ unuse_vma(vma, entry, page);
+ }
+}
+
+/*
* At what user virtual address is page expected in vma?
*/
static inline unsigned long
Index: linux-2.6.14-mm1/include/linux/rmap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/rmap.h 2005-11-07 11:48:08.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/rmap.h 2005-11-08 10:16:58.000000000 -0800
@@ -91,6 +91,7 @@ static inline void page_dup_rmap(struct
*/
int page_referenced(struct page *, int is_locked, int ignore_token);
int try_to_unmap(struct page *);
+void remove_from_swap(struct page *page);

/*
* Called from mm/filemap_xip.c to unmap empty zero page
Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-08 10:16:53.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-08 10:16:58.000000000 -0800
@@ -959,12 +959,15 @@ next:

else if (rc) {
/* Permanent failure to migrate the page */
+ remove_from_swap(page);
list_move(&page->lru, failed);
nr_failed++;
}
- else if (newpage) {
- /* Successful migration. Return new page to LRU */
- move_to_lru(newpage);
+ else {
+ if (newpage) {
+ remove_from_swap(newpage);
+ move_to_lru(newpage);
+ }
list_move(&page->lru, moved);
}
}

2005-11-08 21:04:21

by Christoph Lameter

Subject: [PATCH 3/8] Direct Migration V2: migrate_pages() extension

Add direct migration support with fall back to swap.

Direct migration support on top of the swap based page migration facility.

This allows the direct migration of anonymous pages and the migration of
file backed pages by dropping the associated buffers (requires writeout).

Fall back to swap out if necessary.

The patch is based on lots of patches from the hotplug project but the code
was restructured, documented and simplified as much as possible.

Note that an additional patch that defines the migrate_page() method
for filesystems is necessary in order to avoid writeback for anonymous
and file backed pages.

V1->V2:
- Change migrate_pages() so that it can return pagelist for failed and
moved pages. No longer free the old pages but allow caller to dispose
of them.
- Unmap pages before changing reverse map under tree lock. Take
a write_lock instead of a read_lock.
- Add documentation

Signed-off-by: Mike Kravetz <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-08 11:17:13.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-08 11:44:13.000000000 -0800
@@ -575,10 +575,6 @@ keep:
/*
* swapout a single page
* page is locked upon entry, unlocked on exit
- *
- * return codes:
- * 0 = complete
- * 1 = retry
*/
static int swap_page(struct page *page)
{
@@ -594,69 +590,244 @@ static int swap_page(struct page *page)
case PAGE_KEEP:
case PAGE_ACTIVATE:
goto unlock_retry;
+
case PAGE_SUCCESS:
- goto retry;
+ return -EAGAIN;
+
case PAGE_CLEAN:
; /* try to free the page below */
}
}

if (PagePrivate(page)) {
- if (!try_to_release_page(page, GFP_KERNEL))
+ if (!try_to_release_page(page, GFP_KERNEL) ||
+ (!mapping && page_count(page) == 1))
goto unlock_retry;
- if (!mapping && page_count(page) == 1)
- goto free_it;
}

- if (!remove_mapping(mapping, page))
- goto unlock_retry; /* truncate got there first */
+ if (remove_mapping(mapping, page)) {
+ unlock_page(page);
+ return 0;
+ }
+
+unlock_retry:
+ unlock_page(page);
+
+ return -EAGAIN;
+}
+
+static inline void move_to_lru(struct page *page)
+{
+ list_del(&page->lru);
+ if (PageActive(page)) {
+ /*
+ * lru_cache_add_active checks that
+ * the PG_active bit is off.
+ */
+ ClearPageActive(page);
+ lru_cache_add_active(page);
+ } else
+ lru_cache_add(page);
+ put_page(page);
+}
+
+/*
+ * Page migration was developed in the context of the memory hotplug project.
+ * The main authors of the migration code are:
+ *
+ * IWAMOTO Toshihiro <[email protected]>
+ * Hirokazu Takahashi <[email protected]>
+ * Dave Hansen <[email protected]>
+ * Christoph Lameter <[email protected]>
+ */
+
+/*
+ * Remove references for a page and establish the new page with the correct
+ * basic settings to be able to stop accesses to the page.
+ */
+int migrate_page_remove_references(struct page *newpage, struct page *page, int nr_refs)
+{
+ struct address_space *mapping = page_mapping(page);
+ struct page **radix_pointer;
+ int i;
+
+ /*
+ * Avoid doing any of the following work if the page count
+ * indicates that the page is in use or truncate has removed
+ * the page.
+ */
+ if (!mapping || page_mapcount(page) + nr_refs != page_count(page))
+ return 1;
+
+ /*
+ * Establish swap ptes for anonymous pages or destroy pte
+ * maps for files.
+ *
+ * In order to reestablish file backed mappings the fault handlers
+ * will take the radix tree_lock which is then used to synchronize.
+ *
+ * A process accessing via a swap pte (an anonymous page) will take a
+ * page_lock on the old page which will block the process until the
+ * migration attempt is complete. At that time the PageSwapCache bit
+ * will be examined. If the page was migrated then the PageSwapCache
+ * bit will be clear and the operation to retrieve the page will be
+ * retried which will find the new page in the radix tree. Then a new
+ * direct mapping may be generated.
+ *
+ * If the page was not migrated then the PageSwapCache bit
+ * is still set and the operation may continue.
+ */
+ for(i = 0; i < 10 && page_mapped(page); i++) {
+ int rc = try_to_unmap(page);
+
+ if (rc == SWAP_SUCCESS)
+ break;
+ /*
+ * If there are other runnable processes then running
+ * them may make it possible to unmap the page
+ */
+ schedule();
+ }
+
+ /*
+ * Avoid taking the tree lock if there is no hope of success.
+ */
+ if (page_mapcount(page))
+ return 1;
+
+ write_lock_irq(&mapping->tree_lock);
+
+ radix_pointer = (struct page **)radix_tree_lookup_slot(
+ &mapping->page_tree,
+ page_index(page));
+
+ if (!page->mapping ||
+ page_count(page) != nr_refs ||
+ *radix_pointer != page) {
+ write_unlock_irq(&mapping->tree_lock);
+ return 1;
+ }

-free_it:
/*
- * We may free pages that were taken off the active list
- * by isolate_lru_page. However, free_hot_cold_page will check
- * if the active bit is set. So clear it.
+ * The page count for the old page may be raised by other kernel
+ * components at this point since no lock exists to prevent
+ * increasing the page_count. If that happens then the page will
+ * continue to exist as long as the kernel component keeps the
+ * page count high. The page has no other references left and it
+ * is not being written to, otherwise the page lock would have been
+ * taken.
+ *
+ * Filesystems increase the page count while holding the tree_lock
+ * which provides synchronization with this code.
*/
+
+ /*
+ * Certain minimal information about a page must be available
+ * in order for other subsystems to properly handle the page if they
+ * find it through the radix tree update before we are finished
+ * copying the page.
+ */
+ get_page(newpage);
+ newpage->index = page_index(page);
+ if (PageSwapCache(page)) {
+ SetPageSwapCache(newpage);
+ set_page_private(newpage, page_private(page));
+ } else
+ newpage->mapping = page->mapping;
+
+ *radix_pointer = newpage;
+ __put_page(page);
+ write_unlock_irq(&mapping->tree_lock);
+
+ return 0;
+}
+
+/*
+ * Copy the page to its new location
+ */
+void migrate_page_copy(struct page *newpage, struct page *page)
+{
+
+ /* Debug potential trouble with concurrent increases of page_count */
+ if (page_count(page) != 1)
+ printk(KERN_ERR "precheck: copying %p->%p page count=%d\n",
+ page, newpage, page_count(page));
+
+ copy_highpage(newpage, page);
+
+ if (PageError(page))
+ SetPageError(newpage);
+ if (PageReferenced(page))
+ SetPageReferenced(newpage);
+ if (PageUptodate(page))
+ SetPageUptodate(newpage);
+ if (PageActive(page))
+ SetPageActive(newpage);
+ if (PageChecked(page))
+ SetPageChecked(newpage);
+ if (PageMappedToDisk(page))
+ SetPageMappedToDisk(newpage);
+
+ if (PageDirty(page)) {
+ clear_page_dirty_for_io(page);
+ set_page_dirty(newpage);
+ }
+
+ ClearPageSwapCache(page);
ClearPageActive(page);
+ ClearPagePrivate(page);
+ set_page_private(page, 0);
+ page->mapping = NULL;
+
+ if (page_count(page) != 1)
+ printk(KERN_ERR "postcheck: copying %p->%p page count=%d\n",
+ page, newpage, page_count(page));

- list_del(&page->lru);
- unlock_page(page);
- put_page(page);
- return 0;
+ /*
+ * If any waiters have accumulated on the new page then
+ * wake them up.
+ */
+ if (PageWriteback(newpage))
+ end_page_writeback(newpage);
+}

-unlock_retry:
- unlock_page(page);
+/*
+ * Common logic to directly migrate a single page suitable for
+ * pages that do not use PagePrivate.
+ *
+ * Pages are locked upon entry and exit.
+ */
+int migrate_page(struct page *newpage, struct page *page)
+{
+ BUG_ON(PageWriteback(page)); /* Writeback must be complete */
+
+ if (migrate_page_remove_references(newpage, page, 2))
+ return -EAGAIN;

-retry:
- return 1;
+ migrate_page_copy(newpage, page);
+
+ return 0;
}
+
/*
* migrate_pages
*
- * Two lists are passed to this function. The first list
- * contains the pages isolated from the LRU to be migrated.
- * The second list contains new pages that the pages isolated
- * can be moved to. If the second list is NULL then all
- * pages are swapped out.
- *
* The function returns after 10 attempts or if no pages
- * are movable anymore because t has become empty
+ * are movable anymore because to has become empty
* or no retryable pages exist anymore.
*
- * SIMPLIFIED VERSION: This implementation of migrate_pages
- * is only swapping out pages and never touches the second
- * list. The direct migration patchset
- * extends this function to avoid the use of swap.
+ * Return: Number of pages not migrated when to ran empty.
*/
-int migrate_pages(struct list_head *l, struct list_head *t)
+int migrate_pages(struct list_head *from, struct list_head *to,
+ struct list_head *moved, struct list_head *failed)
{
int retry;
- LIST_HEAD(failed);
int nr_failed = 0;
int pass = 0;
struct page *page;
struct page *page2;
int swapwrite = current->flags & PF_SWAPWRITE;
+ int rc = 0;

if (!swapwrite)
current->flags |= PF_SWAPWRITE;
@@ -664,50 +835,137 @@ int migrate_pages(struct list_head *l, s
redo:
retry = 0;

- list_for_each_entry_safe(page, page2, l, lru) {
+ list_for_each_entry_safe(page, page2, from, lru) {
+ struct page *newpage = NULL;
+ struct address_space *mapping;
+
cond_resched();

+ if (to && list_empty(to))
+ break;
+
+ if (page_count(page) == 1) {
+ /* page was freed from under us. So we are done. */
+ list_move(&page->lru, moved);
+ continue;
+ }
+
/*
* Skip locked pages during the first two passes to give the
- * functions holding the lock time to release the page. Later we use
- * lock_page to have a higher chance of acquiring the lock.
+ * functions holding the lock time to release the page.
+ * Later we use lock_page() to have a higher chance of
+ * acquiring the lock.
*/
+ rc = -EAGAIN;
if (pass > 2)
lock_page(page);
else
if (TestSetPageLocked(page))
- goto retry_later;
+ goto next;

/*
- * Only wait on writeback if we have already done a pass where
- * we we may have triggered writeouts for lots of pages.
+ * Only wait on writeback if we have already done a pass
+ * where we may have triggered writeouts.
*/
if (pass > 0)
wait_on_page_writeback(page);
else
- if (PageWriteback(page)) {
- unlock_page(page);
- goto retry_later;
- }
+ if (PageWriteback(page))
+ goto unlock_page;

#ifdef CONFIG_SWAP
+ /*
+ * Anonymous pages must have swap cache references otherwise
+ * the information contained in the page maps cannot be
+ * preserved.
+ */
if (PageAnon(page) && !PageSwapCache(page)) {
if (!add_to_swap(page)) {
unlock_page(page);
- list_move(&page->lru, &failed);
+ list_move(&page->lru, failed);
nr_failed++;
continue;
}
}
#endif /* CONFIG_SWAP */

+ if (!to) {
+ rc = swap_page(page);
+ goto next;
+ }
+
+ newpage = lru_to_page(to);
+ lock_page(newpage);
+
/*
* Page is properly locked and writeback is complete.
* Try to migrate the page.
*/
- if (swap_page(page)) {
-retry_later:
+ mapping = page_mapping(page);
+ if (!mapping)
+ goto unlock_both;
+
+ /*
+ * Trigger writeout if page is dirty
+ */
+ if (PageDirty(page)) {
+ switch (pageout(page, mapping)) {
+ case PAGE_KEEP:
+ case PAGE_ACTIVATE:
+ goto unlock_both;
+
+ case PAGE_SUCCESS:
+ unlock_page(newpage);
+ goto next;
+
+ case PAGE_CLEAN:
+ ; /* try to migrate the page below */
+ }
+ }
+
+ /*
+ * If we have no buffer or can release the buffers
+ * then do a simple migration.
+ */
+ if (!page_has_buffers(page) ||
+ try_to_release_page(page, GFP_KERNEL)) {
+ rc = migrate_page(newpage, page);
+ goto unlock_both;
+ }
+
+ /*
+ * On early passes with mapped pages simply
+ * retry. There may be a lock held for some
+ * buffers that may go away later. Later
+ * swap them out.
+ */
+ if (pass > 4) {
+ unlock_page(newpage);
+ newpage = NULL;
+ rc = swap_page(page);
+ goto next;
+ }
+
+unlock_both:
+ unlock_page(newpage);
+
+unlock_page:
+ unlock_page(page);
+
+next:
+ if (rc == -EAGAIN)
+ /* Page should be retried later */
retry++;
+
+ else if (rc) {
+ /* Permanent failure to migrate the page */
+ list_move(&page->lru, failed);
+ nr_failed++;
+ }
+ else if (newpage) {
+ /* Successful migration. Return new page to LRU */
+ move_to_lru(newpage);
+ list_move(&page->lru, moved);
}
}
if (retry && pass++ < 10)
@@ -716,9 +974,6 @@ retry_later:
if (!swapwrite)
current->flags &= ~PF_SWAPWRITE;

- if (!list_empty(&failed))
- list_splice(&failed, l);
-
return nr_failed + retry;
}
#endif
@@ -868,21 +1123,6 @@ done:
pagevec_release(&pvec);
}

-static inline void move_to_lru(struct page *page)
-{
- list_del(&page->lru);
- if (PageActive(page)) {
- /*
- * lru_cache_add_active checks that
- * the PG_active bit is off.
- */
- ClearPageActive(page);
- lru_cache_add_active(page);
- } else
- lru_cache_add(page);
- put_page(page);
-}
-
/*
* Add isolated pages on the list back to the LRU
*
Index: linux-2.6.14-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/swap.h 2005-11-07 11:48:20.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/swap.h 2005-11-08 11:18:34.000000000 -0800
@@ -180,7 +180,12 @@ extern int isolate_lru_page(struct page
extern int putback_lru_pages(struct list_head *l);

#ifdef CONFIG_MIGRATION
-extern int migrate_pages(struct list_head *l, struct list_head *t);
+extern int migrate_pages(struct list_head *l, struct list_head *t,
+ struct list_head *moved, struct list_head *failed);
+
+extern int migrate_page(struct page *, struct page *);
+extern int migrate_page_remove_references(struct page *, struct page *, int);
+extern void migrate_page_copy(struct page *, struct page *);
#endif

#ifdef CONFIG_MMU
Index: linux-2.6.14-mm1/Documentation/vm/page_migration
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-mm1/Documentation/vm/page_migration 2005-11-08 11:18:34.000000000 -0800
@@ -0,0 +1,106 @@
+Page migration
+--------------
+
+Page migration occurs in several steps. First comes a high level
+description for those trying to use migrate_pages(), followed by
+a low level description of how the details work.
+
+
+A. Use of migrate_pages()
+-------------------------
+
+1. Remove pages from the LRU.
+
+ Lists of pages to be migrated are generated by scanning over
+ pages and moving them into lists. This is done by
+ calling isolate_lru_page() or __isolate_lru_page().
+ Calling isolate_lru_page increases the references to the page
+ so that it cannot vanish under us.
+
+2. Generate a list of newly allocated pages to move the contents
+ of the first list to.
+
+3. The migrate_pages() function is called which attempts
+ to do the migration. It returns the moved pages in the
+ list specified as the third parameter and the failed
+ migrations in the fourth parameter. The first parameter
+ will contain the pages that could still be retried.
+
+4. The leftover pages of various types are returned
+ to the LRU using putback_lru_pages() or otherwise
+ disposed of. The pages will still have the refcount as
+ increased by isolate_lru_page()!
+
+B. Operation of migrate_pages()
+--------------------------------
+
+migrate_pages does several passes over its list of pages. A page is moved
+if all references to a page are removable at the time.
+
+Steps:
+
+1. Lock the page to be migrated
+
+2. Ensure that writeback is complete.
+
+3. Make sure that the page has an assigned swap cache entry if
+ it is an anonymous page. The swap cache reference is necessary
+ to preserve the information contained in the page table maps.
+
+4. Prep the new page that we want to move to. It is locked
+ and set to not being uptodate so that all accesses to the new
+ page immediately lock while we are moving references.
+
+5. All the page table references to the page are either dropped (file backed)
+ or converted to swap references (anonymous pages). This should decrease the
+ reference count.
+
+6. The radix tree lock is taken
+
+7. The refcount of the page is examined and we back out if references remain
+
+8. The radix tree is checked and if it does not contain the pointer to this
+ page then we back out.
+
+9. The mapping is checked. If the mapping is gone then a truncate action may
+ be in progress and we back out.
+
+10. The new page is prepped with some settings from the old page so that accesses
+ to the new page will be discovered to have the correct settings.
+
+11. The radix tree is changed to point to the new page.
+
+12. The reference count of the old page is dropped because the reference has now
+ been removed.
+
+13. The radix tree lock is dropped.
+
+14. The page contents are copied to the new page.
+
+15. The remaining page flags are copied to the new page.
+
+16. The old page flags are cleared to indicate that the page does
+ not carry any information anymore.
+
+17. Queued up writeback on the new page is triggered.
+
+18. The locks are dropped from the old and new page.
+
+19. The swapcache reference is removed from the new page.
+
+20. The new page is moved to the LRU.
+
+This system is not without its problems. The check for the number of
+references while holding the radix tree lock may race with another function
+on another processor incrementing the reference counter for a page. In that
+case we will be in a situation where the page will linger until the reference
+count is dropped by that processor. There are no other references to the page
+though. The kernel functions would have taken a lock on the page if the page
+were about to be written to.
+
+The page is therefore likely just lingering for read purposes for a short while.
+The copy page code contains a couple of printks to detect the situation and help
+if there are any issues with the lingering pages.
+
+Christoph Lameter, November 7, 2005.
+

2005-11-08 21:04:59

by Christoph Lameter

Subject: [PATCH 6/8] Direct Migration V2: Avoid writeback / page_migrate() method

Migrate a page with buffers without requiring writeback

This introduces a new address space operation migrate_page() that
may be used by a filesystem to implement its own version of page migration.

A version is provided that migrates buffers attached to pages. Some
filesystems (ext2, ext3, xfs) are modified to utilize this feature.

The swapper address space operations are modified so that a regular
migrate_pages() will occur for anonymous pages without writeback
(migrate_pages forces every anonymous page to have a swap entry).

V1->V2:
- Fix CONFIG_MIGRATION handling

Signed-off-by: Mike Kravetz <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/include/linux/fs.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/fs.h 2005-11-07 11:48:46.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/fs.h 2005-11-08 10:18:51.000000000 -0800
@@ -332,6 +332,8 @@ struct address_space_operations {
loff_t offset, unsigned long nr_segs);
struct page* (*get_xip_page)(struct address_space *, sector_t,
int);
+ /* migrate the contents of a page to the specified target */
+ int (*migrate_page) (struct page *, struct page *);
};

struct backing_dev_info;
@@ -1679,6 +1681,12 @@ extern void simple_release_fs(struct vfs

extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t);

+#ifdef CONFIG_MIGRATION
+extern int buffer_migrate_page(struct page *, struct page *);
+#else
+#define buffer_migrate_page(a,b) NULL
+#endif
+
extern int inode_change_ok(struct inode *, struct iattr *);
extern int __must_check inode_setattr(struct inode *, struct iattr *);

Index: linux-2.6.14-mm1/mm/swap_state.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/swap_state.c 2005-11-07 11:48:49.000000000 -0800
+++ linux-2.6.14-mm1/mm/swap_state.c 2005-11-08 10:18:51.000000000 -0800
@@ -26,6 +26,7 @@ static struct address_space_operations s
.writepage = swap_writepage,
.sync_page = block_sync_page,
.set_page_dirty = __set_page_dirty_nobuffers,
+ .migrate_page = migrate_page,
};

static struct backing_dev_info swap_backing_dev_info = {
Index: linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/xfs/linux-2.6/xfs_aops.c 2005-11-07 11:48:07.000000000 -0800
+++ linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_aops.c 2005-11-08 10:18:51.000000000 -0800
@@ -1348,4 +1348,5 @@ struct address_space_operations linvfs_a
.commit_write = generic_commit_write,
.bmap = linvfs_bmap,
.direct_IO = linvfs_direct_IO,
+ .migrate_page = buffer_migrate_page,
};
Index: linux-2.6.14-mm1/fs/buffer.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/buffer.c 2005-11-07 11:48:25.000000000 -0800
+++ linux-2.6.14-mm1/fs/buffer.c 2005-11-08 10:18:51.000000000 -0800
@@ -3026,6 +3026,70 @@ asmlinkage long sys_bdflush(int func, lo
}

/*
+ * Migration function for pages with buffers. This function can only be used
+ * if the underlying filesystem guarantees that no other references to "page"
+ * exist.
+ */
+#ifdef CONFIG_MIGRATION
+int buffer_migrate_page(struct page *newpage, struct page *page)
+{
+ struct address_space *mapping = page->mapping;
+ struct buffer_head *bh, *head;
+
+ if (!mapping)
+ return -EAGAIN;
+
+ if (!page_has_buffers(page))
+ return migrate_page(newpage, page);
+
+ head = page_buffers(page);
+
+ if (migrate_page_remove_references(newpage, page, 3))
+ return -EAGAIN;
+
+ spin_lock(&mapping->private_lock);
+
+ bh = head;
+ do {
+ get_bh(bh);
+ lock_buffer(bh);
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+
+ ClearPagePrivate(page);
+ set_page_private(newpage, page_private(page));
+ set_page_private(page, 0);
+ put_page(page);
+ get_page(newpage);
+
+ bh = head;
+ do {
+ set_bh_page(bh, newpage, bh_offset(bh));
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+
+ SetPagePrivate(newpage);
+ spin_unlock(&mapping->private_lock);
+
+ migrate_page_copy(newpage, page);
+
+ spin_lock(&mapping->private_lock);
+ bh = head;
+ do {
+ unlock_buffer(bh);
+ put_bh(bh);
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+ spin_unlock(&mapping->private_lock);
+
+ return 0;
+}
+#endif
+
+/*
* Buffer-head allocation
*/
static kmem_cache_t *bh_cachep;
Index: linux-2.6.14-mm1/fs/ext3/inode.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/ext3/inode.c 2005-11-07 11:48:24.000000000 -0800
+++ linux-2.6.14-mm1/fs/ext3/inode.c 2005-11-08 10:18:51.000000000 -0800
@@ -1562,6 +1562,7 @@ static struct address_space_operations e
.invalidatepage = ext3_invalidatepage,
.releasepage = ext3_releasepage,
.direct_IO = ext3_direct_IO,
+ .migrate_page = buffer_migrate_page,
};

static struct address_space_operations ext3_writeback_aops = {
@@ -1575,6 +1576,7 @@ static struct address_space_operations e
.invalidatepage = ext3_invalidatepage,
.releasepage = ext3_releasepage,
.direct_IO = ext3_direct_IO,
+ .migrate_page = buffer_migrate_page,
};

static struct address_space_operations ext3_journalled_aops = {
Index: linux-2.6.14-mm1/fs/ext2/inode.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/ext2/inode.c 2005-11-07 11:48:07.000000000 -0800
+++ linux-2.6.14-mm1/fs/ext2/inode.c 2005-11-08 10:18:51.000000000 -0800
@@ -706,6 +706,7 @@ struct address_space_operations ext2_aop
.bmap = ext2_bmap,
.direct_IO = ext2_direct_IO,
.writepages = ext2_writepages,
+ .migrate_page = buffer_migrate_page,
};

struct address_space_operations ext2_aops_xip = {
@@ -723,6 +724,7 @@ struct address_space_operations ext2_nob
.bmap = ext2_bmap,
.direct_IO = ext2_direct_IO,
.writepages = ext2_writepages,
+ .migrate_page = buffer_migrate_page,
};

/*
Index: linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/xfs/linux-2.6/xfs_buf.c 2005-11-07 11:48:07.000000000 -0800
+++ linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_buf.c 2005-11-08 10:18:51.000000000 -0800
@@ -1568,6 +1568,7 @@ xfs_mapping_buftarg(
struct address_space *mapping;
static struct address_space_operations mapping_aops = {
.sync_page = block_sync_page,
+ .migrate_page = fail_migrate_page,
};

inode = new_inode(bdev->bd_inode->i_sb);
Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-08 10:16:58.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-08 10:19:30.000000000 -0800
@@ -571,6 +571,15 @@ keep:
return reclaimed;
}

+/*
+ * Non migratable page
+ */
+int fail_migrate_page(struct page *newpage, struct page *page)
+{
+ return -EIO;
+}
+
+
#ifdef CONFIG_MIGRATION
/*
* swapout a single page
@@ -905,6 +914,11 @@ redo:
if (!mapping)
goto unlock_both;

+ if (mapping->a_ops->migrate_page) {
+ rc = mapping->a_ops->migrate_page(newpage, page);
+ goto unlock_both;
+ }
+
/*
* Trigger writeout if page is dirty
*/
Index: linux-2.6.14-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/swap.h 2005-11-08 10:16:58.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/swap.h 2005-11-08 10:18:51.000000000 -0800
@@ -186,6 +186,11 @@ extern int migrate_pages(struct list_hea
extern int migrate_page(struct page *, struct page *);
extern int migrate_page_remove_references(struct page *, struct page *, int);
extern void migrate_page_copy(struct page *, struct page *);
+extern int fail_migrate_page(struct page *, struct page *);
+#else
+/* Possible settings for the migrate_page() method in address_operations */
+#define migrate_page(a,b) NULL
+#define fail_migrate_page(a,b) NULL
#endif

#ifdef CONFIG_MMU

2005-11-08 21:04:58

by Christoph Lameter

Subject: [PATCH 5/8] Direct Migration V2: upgrade MPOL_MF_MOVE and sys_migrate_pages()

Modify policy layer to support direct page migration

- Add migrate_pages_to() allowing the migration of a list of pages to a
  specified node or to a vma with a specific allocation policy, in sets
  of MIGRATE_CHUNK_SIZE pages

- Modify do_migrate_pages() to do a staged move of pages from the
source nodes to the target nodes.

- Use comparisons instead of XOR in permission check.

V1->V2:
- Migrate a process's pages in chunks of MIGRATE_CHUNK_SIZE

Signed-off-by: Paul Jackson <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/mempolicy.c 2005-11-08 10:06:04.000000000 -0800
+++ linux-2.6.14-mm1/mm/mempolicy.c 2005-11-08 10:17:09.000000000 -0800
@@ -89,6 +89,10 @@

/* Internal MPOL_MF_xxx flags */
#define MPOL_MF_DISCONTIG_OK (1<<20) /* Skip checks for continuous vmas */
+#define MPOL_MF_INVERT (1<<21) /* Invert check for nodemask */
+
+/* The number of pages to migrate per call to migrate_pages() */
+#define MIGRATE_CHUNK_SIZE 256

static kmem_cache_t *policy_cache;
static kmem_cache_t *sn_cache;
@@ -258,7 +262,7 @@ static int check_pte_range(struct vm_are
continue;
}
nid = pfn_to_nid(pfn);
- if (!node_isset(nid, *nodes)) {
+ if (!node_isset(nid, *nodes) == !(flags & MPOL_MF_INVERT)) {
if (pagelist) {
struct page *page = pfn_to_page(pfn);

@@ -447,6 +451,65 @@ static int contextualize_policy(int mode
return mpol_check_policy(mode, nodes);
}

+/*
+ * Migrate the list 'pagelist' of pages to a certain destination.
+ *
+ * Specify destination with either non-NULL vma or dest_node >= 0
+ * Return the number of pages not migrated or error code
+ */
+static int migrate_pages_to(struct list_head *pagelist,
+ struct vm_area_struct *vma, int dest)
+{
+ LIST_HEAD(newlist);
+ LIST_HEAD(moved);
+ LIST_HEAD(failed);
+ int err = 0;
+ int nr_pages;
+ struct page *page;
+ struct list_head *p;
+
+redo:
+ nr_pages = 0;
+ list_for_each(p, pagelist) {
+ if (vma)
+ page = alloc_page_vma(GFP_HIGHUSER, vma,
+ vma->vm_start);
+ else
+ page = alloc_pages_node(dest, GFP_HIGHUSER, 0);
+
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+ list_add(&page->lru, &newlist);
+ nr_pages++;
+ if (nr_pages > MIGRATE_CHUNK_SIZE)
+ break;
+ }
+ err = migrate_pages(pagelist, &newlist, &moved, &failed);
+
+ putback_lru_pages(&moved); /* Call release pages instead ?? */
+
+ if (err >= 0 && list_empty(&newlist) && !list_empty(pagelist))
+ goto redo;
+out:
+ /* Return leftover allocated pages */
+ while (!list_empty(&newlist)) {
+ page = list_entry(newlist.next, struct page, lru);
+ list_del(&page->lru);
+ __free_page(page);
+ }
+ list_splice(&failed, pagelist);
+ if (err < 0)
+ return err;
+
+ /* Calculate number of leftover pages */
+ nr_pages = 0;
+ list_for_each(p, pagelist)
+ nr_pages++;
+ return nr_pages;
+}
+
long do_mbind(unsigned long start, unsigned long len,
unsigned long mode, nodemask_t *nmask, unsigned long flags)
{
@@ -497,14 +560,22 @@ long do_mbind(unsigned long start, unsig
down_write(&mm->mmap_sem);
vma = check_range(mm, start, end, nmask, flags,
(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) ? &pagelist : NULL);
+
err = PTR_ERR(vma);
if (!IS_ERR(vma)) {
+
err = mbind_range(vma, start, end, new);
- if (!list_empty(&pagelist))
- migrate_pages(&pagelist, NULL);
- if (!err && !list_empty(&pagelist) && (flags & MPOL_MF_STRICT))
+
+ if (!err) {
+ if (!list_empty(&pagelist))
+ migrate_pages_to(&pagelist, vma, -1);
+
+ if (!list_empty(&pagelist) && (flags & MPOL_MF_STRICT))
err = -EIO;
+ }
+
}
+
if (!list_empty(&pagelist))
putback_lru_pages(&pagelist);

@@ -633,10 +704,37 @@ long do_get_mempolicy(int *policy, nodem
}

/*
- * For now migrate_pages simply swaps out the pages from nodes that are in
- * the source set but not in the target set. In the future, we would
- * want a function that moves pages between the two nodesets in such
- * a way as to preserve the physical layout as much as possible.
+ * Migrate pages from one node to a target node.
+ * Returns error or the number of pages not migrated.
+ */
+int migrate_to_node(struct mm_struct *mm, int source,
+ int dest, int flags)
+{
+ nodemask_t nmask;
+ LIST_HEAD(pagelist);
+ int err = 0;
+
+ nodes_clear(nmask);
+ node_set(source, nmask);
+
+ check_range(mm, mm->mmap->vm_start, TASK_SIZE, &nmask,
+ flags | MPOL_MF_DISCONTIG_OK | MPOL_MF_INVERT,
+ &pagelist);
+
+ if (!list_empty(&pagelist)) {
+
+ err = migrate_pages_to(&pagelist, NULL, dest);
+
+ if (!list_empty(&pagelist))
+ putback_lru_pages(&pagelist);
+
+ }
+ return err;
+}
+
+/*
+ * Move pages between the two nodesets so as to preserve the physical
+ * layout as much as possible.
*
* Returns the number of page that could not be moved.
*/
@@ -644,22 +742,76 @@ int do_migrate_pages(struct mm_struct *m
const nodemask_t *from_nodes, const nodemask_t *to_nodes, int flags)
{
LIST_HEAD(pagelist);
- int count = 0;
- nodemask_t nodes;
-
- nodes_andnot(nodes, *from_nodes, *to_nodes);
- nodes_complement(nodes, nodes);
+ int err = 0;
+ nodemask_t tmp;
+ int busy = 0;

down_read(&mm->mmap_sem);
- check_range(mm, mm->mmap->vm_start, TASK_SIZE, &nodes,
- flags | MPOL_MF_DISCONTIG_OK, &pagelist);
- if (!list_empty(&pagelist)) {
- migrate_pages(&pagelist, NULL);
- if (!list_empty(&pagelist))
- count = putback_lru_pages(&pagelist);
+
+/* Find a 'source' bit set in 'tmp' whose corresponding 'dest'
+ * bit in 'to' is not also set in 'tmp'. Clear the found 'source'
+ * bit in 'tmp', and return that <source, dest> pair for migration.
+ * The pair of nodemasks 'to' and 'from' define the map.
+ *
+ * If no pair of bits is found that way, fallback to picking some
+ * pair of 'source' and 'dest' bits that are not the same. If the
+ * 'source' and 'dest' bits are the same, this represents a node
+ * that will be migrating to itself, so no pages need move.
+ *
+ * If no bits are left in 'tmp', or if all remaining bits left
+ * in 'tmp' correspond to the same bit in 'to', return false
+ * (nothing left to migrate).
+ *
+ * This lets us pick a pair of nodes to migrate between, such that
+ * if possible the dest node is not already occupied by some other
+ * source node, minimizing the risk of overloading the memory on a
+ * node that would happen if we migrated incoming memory to a node
+ * before migrating outgoing memory source that same node.
+ *
+ * A single scan of tmp is sufficient. As we go, we remember the
+ * most recent <s, d> pair that moved (s != d). If we find a pair
+ * that not only moved, but what's better, moved to an empty slot
+ * (d is not set in tmp), then we break out then, with that pair.
+ * Otherwise when we finish scanning from_tmp, we at least have the
+ * most recent <s, d> pair that moved. If we get all the way through
+ * the scan of tmp without finding any node that moved, much less
+ * moved to an empty node, then there is nothing left worth migrating.
+ */
+
+ tmp = *from_nodes;
+ while (!nodes_empty(tmp)) {
+ int s,d;
+ int source = -1;
+ int dest = 0;
+
+ for_each_node_mask(s, tmp) {
+
+ d = node_remap(s, *from_nodes, *to_nodes);
+ if (s == d)
+ continue;
+
+ source = s; /* Node moved. Memorize */
+ dest = d;
+
+ /* dest not in remaining from nodes? */
+ if (!node_isset(dest, tmp))
+ break;
+ }
+ if (source == -1)
+ break;
+
+ node_clear(source, tmp);
+ err = migrate_to_node(mm, source, dest, flags);
+ if (err > 0)
+ busy += err;
+ if (err < 0)
+ break;
}
+
up_read(&mm->mmap_sem);
- return count;
+ if (err < 0)
+ return err;
+ return busy;
}

/*

2005-11-08 21:05:51

by Christoph Lameter

Subject: [PATCH 8/8] Direct Migration V2: SWAP_REFERENCE for try_to_unmap()

Distinguish in try_to_unmap_one() between the case where the page is truly
unswappable and the case where the page was recently referenced.

The page migration code uses try_to_unmap_one and can avoid calling
try_to_unmap again if there was a persistent failure.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/include/linux/rmap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/rmap.h 2005-11-07 18:18:14.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/rmap.h 2005-11-07 18:48:11.000000000 -0800
@@ -120,6 +120,7 @@ unsigned long page_address_in_vma(struct
*/
#define SWAP_SUCCESS 0
#define SWAP_AGAIN 1
-#define SWAP_FAIL 2
+#define SWAP_REFERENCE 2
+#define SWAP_FAIL 3

#endif /* _LINUX_RMAP_H */
Index: linux-2.6.14-mm1/mm/rmap.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/rmap.c 2005-11-07 18:18:14.000000000 -0800
+++ linux-2.6.14-mm1/mm/rmap.c 2005-11-07 18:48:11.000000000 -0800
@@ -546,16 +546,20 @@ static int try_to_unmap_one(struct page

/*
* If the page is mlock()d, we cannot swap it out.
- * If it's recently referenced (perhaps page_referenced
- * skipped over this mm) then we should reactivate it.
- *
* Pages belonging to VM_RESERVED regions should not happen here.
*/
- if ((vma->vm_flags & (VM_LOCKED|VM_RESERVED)) ||
- ptep_clear_flush_young(vma, address, pte)) {
+ if (vma->vm_flags & (VM_LOCKED|VM_RESERVED)) {
ret = SWAP_FAIL;
goto out_unmap;
}
+ /*
+ * If the page is recently referenced (perhaps page_referenced
+ * skipped over this mm) then we may want to reactivate it.
+ */
+ if (ptep_clear_flush_young(vma, address, pte)) {
+ ret = SWAP_REFERENCE;
+ goto out_unmap;
+ }

/* Nuke the page table entry. */
flush_cache_page(vma, address, page_to_pfn(page));
@@ -706,7 +710,9 @@ static int try_to_unmap_anon(struct page

list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
ret = try_to_unmap_one(page, vma);
- if (ret == SWAP_FAIL || !page_mapped(page))
+ if (ret == SWAP_FAIL ||
+ ret == SWAP_REFERENCE ||
+ !page_mapped(page))
break;
}
spin_unlock(&anon_vma->lock);
@@ -737,7 +743,9 @@ static int try_to_unmap_file(struct page
spin_lock(&mapping->i_mmap_lock);
vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
ret = try_to_unmap_one(page, vma);
- if (ret == SWAP_FAIL || !page_mapped(page))
+ if (ret == SWAP_FAIL ||
+ ret == SWAP_REFERENCE ||
+ !page_mapped(page))
goto out;
}

@@ -822,7 +830,9 @@ out:
*
* SWAP_SUCCESS - we succeeded in removing all mappings
* SWAP_AGAIN - we missed a mapping, try again later
+ * SWAP_REFERENCE - the page was recently referenced
* SWAP_FAIL - the page is unswappable
+ *
*/
int try_to_unmap(struct page *page)
{
Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-07 18:38:59.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-07 18:51:25.000000000 -0800
@@ -471,6 +471,7 @@ static int shrink_list(struct list_head
*/
if (page_mapped(page) && mapping) {
switch (try_to_unmap(page)) {
+ case SWAP_REFERENCE:
case SWAP_FAIL:
goto activate_locked;
case SWAP_AGAIN:
@@ -689,8 +690,9 @@ int migrate_page_remove_references(struc
for(i = 0; i < 10 && page_mapped(page); i++) {
int rc = try_to_unmap(page);

- if (rc == SWAP_SUCCESS)
+ if (rc == SWAP_SUCCESS || rc == SWAP_FAIL)
break;
+
/*
* If there are other runnable processes then running
* them may make it possible to unmap the page

2005-11-08 21:06:26

by Christoph Lameter

Subject: [PATCH 7/8] Direct Migration V2: add_to_swap() with additional gfp_t parameter

Add gfp_mask to add_to_swap

The migration code calls the function with GFP_KERNEL
while the swap code calls it with GFP_ATOMIC, because the
migration code can safely ask the allocator to free some pages
when we're in a low memory situation, whereas the swap code,
running in the reclaim path itself, cannot.

Signed-off-by: Hirokazu Takahashi <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/swap.h 2005-11-07 18:35:40.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/swap.h 2005-11-07 18:38:59.000000000 -0800
@@ -242,7 +242,7 @@ extern int rw_swap_page_sync(int, swp_en
extern struct address_space swapper_space;
#define total_swapcache_pages swapper_space.nrpages
extern void show_swap_cache_info(void);
-extern int add_to_swap(struct page *);
+extern int add_to_swap(struct page *, gfp_t);
extern void __delete_from_swap_cache(struct page *);
extern void delete_from_swap_cache(struct page *);
extern int move_to_swap_cache(struct page *, swp_entry_t);
Index: linux-2.6.14-mm1/mm/swap_state.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/swap_state.c 2005-11-07 18:35:40.000000000 -0800
+++ linux-2.6.14-mm1/mm/swap_state.c 2005-11-07 18:38:59.000000000 -0800
@@ -143,7 +143,7 @@ void __delete_from_swap_cache(struct pag
* Allocate swap space for the page and add the page to the
* swap cache. Caller needs to hold the page lock.
*/
-int add_to_swap(struct page * page)
+int add_to_swap(struct page * page, gfp_t gfp_mask)
{
swp_entry_t entry;
int err;
@@ -171,7 +171,7 @@ int add_to_swap(struct page * page)
* Add it to the swap cache and mark it dirty
*/
err = __add_to_swap_cache(page, entry,
- GFP_ATOMIC|__GFP_NOMEMALLOC|__GFP_NOWARN);
+ gfp_mask|__GFP_NOMEMALLOC|__GFP_NOWARN);

switch (err) {
case 0: /* Success */
Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-07 18:35:40.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-07 18:38:59.000000000 -0800
@@ -456,7 +456,7 @@ static int shrink_list(struct list_head
if (PageAnon(page) && !PageSwapCache(page)) {
if (!sc->may_swap)
goto keep_locked;
- if (!add_to_swap(page))
+ if (!add_to_swap(page, GFP_ATOMIC))
goto activate_locked;
}
#endif /* CONFIG_SWAP */
@@ -889,7 +889,7 @@ redo:
* preserved.
*/
if (PageAnon(page) && !PageSwapCache(page)) {
- if (!add_to_swap(page)) {
+ if (!add_to_swap(page, GFP_KERNEL)) {
unlock_page(page);
list_move(&page->lru, failed);
nr_failed++;

2005-11-09 01:36:40

by Kamezawa Hiroyuki

Subject: Re: [PATCH 5/8] Direct Migration V2: upgrade MPOL_MF_MOVE and sys_migrate_pages()

Christoph Lameter wrote:
> + err = migrate_pages(pagelist, &newlist, &moved, &failed);
> +
> + putback_lru_pages(&moved); /* Call release pages instead ?? */
> +
> + if (err >= 0 && list_empty(&newlist) && !list_empty(pagelist))
> + goto redo;


Is list_empty(&newlist) needed here?
For checking the permanent failure case, list_empty(&failed) looks better.

-- Kame


2005-11-09 03:23:35

by Christoph Lameter

Subject: Re: [PATCH 5/8] Direct Migration V2: upgrade MPOL_MF_MOVE and sys_migrate_pages()

On Wed, 9 Nov 2005, KAMEZAWA Hiroyuki wrote:

> Christoph Lameter wrote:
> > + err = migrate_pages(pagelist, &newlist, &moved, &failed);
> > +
> > + putback_lru_pages(&moved); /* Call release pages instead ?? */
> > +
> > + if (err >= 0 && list_empty(&newlist) && !list_empty(pagelist))
> > + goto redo;
>
>
> Is list_empty(&newlist) needed here?
> For checking the permanent failure case, list_empty(&failed) looks better.

We only allocate 256 pages which are on the newlist. If the newlist is
empty but there are still pages that could be migrated
(!list_empty(pagelist)) then we need to allocate more pages and call
migrate_pages() again.

2005-11-09 04:00:39

by Kamezawa Hiroyuki

Subject: Re: [PATCH 5/8] Direct Migration V2: upgrade MPOL_MF_MOVE and sys_migrate_pages()

Christoph Lameter wrote:
> On Wed, 9 Nov 2005, KAMEZAWA Hiroyuki wrote:
>
>
>>Christoph Lameter wrote:
>>
>>>+ err = migrate_pages(pagelist, &newlist, &moved, &failed);
>>>+
>>>+ putback_lru_pages(&moved); /* Call release pages instead ?? */
>>>+
>>>+ if (err >= 0 && list_empty(&newlist) && !list_empty(pagelist))
>>>+ goto redo;
>>
>>
>>Is list_empty(&newlist) needed here?
>>For checking the permanent failure case, list_empty(&failed) looks better.
>
>
> We only allocate 256 pages which are on the newlist. If the newlist is
> empty but there are still pages that could be migrated
> (!list_empty(pagelist)) then we need to allocate more pages and call
> migrate_pages() again.
>
>
Ah, Okay.

confirmation:
1. Because mm->sem is held, there is no page-is-truncated/freed case.
2. Because pages in pagelist are removed from the zone's lru, kswapd and others
will not find and unmap them. There is no page-is-swapped-out-by-others case.

So if all target pages are successfuly remvoed from pagelist, newlist must be empty.
Right ?


-- Kame


2005-11-09 11:01:28

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH 6/8] Direct Migration V2: Avoid writeback / page_migrate() method

Christoph Lameter writes:
> Migrate a page with buffers without requiring writeback
>
> This introduces a new address space operation migrate_page() that
> may be used by a filesystem to implement its own version of page migration.
>
> A version is provided that migrates buffers attached to pages. Some
> filesystems (ext2, ext3, xfs) are modified to utilize this feature.
>
> The swapper address space operations are modified so that a regular
> migrate_pages() will occur for anonymous pages without writeback
> (migrate_pages forces every anonymous page to have a swap entry).
>
> V1->V2:
> - Fix CONFIG_MIGRATION handling
>
> Signed-off-by: Mike Kravetz <[email protected]>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> Index: linux-2.6.14-mm1/include/linux/fs.h
> ===================================================================
> --- linux-2.6.14-mm1.orig/include/linux/fs.h 2005-11-07 11:48:46.000000000 -0800
> +++ linux-2.6.14-mm1/include/linux/fs.h 2005-11-08 10:18:51.000000000 -0800
> @@ -332,6 +332,8 @@ struct address_space_operations {
> loff_t offset, unsigned long nr_segs);
> struct page* (*get_xip_page)(struct address_space *, sector_t,
> int);
> + /* migrate the contents of a page to the specified target */
> + int (*migrate_page) (struct page *, struct page *);
> };
>
> struct backing_dev_info;
> @@ -1679,6 +1681,12 @@ extern void simple_release_fs(struct vfs
>
> extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t);
>
> +#ifdef CONFIG_MIGRATION
> +extern int buffer_migrate_page(struct page *, struct page *);
> +#else
> +#define buffer_migrate_page(a,b) NULL
> +#endif

Depending on the CONFIG_MIGRATION, the type of buffer_migrate_page(a,b)
expansion is either int or void *, which doesn't look right.

Moreover below you have initializations

.migrate_page = buffer_migrate_page,

that wouldn't compile when CONFIG_MIGRATION is not defined (as macro
requires two arguments).

Nikita.

2005-11-09 16:50:46

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 5/8] Direct Migration V2: upgrade MPOL_MF_MOVE and sys_migrate_pages()

On Wed, 9 Nov 2005, KAMEZAWA Hiroyuki wrote:

> > We only allocate 256 pages at a time; these are placed on the newlist. If
> > the newlist is empty but there are still pages that could be migrated
> > (!list_empty(pagelist)), then we need to allocate more pages and call
> > migrate_pages() again.
> Ah, Okay.
>
> confirmation:
> 1. Because mm->sem is held, there is no page-is-truncated/freed case.

The page-is-truncated/freed case is handled by migrate_pages(). The page
is moved to the "moved" list and then returned to the LRU. The functions
putting a page back to the LRU will check the refcount and discard the
page.

> 2. Because pages in the pagelist are removed from the zone's LRU, kswapd and
> others will not find and unmap them. There is no
> page-is-swapped-out-by-others case.

Right.

> So if all target pages are successfully removed from the pagelist, the
> newlist must be empty.
> Right?

It could be empty, but there could also be new pages left over, because some
pages were freed before we could move them or because we were unable to
migrate a particular page and fell back to swap for it. We then need to free
the leftover pages.
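
A sketch of that cleanup (illustrative only, not the patch text): whatever
preallocated target pages remain on the newlist go back to the allocator:

	while (!list_empty(&newlist)) {
		struct page *page = list_entry(newlist.next, struct page, lru);

		list_del(&page->lru);
		__free_page(page);
	}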

2005-11-09 17:08:16

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 6/8] Direct Migration V2: Avoid writeback / page_migrate() method

On Wed, 9 Nov 2005, Nikita Danilov wrote:

> > +#ifdef CONFIG_MIGRATION
> > +extern int buffer_migrate_page(struct page *, struct page *);
> > +#else
> > +#define buffer_migrate_page(a,b) NULL
> > +#endif
>
> Depending on the CONFIG_MIGRATION, the type of buffer_migrate_page(a,b)
> expansion is either int or void *, which doesn't look right.

But it's right. You need to think of buffer_migrate_page as a pointer to
a function.

> Moreover below you have initializations
>
> .migrate_page = buffer_migrate_page,
>
> that wouldn't compile when CONFIG_MIGRATION is not defined (as macro
> requires two arguments).

NULL is a void * pointer which should work.

2005-11-09 17:19:55

by Nikita Danilov

[permalink] [raw]
Subject: Re: [PATCH 6/8] Direct Migration V2: Avoid writeback / page_migrate() method

Christoph Lameter writes:
> On Wed, 9 Nov 2005, Nikita Danilov wrote:
>
> > > +#ifdef CONFIG_MIGRATION
> > > +extern int buffer_migrate_page(struct page *, struct page *);
> > > +#else
> > > +#define buffer_migrate_page(a,b) NULL
> > > +#endif
> >
> > Depending on the CONFIG_MIGRATION, the type of buffer_migrate_page(a,b)
> > expansion is either int or void *, which doesn't look right.
>
> But it's right. You need to think of buffer_migrate_page as a pointer to
> a function.

buffer_migrate_page is a pointer to function.

buffer_migrate_page(a, b) is a value of type int (or void *).

>
> > Moreover below you have initializations
> >
> > .migrate_page = buffer_migrate_page,
> >
> > that wouldn't compile when CONFIG_MIGRATION is not defined (as macro
> > requires two arguments).
>
> NULL is a void * pointer which should work.

$ cat > macro.c
#define buffer_migrate_page(a,b) NULL

int (*migrate_page) (void *, void *) = buffer_migrate_page;
^D
$ cc macro.c
macro.c:3: error: `buffer_migrate_page' undeclared here (not in a function)

When CONFIG_MIGRATION is not defined, buffer_migrate_page is a macro
taking _two_ arguments. The name of such a macro cannot be used without
an argument list.

Nikita.

2005-11-09 19:17:10

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 6/8] Direct Migration V2: Avoid writeback / page_migrate() method

On Wed, 9 Nov 2005, Nikita Danilov wrote:

> buffer_migrate_page is a pointer to function.
>
> buffer_migrate_page(a, b) is a value of type int (or void *).

Yup that is wrong. It must be

#define buffer_migrate_page NULL

While investigating that, I found a name clash and some issues with #ifdef
CONFIG_MIGRATION for direct migration. Sigh.
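
The argument-less form passes the same check (mirroring Nikita's session;
stddef.h supplies NULL, and -c is used since the test file has no main()):

$ cat > macro.c
#include <stddef.h>
#define buffer_migrate_page NULL

int (*migratepage) (void *, void *) = buffer_migrate_page;
^D
$ cc -c macro.c
$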

===

Fix UP compile

- Remove parameters from the macro definitions for the migration functions
  if CONFIG_MIGRATION is off.

- Avoid the name clash between the migrate_page macro and the
  address_space_operations field migrate_page by renaming the field to
  migratepage.

- Adjust #ifdef CONFIG_MIGRATION in vmscan.c to include
  fail_migrate_page (provided by a macro if !CONFIG_MIGRATION) and
  surround putback_lru_pages() with #ifdef CONFIG_MIGRATION.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.14-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/vmscan.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/mm/vmscan.c 2005-11-09 11:08:31.000000000 -0800
@@ -572,6 +572,7 @@ keep:
return reclaimed;
}

+#ifdef CONFIG_MIGRATION
/*
* Non migratable page
*/
@@ -580,8 +581,6 @@ int fail_migrate_page(struct page *newpa
return -EIO;
}

-
-#ifdef CONFIG_MIGRATION
/*
* swapout a single page
* page is locked upon entry, unlocked on exit
@@ -916,8 +916,8 @@ redo:
if (!mapping)
goto unlock_both;

- if (mapping->a_ops->migrate_page) {
- rc = mapping->a_ops->migrate_page(newpage, page);
+ if (mapping->a_ops->migratepage) {
+ rc = mapping->a_ops->migratepage(newpage, page);
goto unlock_both;
}

@@ -1142,6 +1142,7 @@ done:
pagevec_release(&pvec);
}

+#ifdef CONFIG_MIGRATION
/*
* Add isolated pages on the list back to the LRU
*
@@ -1159,6 +1160,7 @@ int putback_lru_pages(struct list_head *
}
return count;
}
+#endif

/*
* This moves pages from the active list to the inactive list.
Index: linux-2.6.14-mm1/include/linux/fs.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/fs.h 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/fs.h 2005-11-09 10:54:03.000000000 -0800
@@ -333,7 +333,7 @@ struct address_space_operations {
struct page* (*get_xip_page)(struct address_space *, sector_t,
int);
/* migrate the contents of a page to the specified target */
- int (*migrate_page) (struct page *, struct page *);
+ int (*migratepage) (struct page *, struct page *);
};

struct backing_dev_info;
@@ -1684,7 +1684,7 @@ extern ssize_t simple_read_from_buffer(v
#ifdef CONFIG_MIGRATION
extern int buffer_migrate_page(struct page *, struct page *);
#else
-#define buffer_migrate_page(a,b) NULL
+#define buffer_migrate_page NULL
#endif

extern int inode_change_ok(struct inode *, struct iattr *);
Index: linux-2.6.14-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.14-mm1.orig/include/linux/swap.h 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/include/linux/swap.h 2005-11-09 10:54:03.000000000 -0800
@@ -189,8 +189,8 @@ extern void migrate_page_copy(struct pag
extern int fail_migrate_page(struct page *, struct page *);
#else
/* Possible settings for the migrate_page() method in address_operations */
-#define migrate_page(a,b) NULL
-#define fail_migrate_page(a,b) NULL
+#define migrate_page NULL
+#define fail_migrate_page NULL
#endif

#ifdef CONFIG_MMU
Index: linux-2.6.14-mm1/mm/swap_state.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/swap_state.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/mm/swap_state.c 2005-11-09 10:54:03.000000000 -0800
@@ -26,7 +26,7 @@ static struct address_space_operations s
.writepage = swap_writepage,
.sync_page = block_sync_page,
.set_page_dirty = __set_page_dirty_nobuffers,
- .migrate_page = migrate_page,
+ .migratepage = migrate_page,
};

static struct backing_dev_info swap_backing_dev_info = {
Index: linux-2.6.14-mm1/fs/ext2/inode.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/ext2/inode.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/fs/ext2/inode.c 2005-11-09 10:54:03.000000000 -0800
@@ -706,7 +706,7 @@ struct address_space_operations ext2_aop
.bmap = ext2_bmap,
.direct_IO = ext2_direct_IO,
.writepages = ext2_writepages,
- .migrate_page = buffer_migrate_page,
+ .migratepage = buffer_migrate_page,
};

struct address_space_operations ext2_aops_xip = {
@@ -724,7 +724,7 @@ struct address_space_operations ext2_nob
.bmap = ext2_bmap,
.direct_IO = ext2_direct_IO,
.writepages = ext2_writepages,
- .migrate_page = buffer_migrate_page,
+ .migratepage = buffer_migrate_page,
};

/*
Index: linux-2.6.14-mm1/fs/ext3/inode.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/ext3/inode.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/fs/ext3/inode.c 2005-11-09 10:54:03.000000000 -0800
@@ -1562,7 +1562,7 @@ static struct address_space_operations e
.invalidatepage = ext3_invalidatepage,
.releasepage = ext3_releasepage,
.direct_IO = ext3_direct_IO,
- .migrate_page = buffer_migrate_page,
+ .migratepage = buffer_migrate_page,
};

static struct address_space_operations ext3_writeback_aops = {
@@ -1576,7 +1576,7 @@ static struct address_space_operations e
.invalidatepage = ext3_invalidatepage,
.releasepage = ext3_releasepage,
.direct_IO = ext3_direct_IO,
- .migrate_page = buffer_migrate_page,
+ .migratepage = buffer_migrate_page,
};

static struct address_space_operations ext3_journalled_aops = {
Index: linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/xfs/linux-2.6/xfs_aops.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_aops.c 2005-11-09 10:54:03.000000000 -0800
@@ -1348,5 +1348,5 @@ struct address_space_operations linvfs_a
.commit_write = generic_commit_write,
.bmap = linvfs_bmap,
.direct_IO = linvfs_direct_IO,
- .migrate_page = buffer_migrate_page,
+ .migratepage = buffer_migrate_page,
};
Index: linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.14-mm1.orig/fs/xfs/linux-2.6/xfs_buf.c 2005-11-09 10:54:03.000000000 -0800
+++ linux-2.6.14-mm1/fs/xfs/linux-2.6/xfs_buf.c 2005-11-09 10:59:39.000000000 -0800
@@ -1568,7 +1568,7 @@ xfs_mapping_buftarg(
struct address_space *mapping;
static struct address_space_operations mapping_aops = {
.sync_page = block_sync_page,
- .migrate_page = fail_migrate_page,
+ .migratepage = fail_migrate_page,
};

inode = new_inode(bdev->bd_inode->i_sb);