2019-10-18 21:12:40

by Song Liu

Subject: [PATCH v2 0/5] Fixes for THP in page cache

This set includes a few fixes for THP in page cache. They are based on
Linus's master branch.

Thanks,
Song

Changes v1 -> v2:
1. Return -EINVAL if the WARN() triggers. (Oleg Nesterov)
2. Include William Kucharski's fix to vmscan.c, which replaces half of
the original (3/4).

Kirill A. Shutemov (3):
proc/meminfo: fix output alignment
mm/thp: fix node page state in split_huge_page_to_list()
mm/thp: allow drop THP from page cache

Song Liu (1):
uprobe: only do FOLL_SPLIT_PMD for uprobe register

William Kucharski (1):
mm: Support removing arbitrary sized pages from mapping

fs/proc/meminfo.c | 4 ++--
kernel/events/uprobes.c | 13 +++++++++++--
mm/huge_memory.c | 9 +++++++--
mm/truncate.c | 12 ++++++++++++
mm/vmscan.c | 5 +----
5 files changed, 33 insertions(+), 10 deletions(-)

--
2.17.1


2019-10-18 21:12:48

by Song Liu

Subject: [PATCH v2 1/5] proc/meminfo: fix output alignment

From: "Kirill A. Shutemov" <[email protected]>

Add extra space for FileHugePages and FilePmdMapped, so the output is
aligned with other rows.

Fixes: 60fbf0ab5da1 ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Song Liu <[email protected]>
Signed-off-by: Song Liu <[email protected]>
---
fs/proc/meminfo.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index ac9247371871..8c1f1bb1a5ce 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -132,9 +132,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR);
show_val_kb(m, "ShmemPmdMapped: ",
global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR);
- show_val_kb(m, "FileHugePages: ",
+ show_val_kb(m, "FileHugePages: ",
global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR);
- show_val_kb(m, "FilePmdMapped: ",
+ show_val_kb(m, "FilePmdMapped: ",
global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR);
#endif

--
2.17.1

2019-10-18 21:12:55

by Song Liu

Subject: [PATCH v2 2/5] mm/thp: fix node page state in split_huge_page_to_list()

From: "Kirill A. Shutemov" <[email protected]>

Make sure split_huge_page_to_list() handles the state of shmem THP and
file THP properly.

Fixes: 60fbf0ab5da1 ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Song Liu <[email protected]>
Signed-off-by: Song Liu <[email protected]>
---
mm/huge_memory.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c5cb6dcd6c69..13cc93785006 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2789,8 +2789,13 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
ds_queue->split_queue_len--;
list_del(page_deferred_list(head));
}
- if (mapping)
- __dec_node_page_state(page, NR_SHMEM_THPS);
+ if (mapping) {
+ if (PageSwapBacked(page))
+ __dec_node_page_state(page, NR_SHMEM_THPS);
+ else
+ __dec_node_page_state(page, NR_FILE_THPS);
+ }
+
spin_unlock(&ds_queue->split_queue_lock);
__split_huge_page(page, list, end, flags);
if (PageSwapCache(head)) {
--
2.17.1

2019-10-18 21:14:00

by Song Liu

Subject: [PATCH v2 3/5] mm: Support removing arbitrary sized pages from mapping

From: William Kucharski <[email protected]>

__remove_mapping() assumes that pages are either base pages or
HPAGE_PMD_SIZE in size. Instead, ask the page what size it actually is.

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: William Kucharski <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Song Liu <[email protected]>
---
mm/vmscan.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c6659bb758a4..f870da1f4bb7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -932,10 +932,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
* Note that if SetPageDirty is always performed via set_page_dirty,
* and thus under the i_pages lock, then this ordering is not required.
*/
- if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
- refcount = 1 + HPAGE_PMD_NR;
- else
- refcount = 2;
+ refcount = 1 + compound_nr(page);
if (!page_ref_freeze(page, refcount))
goto cannot_free;
/* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */
--
2.17.1

2019-10-18 21:14:10

by Song Liu

Subject: [PATCH v2 4/5] mm/thp: allow drop THP from page cache

From: "Kirill A. Shutemov" <[email protected]>

Once a THP is added to the page cache, it cannot be dropped via
/proc/sys/vm/drop_caches. Fix this issue with proper handling in
invalidate_mapping_pages().

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Song Liu <[email protected]>
Signed-off-by: Song Liu <[email protected]>
---
mm/truncate.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/mm/truncate.c b/mm/truncate.c
index 8563339041f6..dd9ebc1da356 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -592,6 +592,16 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
unlock_page(page);
continue;
}
+
+ /* Take a pin outside pagevec */
+ get_page(page);
+
+ /*
+ * Drop extra pins before trying to invalidate
+ * the huge page.
+ */
+ pagevec_remove_exceptionals(&pvec);
+ pagevec_release(&pvec);
}

ret = invalidate_inode_page(page);
@@ -602,6 +612,8 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
*/
if (!ret)
deactivate_file_page(page);
+ if (PageTransHuge(page))
+ put_page(page);
count += ret;
}
pagevec_remove_exceptionals(&pvec);
--
2.17.1

2019-10-18 21:14:16

by Song Liu

Subject: [PATCH v2 5/5] uprobe: only do FOLL_SPLIT_PMD for uprobe register

Attaching a uprobe to the text section of a THP splits the PMD-mapped
page table entry into PTE-mapped entries. On uprobe detach, we would
like to regroup the PMD-mapped page table entry to regain the
performance benefit of THP.

However, the regrouping is broken for perf_event-based trace_uprobe,
because perf_event-based trace_uprobe calls uprobe_unregister twice on
close: first in TRACE_REG_PERF_CLOSE, then in TRACE_REG_PERF_UNREGISTER.
The second call would split the PMD-mapped page table entry again, which
is not the desired behavior.

Fix this by only using FOLL_SPLIT_PMD for the uprobe register case.

Add a WARN() to confirm that uprobe unregister never operates on a huge
page, and abort the operation when this WARN() triggers.

Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Cc: Kirill A. Shutemov <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Song Liu <[email protected]>
---
kernel/events/uprobes.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 94d38a39d72e..c74761004ee5 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -474,14 +474,17 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
struct vm_area_struct *vma;
int ret, is_register, ref_ctr_updated = 0;
bool orig_page_huge = false;
+ unsigned int gup_flags = FOLL_FORCE;

is_register = is_swbp_insn(&opcode);
uprobe = container_of(auprobe, struct uprobe, arch);

retry:
+ if (is_register)
+ gup_flags |= FOLL_SPLIT_PMD;
/* Read the page with vaddr into memory */
- ret = get_user_pages_remote(NULL, mm, vaddr, 1,
- FOLL_FORCE | FOLL_SPLIT_PMD, &old_page, &vma, NULL);
+ ret = get_user_pages_remote(NULL, mm, vaddr, 1, gup_flags,
+ &old_page, &vma, NULL);
if (ret <= 0)
return ret;

@@ -489,6 +492,12 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
if (ret <= 0)
goto put_old;

+ if (WARN(!is_register && PageCompound(old_page),
+ "uprobe unregister should never work on compound page\n")) {
+ ret = -EINVAL;
+ goto put_old;
+ }
+
/* We are going to replace instruction, update ref_ctr. */
if (!ref_ctr_updated && uprobe->ref_ctr_offset) {
ret = update_ref_ctr(uprobe, mm, is_register ? 1 : -1);
--
2.17.1

2019-10-18 22:07:52

by Yang Shi

Subject: Re: [PATCH v2 1/5] proc/meminfo: fix output alignment

On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
>
> From: "Kirill A. Shutemov" <[email protected]>
>
> Add extra space for FileHugePages and FilePmdMapped, so the output is
> aligned with other rows.
>
> Fixes: 60fbf0ab5da1 ("mm,thp: stats for file backed THP")
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Tested-by: Song Liu <[email protected]>
> Signed-off-by: Song Liu <[email protected]>

Acked-by: Yang Shi <[email protected]>

> ---
> fs/proc/meminfo.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index ac9247371871..8c1f1bb1a5ce 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -132,9 +132,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR);
> show_val_kb(m, "ShmemPmdMapped: ",
> global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR);
> - show_val_kb(m, "FileHugePages: ",
> + show_val_kb(m, "FileHugePages: ",
> global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR);
> - show_val_kb(m, "FilePmdMapped: ",
> + show_val_kb(m, "FilePmdMapped: ",
> global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR);
> #endif
>
> --
> 2.17.1
>
>

2019-10-18 22:08:30

by Yang Shi

Subject: Re: [PATCH v2 2/5] mm/thp: fix node page state in split_huge_page_to_list()

On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
>
> From: "Kirill A. Shutemov" <[email protected]>
>
> Make sure split_huge_page_to_list() handle the state of shmem THP and
> file THP properly.
>
> Fixes: 60fbf0ab5da1 ("mm,thp: stats for file backed THP")
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Tested-by: Song Liu <[email protected]>
> Signed-off-by: Song Liu <[email protected]>

Acked-by: Yang Shi <[email protected]>

> ---
> mm/huge_memory.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c5cb6dcd6c69..13cc93785006 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2789,8 +2789,13 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> ds_queue->split_queue_len--;
> list_del(page_deferred_list(head));
> }
> - if (mapping)
> - __dec_node_page_state(page, NR_SHMEM_THPS);
> + if (mapping) {
> + if (PageSwapBacked(page))
> + __dec_node_page_state(page, NR_SHMEM_THPS);
> + else
> + __dec_node_page_state(page, NR_FILE_THPS);
> + }
> +
> spin_unlock(&ds_queue->split_queue_lock);
> __split_huge_page(page, list, end, flags);
> if (PageSwapCache(head)) {
> --
> 2.17.1
>
>

2019-10-18 22:08:34

by Yang Shi

Subject: Re: [PATCH v2 3/5] mm: Support removing arbitrary sized pages from mapping

On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
>
> From: William Kucharski <[email protected]>
>
> __remove_mapping() assumes that pages can only be either base pages
> or HPAGE_PMD_SIZE. Ask the page what size it is.
>
> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> Signed-off-by: William Kucharski <[email protected]>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Signed-off-by: Song Liu <[email protected]>

Acked-by: Yang Shi <[email protected]>

> ---
> mm/vmscan.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c6659bb758a4..f870da1f4bb7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -932,10 +932,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
> * Note that if SetPageDirty is always performed via set_page_dirty,
> * and thus under the i_pages lock, then this ordering is not required.
> */
> - if (unlikely(PageTransHuge(page)) && PageSwapCache(page))
> - refcount = 1 + HPAGE_PMD_NR;
> - else
> - refcount = 2;
> + refcount = 1 + compound_nr(page);
> if (!page_ref_freeze(page, refcount))
> goto cannot_free;
> /* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */
> --
> 2.17.1
>
>

2019-10-18 22:21:16

by Yang Shi

Subject: Re: [PATCH v2 4/5] mm/thp: allow drop THP from page cache

On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
>
> From: "Kirill A. Shutemov" <[email protected]>
>
> Once a THP is added to the page cache, it cannot be dropped via
> /proc/sys/vm/drop_caches. Fix this issue with proper handling in
> invalidate_mapping_pages().
>
> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Tested-by: Song Liu <[email protected]>
> Signed-off-by: Song Liu <[email protected]>
> ---
> mm/truncate.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 8563339041f6..dd9ebc1da356 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -592,6 +592,16 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
> unlock_page(page);
> continue;
> }
> +
> + /* Take a pin outside pagevec */
> + get_page(page);
> +
> + /*
> + * Drop extra pins before trying to invalidate
> + * the huge page.
> + */
> + pagevec_remove_exceptionals(&pvec);
> + pagevec_release(&pvec);

Shall we skip the outer pagevec_remove_exceptionals() if it has already been done here?

> }
>
> ret = invalidate_inode_page(page);
> @@ -602,6 +612,8 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
> */
> if (!ret)
> deactivate_file_page(page);
> + if (PageTransHuge(page))
> + put_page(page);
> count += ret;
> }
> pagevec_remove_exceptionals(&pvec);
> --
> 2.17.1
>
>

2019-10-19 08:23:03

by Srikar Dronamraju

Subject: Re: [PATCH v2 5/5] uprobe: only do FOLL_SPLIT_PMD for uprobe register

* Song Liu <[email protected]> [2019-10-17 09:42:22]:

> Attaching a uprobe to the text section of a THP splits the PMD-mapped
> page table entry into PTE-mapped entries. On uprobe detach, we would
> like to regroup the PMD-mapped page table entry to regain the
> performance benefit of THP.
>
> However, the regrouping is broken for perf_event-based trace_uprobe,
> because perf_event-based trace_uprobe calls uprobe_unregister twice on
> close: first in TRACE_REG_PERF_CLOSE, then in TRACE_REG_PERF_UNREGISTER.
> The second call would split the PMD-mapped page table entry again, which
> is not the desired behavior.
>
> Fix this by only using FOLL_SPLIT_PMD for the uprobe register case.
>
> Add a WARN() to confirm that uprobe unregister never operates on a huge
> page, and abort the operation when this WARN() triggers.
>
> Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Srikar Dronamraju <[email protected]>
> Cc: Oleg Nesterov <[email protected]>
> Signed-off-by: Song Liu <[email protected]>
> ---

Looks good to me.

Reviewed-by: Srikar Dronamraju <[email protected]>

> kernel/events/uprobes.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 94d38a39d72e..c74761004ee5 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -474,14 +474,17 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
> struct vm_area_struct *vma;
> int ret, is_register, ref_ctr_updated = 0;
> bool orig_page_huge = false;
> + unsigned int gup_flags = FOLL_FORCE;
>
> is_register = is_swbp_insn(&opcode);
> uprobe = container_of(auprobe, struct uprobe, arch);
>
> retry:
> + if (is_register)
> + gup_flags |= FOLL_SPLIT_PMD;
> /* Read the page with vaddr into memory */
> - ret = get_user_pages_remote(NULL, mm, vaddr, 1,
> - FOLL_FORCE | FOLL_SPLIT_PMD, &old_page, &vma, NULL);
> + ret = get_user_pages_remote(NULL, mm, vaddr, 1, gup_flags,
> + &old_page, &vma, NULL);
> if (ret <= 0)
> return ret;
>
> @@ -489,6 +492,12 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
> if (ret <= 0)
> goto put_old;
>
> + if (WARN(!is_register && PageCompound(old_page),
> + "uprobe unregister should never work on compound page\n")) {
> + ret = -EINVAL;
> + goto put_old;
> + }
> +
> /* We are going to replace instruction, update ref_ctr. */
> if (!ref_ctr_updated && uprobe->ref_ctr_offset) {
> ret = update_ref_ctr(uprobe, mm, is_register ? 1 : -1);
> --
> 2.17.1
>

--
Thanks and Regards
Srikar Dronamraju

2019-10-19 08:27:43

by Kirill A. Shutemov

Subject: Re: [PATCH v2 4/5] mm/thp: allow drop THP from page cache

On Thu, Oct 17, 2019 at 02:46:38PM -0700, Yang Shi wrote:
> On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
> >
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > Once a THP is added to the page cache, it cannot be dropped via
> > /proc/sys/vm/drop_caches. Fix this issue with proper handling in
> > invalidate_mapping_pages().
> >
> > Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Tested-by: Song Liu <[email protected]>
> > Signed-off-by: Song Liu <[email protected]>
> > ---
> > mm/truncate.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/mm/truncate.c b/mm/truncate.c
> > index 8563339041f6..dd9ebc1da356 100644
> > --- a/mm/truncate.c
> > +++ b/mm/truncate.c
> > @@ -592,6 +592,16 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
> > unlock_page(page);
> > continue;
> > }
> > +
> > + /* Take a pin outside pagevec */
> > + get_page(page);
> > +
> > + /*
> > + * Drop extra pins before trying to invalidate
> > + * the huge page.
> > + */
> > + pagevec_remove_exceptionals(&pvec);
> > + pagevec_release(&pvec);
>
> Shall we skip the outer pagevec_remove_exceptionals() if it has already been done here?

It will be NOP and skipping would complicate the code.

--
Kirill A. Shutemov

2019-10-19 09:20:38

by Yang Shi

Subject: Re: [PATCH v2 4/5] mm/thp: allow drop THP from page cache

On Fri, Oct 18, 2019 at 6:32 AM Kirill A. Shutemov <[email protected]> wrote:
>
> On Thu, Oct 17, 2019 at 02:46:38PM -0700, Yang Shi wrote:
> > On Thu, Oct 17, 2019 at 9:42 AM Song Liu <[email protected]> wrote:
> > >
> > > From: "Kirill A. Shutemov" <[email protected]>
> > >
> > > Once a THP is added to the page cache, it cannot be dropped via
> > > /proc/sys/vm/drop_caches. Fix this issue with proper handling in
> > > invalidate_mapping_pages().
> > >
> > > Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> > > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > > Tested-by: Song Liu <[email protected]>
> > > Signed-off-by: Song Liu <[email protected]>
> > > ---
> > > mm/truncate.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/mm/truncate.c b/mm/truncate.c
> > > index 8563339041f6..dd9ebc1da356 100644
> > > --- a/mm/truncate.c
> > > +++ b/mm/truncate.c
> > > @@ -592,6 +592,16 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
> > > unlock_page(page);
> > > continue;
> > > }
> > > +
> > > + /* Take a pin outside pagevec */
> > > + get_page(page);
> > > +
> > > + /*
> > > + * Drop extra pins before trying to invalidate
> > > + * the huge page.
> > > + */
> > > + pagevec_remove_exceptionals(&pvec);
> > > + pagevec_release(&pvec);
> >
> > Shall we skip the outer pagevec_remove_exceptionals() if it has already been done here?
>
> It will be NOP and skipping would complicate the code.

Yes, it would be. Anyway, it looks ok too.

Acked-by: Yang Shi <[email protected]>

>
> --
> Kirill A. Shutemov