2020-03-16 22:25:02

by Yang Shi

Subject: [v2 PATCH 1/2] mm: swap: make page_evictable() inline

When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
10% regression with a couple of vm-scalability test cases
(lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't
see that much of a drop on my VM (32c-64g-2nodes). It might be caused by
the test configuration, which is 32c-256g with NUMA disabled and the
tests run in the root memcg, so the tests actually stress only one
inactive and one active lru. Such a setup is not very common in modern
production environments.

That commit made two major changes:
1. Call page_evictable()
2. Use smp_mb() to make the PG_lru bit set visible

It looks like they contribute most of the overhead. page_evictable() is
an out-of-line function with a full prologue and epilogue, and it used
to be called by the page reclaim path only. However, lru add is a very
hot path, so it is better to make it inline. It also calls
page_mapping(), which is not inlined either, but disassembly shows that
page_mapping() does no push/pop operations, and inlining it would not be
very straightforward.

In addition, smp_mb() is not necessary on x86, since SetPageLRU is an
atomic operation, which already implies a full memory barrier there; it
is replaced with smp_mb__after_atomic() in the following patch.

With the two fixes applied, the tests regain around 5% on that test
bench and return to normal on my VM. Since the test bench configuration
is not that common and I also saw around a 6% improvement on the latest
upstream, this seems good enough IMHO.

Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
mainline w/ inline fix
150MB 154MB

With this patch the throughput goes up by 2.67%. The data with
smp_mb__after_atomic() is shown in the following patch.

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Cc: Shakeel Butt <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
v2: * Addressed the comments from Willy.

include/linux/pagemap.h | 2 +-
include/linux/swap.h | 24 +++++++++++++++++++++++-
mm/vmscan.c | 23 -----------------------
3 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6..654ce76 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -70,7 +70,7 @@ static inline void mapping_clear_unevictable(struct address_space *mapping)
clear_bit(AS_UNEVICTABLE, &mapping->flags);
}

-static inline int mapping_unevictable(struct address_space *mapping)
+static inline bool mapping_unevictable(struct address_space *mapping)
{
if (mapping)
return test_bit(AS_UNEVICTABLE, &mapping->flags);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7a..d296c70 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -374,7 +374,29 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
#define node_reclaim_mode 0
#endif

-extern int page_evictable(struct page *page);
+/**
+ * page_evictable - test whether a page is evictable
+ * @page: the page to test
+ *
+ * Test whether page is evictable--i.e., should be placed on active/inactive
+ * lists vs unevictable list.
+ *
+ * Reasons page might not be evictable:
+ * (1) page's mapping marked unevictable
+ * (2) page is part of an mlocked VMA
+ *
+ */
+static inline bool page_evictable(struct page *page)
+{
+ bool ret;
+
+ /* Prevent address_space of inode and swap cache from being freed */
+ rcu_read_lock();
+ ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
+ rcu_read_unlock();
+ return ret;
+}
+
extern void check_move_unevictable_pages(struct pagevec *pvec);

extern int kswapd_run(int nid);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8763705..855c395 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
}
#endif

-/*
- * page_evictable - test whether a page is evictable
- * @page: the page to test
- *
- * Test whether page is evictable--i.e., should be placed on active/inactive
- * lists vs unevictable list.
- *
- * Reasons page might not be evictable:
- * (1) page's mapping marked unevictable
- * (2) page is part of an mlocked VMA
- *
- */
-int page_evictable(struct page *page)
-{
- int ret;
-
- /* Prevent address_space of inode and swap cache from being freed */
- rcu_read_lock();
- ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
- rcu_read_unlock();
- return ret;
-}
-
/**
* check_move_unevictable_pages - check pages for evictability and move to
* appropriate zone lru list
--
1.8.3.1


2020-03-16 22:25:14

by Yang Shi

Subject: [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set

A memory barrier is needed after setting the LRU bit, but smp_mb() is
too strong. Some architectures, e.g. x86, imply a memory barrier with
atomic operations, so replacing it with smp_mb__after_atomic() is
better; it is a no-op on strongly ordered machines and a full memory
barrier on others. With this change the vm-scalability cases perform
better on x86; I saw a total 6% improvement with this patch plus the
previous inline fix.

The test data (lru-file-readtwice throughput) against v5.6-rc4:
mainline w/ inline fix w/ both (adding this)
150MB 154MB 159MB

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Cc: Shakeel Butt <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
v2: * Addressed the comment from Vlastimil and added his Ack.

mm/swap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index cf39d24..74ea08a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -931,7 +931,6 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,

VM_BUG_ON_PAGE(PageLRU(page), page);

- SetPageLRU(page);
/*
* Page becomes evictable in two ways:
* 1) Within LRU lock [munlock_vma_page() and __munlock_pagevec()].
@@ -958,7 +957,8 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
* looking at the same page) and the evictable page will be stranded
* in an unevictable LRU.
*/
- smp_mb();
+ SetPageLRU(page);
+ smp_mb__after_atomic();

if (page_evictable(page)) {
lru = page_lru(page);
--
1.8.3.1

2020-03-16 23:48:46

by Shakeel Butt

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline

On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <[email protected]> wrote:
>
> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
> 10% regression with a couple of vm-scalability test cases
> (lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't
> see that much of a drop on my VM (32c-64g-2nodes). It might be caused by
> the test configuration, which is 32c-256g with NUMA disabled and the
> tests run in the root memcg, so the tests actually stress only one
> inactive and one active lru. Such a setup is not very common in modern
> production environments.
>
> That commit made two major changes:
> 1. Call page_evictable()
> 2. Use smp_mb() to make the PG_lru bit set visible
>
> It looks like they contribute most of the overhead. page_evictable() is
> an out-of-line function with a full prologue and epilogue, and it used
> to be called by the page reclaim path only. However, lru add is a very
> hot path, so it is better to make it inline. It also calls
> page_mapping(), which is not inlined either, but disassembly shows that
> page_mapping() does no push/pop operations, and inlining it would not be
> very straightforward.
>
> In addition, smp_mb() is not necessary on x86, since SetPageLRU is an
> atomic operation, which already implies a full memory barrier there; it
> is replaced with smp_mb__after_atomic() in the following patch.
>
> With the two fixes applied, the tests regain around 5% on that test
> bench and return to normal on my VM. Since the test bench configuration
> is not that common and I also saw around a 6% improvement on the latest
> upstream, this seems good enough IMHO.
>
> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
> mainline w/ inline fix
> 150MB 154MB
>
> With this patch the throughput goes up by 2.67%. The data with
> smp_mb__after_atomic() is shown in the following patch.
>
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>


So, I tested on a real machine, limiting the 'dd' to a single node and
reading a 100 GiB sparse file (less than a single node's memory). I ran
just a single instance so as not to cause lru lock contention. The cmd
I used is "dd if=file-100GiB of=/dev/null bs=4k". I ran the cmd 10 times
with drop_caches in between and measured the time it took.

Without patch: 56.64143 +- 0.672 sec

With patches: 56.10 +- 0.21 sec

Reviewed-and-Tested-by: Shakeel Butt <[email protected]>

2020-03-16 23:49:11

by Shakeel Butt

Subject: Re: [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set

On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <[email protected]> wrote:
>
> A memory barrier is needed after setting the LRU bit, but smp_mb() is
> too strong. Some architectures, e.g. x86, imply a memory barrier with
> atomic operations, so replacing it with smp_mb__after_atomic() is
> better; it is a no-op on strongly ordered machines and a full memory
> barrier on others. With this change the vm-scalability cases perform
> better on x86; I saw a total 6% improvement with this patch plus the
> previous inline fix.
>
> The test data (lru-file-readtwice throughput) against v5.6-rc4:
> mainline w/ inline fix w/ both (adding this)
> 150MB 154MB 159MB
>
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <[email protected]>
> Acked-by: Vlastimil Babka <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>

Reviewed-and-Tested-by: Shakeel Butt <[email protected]>

2020-03-17 02:07:39

by Yang Shi

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline



On 3/16/20 4:46 PM, Shakeel Butt wrote:
> On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <[email protected]> wrote:
>> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
>> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
>> 10% regression with a couple of vm-scalability test cases
>> (lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't
>> see that much of a drop on my VM (32c-64g-2nodes). It might be caused by
>> the test configuration, which is 32c-256g with NUMA disabled and the
>> tests run in the root memcg, so the tests actually stress only one
>> inactive and one active lru. Such a setup is not very common in modern
>> production environments.
>>
>> That commit made two major changes:
>> 1. Call page_evictable()
>> 2. Use smp_mb() to make the PG_lru bit set visible
>>
>> It looks like they contribute most of the overhead. page_evictable() is
>> an out-of-line function with a full prologue and epilogue, and it used
>> to be called by the page reclaim path only. However, lru add is a very
>> hot path, so it is better to make it inline. It also calls
>> page_mapping(), which is not inlined either, but disassembly shows that
>> page_mapping() does no push/pop operations, and inlining it would not be
>> very straightforward.
>>
>> In addition, smp_mb() is not necessary on x86, since SetPageLRU is an
>> atomic operation, which already implies a full memory barrier there; it
>> is replaced with smp_mb__after_atomic() in the following patch.
>>
>> With the two fixes applied, the tests regain around 5% on that test
>> bench and return to normal on my VM. Since the test bench configuration
>> is not that common and I also saw around a 6% improvement on the latest
>> upstream, this seems good enough IMHO.
>>
>> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
>> mainline w/ inline fix
>> 150MB 154MB
>>
>> With this patch the throughput goes up by 2.67%. The data with
>> smp_mb__after_atomic() is shown in the following patch.
>>
>> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
>> Cc: Shakeel Butt <[email protected]>
>> Cc: Vlastimil Babka <[email protected]>
>> Cc: Matthew Wilcox <[email protected]>
>> Signed-off-by: Yang Shi <[email protected]>
>
> So, I tested on a real machine, limiting the 'dd' to a single node and
> reading a 100 GiB sparse file (less than a single node's memory). I ran
> just a single instance so as not to cause lru lock contention. The cmd
> I used is "dd if=file-100GiB of=/dev/null bs=4k". I ran the cmd 10 times
> with drop_caches in between and measured the time it took.
>
> Without patch: 56.64143 +- 0.672 sec
>
> With patches: 56.10 +- 0.21 sec
>
> Reviewed-and-Tested-by: Shakeel Butt <[email protected]>

Thanks Shakeel. It'd be better to add your test result to the commit log too.

2020-03-17 03:30:10

by kernel test robot

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline

Hi Yang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[also build test ERROR on linus/master v5.6-rc6 next-20200316]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
base: git://git.cmpxchg.org/linux-mmotm.git master
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.5.0-5) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from include/linux/suspend.h:5:0,
from arch/x86/kernel/asm-offsets.c:13:
include/linux/swap.h: In function 'page_evictable':
>> include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]
ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
^~~~~~~~~~~~~~~~~~~
mapping_deny_writable
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:99: arch/x86/kernel/asm-offsets.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1139: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:179: sub-make] Error 2
11 real 4 user 5 sys 89.51% cpu make prepare

vim +395 include/linux/swap.h

376
377 /**
378 * page_evictable - test whether a page is evictable
379 * @page: the page to test
380 *
381 * Test whether page is evictable--i.e., should be placed on active/inactive
382 * lists vs unevictable list.
383 *
384 * Reasons page might not be evictable:
385 * (1) page's mapping marked unevictable
386 * (2) page is part of an mlocked VMA
387 *
388 */
389 static inline bool page_evictable(struct page *page)
390 {
391 bool ret;
392
393 /* Prevent address_space of inode and swap cache from being freed */
394 rcu_read_lock();
> 395 ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
396 rcu_read_unlock();
397 return ret;
398 }
399

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2020-03-17 03:31:08

by kernel test robot

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline

Hi Yang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[also build test ERROR on linus/master v5.6-rc6 next-20200316]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
base: git://git.cmpxchg.org/linux-mmotm.git master
config: openrisc-simple_smp_defconfig (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=9.2.0 make.cross ARCH=openrisc

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from include/linux/suspend.h:5,
from init/do_mounts.c:7:
include/linux/swap.h: In function 'page_evictable':
include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]
395 | ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
| ^~~~~~~~~~~~~~~~~~~
| mapping_deny_writable
In file included from include/linux/mempolicy.h:16,
from include/linux/shmem_fs.h:7,
from init/do_mounts.c:21:
include/linux/pagemap.h: At top level:
>> include/linux/pagemap.h:73:20: error: conflicting types for 'mapping_unevictable'
73 | static inline bool mapping_unevictable(struct address_space *mapping)
| ^~~~~~~~~~~~~~~~~~~
In file included from include/linux/suspend.h:5,
from init/do_mounts.c:7:
include/linux/swap.h:395:9: note: previous implicit declaration of 'mapping_unevictable' was here
395 | ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
| ^~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors

vim +/mapping_unevictable +73 include/linux/pagemap.h

72
> 73 static inline bool mapping_unevictable(struct address_space *mapping)
74 {
75 if (mapping)
76 return test_bit(AS_UNEVICTABLE, &mapping->flags);
77 return !!mapping;
78 }
79

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2020-03-17 03:32:33

by Yang Shi

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline



On 3/16/20 8:00 PM, kbuild test robot wrote:
> Hi Yang,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on mmotm/master]
> [also build test ERROR on linus/master v5.6-rc6 next-20200316]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>
> url: https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
> base: git://git.cmpxchg.org/linux-mmotm.git master
> config: i386-tinyconfig (attached as .config)
> compiler: gcc-7 (Debian 7.5.0-5) 7.5.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <[email protected]>
>
> All errors (new ones prefixed by >>):
>
> In file included from include/linux/suspend.h:5:0,
> from arch/x86/kernel/asm-offsets.c:13:
> include/linux/swap.h: In function 'page_evictable':
>>> include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]

It looks like pagemap.h needs to be included. The patch below should
fix the build error:

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d296c70..e8b8bbe 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -12,6 +12,7 @@
 #include <linux/fs.h>
 #include <linux/atomic.h>
 #include <linux/page-flags.h>
+#include <linux/pagemap.h>
 #include <asm/page.h>
 
 struct notifier_block;


> ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
> ^~~~~~~~~~~~~~~~~~~
> mapping_deny_writable
> cc1: some warnings being treated as errors
> make[2]: *** [scripts/Makefile.build:99: arch/x86/kernel/asm-offsets.s] Error 1
> make[2]: Target '__build' not remade because of errors.
> make[1]: *** [Makefile:1139: prepare0] Error 2
> make[1]: Target 'prepare' not remade because of errors.
> make: *** [Makefile:179: sub-make] Error 2
> 11 real 4 user 5 sys 89.51% cpu make prepare
>
> vim +395 include/linux/swap.h
>
> 376
> 377 /**
> 378 * page_evictable - test whether a page is evictable
> 379 * @page: the page to test
> 380 *
> 381 * Test whether page is evictable--i.e., should be placed on active/inactive
> 382 * lists vs unevictable list.
> 383 *
> 384 * Reasons page might not be evictable:
> 385 * (1) page's mapping marked unevictable
> 386 * (2) page is part of an mlocked VMA
> 387 *
> 388 */
> 389 static inline bool page_evictable(struct page *page)
> 390 {
> 391 bool ret;
> 392
> 393 /* Prevent address_space of inode and swap cache from being freed */
> 394 rcu_read_lock();
> > 395 ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
> 396 rcu_read_unlock();
> 397 return ret;
> 398 }
> 399
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]

2020-03-17 09:01:03

by Vlastimil Babka

Subject: Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline

On 3/16/20 11:24 PM, Yang Shi wrote:
> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
> 10% regression with a couple of vm-scalability test cases
> (lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't
> see that much of a drop on my VM (32c-64g-2nodes). It might be caused by
> the test configuration, which is 32c-256g with NUMA disabled and the
> tests run in the root memcg, so the tests actually stress only one
> inactive and one active lru. Such a setup is not very common in modern
> production environments.
>
> That commit made two major changes:
> 1. Call page_evictable()
> 2. Use smp_mb() to make the PG_lru bit set visible
>
> It looks like they contribute most of the overhead. page_evictable() is
> an out-of-line function with a full prologue and epilogue, and it used
> to be called by the page reclaim path only. However, lru add is a very
> hot path, so it is better to make it inline. It also calls
> page_mapping(), which is not inlined either, but disassembly shows that
> page_mapping() does no push/pop operations, and inlining it would not be
> very straightforward.
>
> In addition, smp_mb() is not necessary on x86, since SetPageLRU is an
> atomic operation, which already implies a full memory barrier there; it
> is replaced with smp_mb__after_atomic() in the following patch.
>
> With the two fixes applied, the tests regain around 5% on that test
> bench and return to normal on my VM. Since the test bench configuration
> is not that common and I also saw around a 6% improvement on the latest
> upstream, this seems good enough IMHO.
>
> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
> mainline w/ inline fix
> 150MB 154MB
>
> With this patch the throughput goes up by 2.67%. The data with
> smp_mb__after_atomic() is shown in the following patch.
>
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

Thanks.