2019-01-04 17:26:10

by Ashish Mhetre

Subject: [PATCH] mm: Expose lazy vfree pages to control via sysctl

From: Hiroshi Doyu <[email protected]>

The purpose of lazy_max_pages is to gather freed virtual address space until
it reaches the lazy_max_pages limit and then purge it with a single TLB
flush, reducing the number of global TLB flushes.
The default value of lazy_max_pages is 32MB with one CPU and 96MB with
4 CPUs, i.e. with 4 cores, 96MB of vmalloc space is gathered before it is
purged with a TLB flush.
This feature has caused random latency issues. For example, we have seen
a kernel thread for a camera application spend 30ms in
__purge_vmap_area_lazy() with 4 CPUs.
So, create a "/proc/sys/vm/lazy_vfree_pages" file to control lazy vfree pages.
With this sysctl, the behaviour of lazy_vfree_pages can be controlled, and
systems which can't tolerate the latency can also disable it.
This patch proposes one way to control lazy_vfree_pages. Another possible
solution would be to configure lazy_vfree_pages through the kernel cmdline.

Signed-off-by: Hiroshi Doyu <[email protected]>
Signed-off-by: Ashish Mhetre <[email protected]>
---
kernel/sysctl.c | 8 ++++++++
mm/vmalloc.c | 5 ++++-
2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 3ae223f..49523efc 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -111,6 +111,7 @@ extern int pid_max;
 extern int pid_max_min, pid_max_max;
 extern int percpu_pagelist_fraction;
 extern int latencytop_enabled;
+extern int sysctl_lazy_vfree_pages;
 extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
 #ifndef CONFIG_MMU
 extern int sysctl_nr_trim_pages;
@@ -1251,6 +1252,13 @@ static struct ctl_table kern_table[] = {

 static struct ctl_table vm_table[] = {
 	{
+		.procname	= "lazy_vfree_pages",
+		.data		= &sysctl_lazy_vfree_pages,
+		.maxlen		= sizeof(sysctl_lazy_vfree_pages),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{
 		.procname	= "overcommit_memory",
 		.data		= &sysctl_overcommit_memory,
 		.maxlen		= sizeof(sysctl_overcommit_memory),
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 97d4b25..fa07966 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -619,13 +619,16 @@ static void unmap_vmap_area(struct vmap_area *va)
  * code, and it will be simple to change the scale factor if we find that it
  * becomes a problem on bigger systems.
  */
+
+int sysctl_lazy_vfree_pages = 32UL * 1024 * 1024 / PAGE_SIZE;
+
 static unsigned long lazy_max_pages(void)
 {
 	unsigned int log;

 	log = fls(num_online_cpus());

-	return log * (32UL * 1024 * 1024 / PAGE_SIZE);
+	return log * sysctl_lazy_vfree_pages;
 }

 static atomic_t vmap_lazy_nr = ATOMIC_INIT(0);
--
2.7.4
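
The 32MB and 96MB figures in the description follow from fls(num_online_cpus()) multiplied by the per-step page budget. A minimal user-space sketch of that arithmetic is given here, assuming a 4096-byte page size when the helpers are called; fls_sketch(), lazy_max_pages_sketch(), default_pages_per_step() and pages_to_mb() are hypothetical stand-ins written for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space stand-in for the kernel's fls(): 1-based position of the
 * highest set bit, with fls(0) == 0.
 */
static unsigned int fls_sketch(unsigned int x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Default per-step budget from the patch: 32MB expressed in pages. */
static unsigned long default_pages_per_step(unsigned long page_size)
{
	assert(page_size > 0);
	return 32UL * 1024 * 1024 / page_size;
}

/*
 * Mirrors the patched lazy_max_pages(): the limit scales with
 * fls(online CPUs) times the tunable per-step budget.
 */
static unsigned long lazy_max_pages_sketch(unsigned int online_cpus,
					   unsigned long pages_per_step)
{
	return fls_sketch(online_cpus) * pages_per_step;
}

/* Convert a page count back to megabytes for comparison with the text. */
static unsigned long pages_to_mb(unsigned long pages, unsigned long page_size)
{
	return pages * page_size / (1024 * 1024);
}
```

With 4096-byte pages, one CPU gives fls(1) = 1 and a 32MB limit, while four CPUs give fls(4) = 3 and a 96MB limit. Passing a per-step budget of 0 yields a limit of 0 pages, which is the "disable" case the description mentions.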



2019-01-04 19:05:02

by Matthew Wilcox

Subject: Re: [PATCH] mm: Expose lazy vfree pages to control via sysctl

On Fri, Jan 04, 2019 at 09:05:41PM +0530, Ashish Mhetre wrote:
> From: Hiroshi Doyu <[email protected]>
>
> The purpose of lazy_max_pages is to gather virtual address space till it
> reaches the lazy_max_pages limit and then purge with a TLB flush and hence
> reduce the number of global TLB flushes.
> The default value of lazy_max_pages with one CPU is 32MB and with 4 CPUs it
> is 96MB i.e. for 4 cores, 96MB of vmalloc space will be gathered before it
> is purged with a TLB flush.
> This feature has shown random latency issues. For example, we have seen
> that the kernel thread for some camera application spent 30ms in
> __purge_vmap_area_lazy() with 4 CPUs.

You're not the first to report something like this. Looking through the
kernel logs, I see:

commit 763b218ddfaf56761c19923beb7e16656f66ec62
Author: Joel Fernandes <[email protected]>
Date: Mon Dec 12 16:44:26 2016 -0800

mm: add preempt points into __purge_vmap_area_lazy()

commit f9e09977671b618aeb25ddc0d4c9a84d5b5cde9d
Author: Christoph Hellwig <[email protected]>
Date: Mon Dec 12 16:44:23 2016 -0800

mm: turn vmap_purge_lock into a mutex

commit 80c4bd7a5e4368b680e0aeb57050a1b06eb573d8
Author: Chris Wilson <[email protected]>
Date: Fri May 20 16:57:38 2016 -0700

mm/vmalloc: keep a separate lazy-free list

So the first thing I want to do is to confirm that you see this problem
on a modern kernel. We've had trouble with NVidia before reporting
historical problems as if they were new.

2019-01-04 19:06:00

by kernel test robot

Subject: Re: [PATCH] mm: Expose lazy vfree pages to control via sysctl

Hi Hiroshi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20 next-20190103]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Ashish-Mhetre/mm-Expose-lazy-vfree-pages-to-control-via-sysctl/20190105-003852
config: sh-rsk7269_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sh

All errors (new ones prefixed by >>):

>> kernel/sysctl.o:(.data+0x2d4): undefined reference to `sysctl_lazy_vfree_pages'

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
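
The undefined reference most likely arises because the patch defines sysctl_lazy_vfree_pages in mm/vmalloc.c, which is only built when CONFIG_MMU is set, while the new vm_table entry in kernel/sysctl.c is unconditional; the failing config here appears to be a no-MMU build. One common remedy is to guard the table entry with the same condition as the definition. The stand-alone sketch below models that pattern in user space. CONFIG_MMU_SKETCH, struct ctl_entry_sketch, vm_table_sketch and table_has_entry() are hypothetical names used only for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * CONFIG_MMU_SKETCH is a hypothetical stand-in for the kernel's CONFIG_MMU.
 * It is left undefined here to model a no-MMU build.
 */
/* #define CONFIG_MMU_SKETCH 1 */

struct ctl_entry_sketch {
	const char *procname;
	int *data;
};

#ifdef CONFIG_MMU_SKETCH
/* Models the definition the patch places in mm/vmalloc.c (32MB / 4096). */
int sysctl_lazy_vfree_pages = 8192;
#endif

static int sysctl_overcommit_memory_sketch;

static struct ctl_entry_sketch vm_table_sketch[] = {
#ifdef CONFIG_MMU_SKETCH
	/* Only referenced when the symbol is actually defined. */
	{ "lazy_vfree_pages", &sysctl_lazy_vfree_pages },
#endif
	{ "overcommit_memory", &sysctl_overcommit_memory_sketch },
	{ NULL, NULL },	/* sentinel terminating the table */
};

/* Returns 1 if the table contains an entry with the given name, else 0. */
static int table_has_entry(const char *name)
{
	const struct ctl_entry_sketch *e;

	assert(name != NULL);
	for (e = vm_table_sketch; e->procname != NULL; e++)
		if (strcmp(e->procname, name) == 0)
			return 1;
	return 0;
}
```

Because the entry and the definition share one guard, the sketch links with or without the macro; without it, the table simply omits lazy_vfree_pages.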


Attachments:
(No filename) (1.02 kB)
.config.gz (10.57 kB)

2019-01-04 19:07:08

by kernel test robot

Subject: Re: [PATCH] mm: Expose lazy vfree pages to control via sysctl

Hi Hiroshi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20 next-20190103]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Ashish-Mhetre/mm-Expose-lazy-vfree-pages-to-control-via-sysctl/20190105-003852
config: c6x-evmc6678_defconfig (attached as .config)
compiler: c6x-elf-gcc (GCC) 8.1.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=8.1.0 make.cross ARCH=c6x

All errors (new ones prefixed by >>):

>> kernel/sysctl.o:(.fardata+0x2d4): undefined reference to `sysctl_lazy_vfree_pages'

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (1.01 kB)
.config.gz (5.13 kB)

2019-01-06 08:44:23

by Ashish Mhetre

Subject: Re: [PATCH] mm: Expose lazy vfree pages to control via sysctl

Matthew, this issue was last reported in September 2018 on K4.9.
I verified that the optimization patches you mentioned were not
present in our downstream kernel when we hit the issue. I will check
whether the issue still persists on a new kernel with all these patches
and get back to you.

On 04/01/19 11:33 PM, Matthew Wilcox wrote:
> On Fri, Jan 04, 2019 at 09:05:41PM +0530, Ashish Mhetre wrote:
>> From: Hiroshi Doyu <[email protected]>
>>
>> The purpose of lazy_max_pages is to gather virtual address space till it
>> reaches the lazy_max_pages limit and then purge with a TLB flush and hence
>> reduce the number of global TLB flushes.
>> The default value of lazy_max_pages with one CPU is 32MB and with 4 CPUs it
>> is 96MB i.e. for 4 cores, 96MB of vmalloc space will be gathered before it
>> is purged with a TLB flush.
>> This feature has shown random latency issues. For example, we have seen
>> that the kernel thread for some camera application spent 30ms in
>> __purge_vmap_area_lazy() with 4 CPUs.
>
> You're not the first to report something like this. Looking through the
> kernel logs, I see:
>
> commit 763b218ddfaf56761c19923beb7e16656f66ec62
> Author: Joel Fernandes <[email protected]>
> Date: Mon Dec 12 16:44:26 2016 -0800
>
> mm: add preempt points into __purge_vmap_area_lazy()
>
> commit f9e09977671b618aeb25ddc0d4c9a84d5b5cde9d
> Author: Christoph Hellwig <[email protected]>
> Date: Mon Dec 12 16:44:23 2016 -0800
>
> mm: turn vmap_purge_lock into a mutex
>
> commit 80c4bd7a5e4368b680e0aeb57050a1b06eb573d8
> Author: Chris Wilson <[email protected]>
> Date: Fri May 20 16:57:38 2016 -0700
>
> mm/vmalloc: keep a separate lazy-free list
>
> So the first thing I want to do is to confirm that you see this problem
> on a modern kernel. We've had trouble with NVidia before reporting
> historical problems as if they were new.
>

2019-01-21 08:08:41

by Ashish Mhetre

Subject: Re: [PATCH] mm: Expose lazy vfree pages to control via sysctl

The issue is not seen on the new kernel, so this patch won't be needed. Thanks.

On 06/01/19 2:12 PM, Ashish Mhetre wrote:
> Matthew, this issue was last reported in September 2018 on K4.9.
> I verified that the optimization patches mentioned by you were not
> present in our downstream kernel when we faced the issue. I will check
> whether issue still persist on new kernel with all these patches and
> come back.
>
> On 04/01/19 11:33 PM, Matthew Wilcox wrote:
>> On Fri, Jan 04, 2019 at 09:05:41PM +0530, Ashish Mhetre wrote:
>>> From: Hiroshi Doyu <[email protected]>
>>>
>>> The purpose of lazy_max_pages is to gather virtual address space till it
>>> reaches the lazy_max_pages limit and then purge with a TLB flush and hence
>>> reduce the number of global TLB flushes.
>>> The default value of lazy_max_pages with one CPU is 32MB and with 4 CPUs it
>>> is 96MB i.e. for 4 cores, 96MB of vmalloc space will be gathered before it
>>> is purged with a TLB flush.
>>> This feature has shown random latency issues. For example, we have seen
>>> that the kernel thread for some camera application spent 30ms in
>>> __purge_vmap_area_lazy() with 4 CPUs.
>>
>> You're not the first to report something like this.  Looking through the
>> kernel logs, I see:
>>
>> commit 763b218ddfaf56761c19923beb7e16656f66ec62
>> Author: Joel Fernandes <[email protected]>
>> Date:   Mon Dec 12 16:44:26 2016 -0800
>>
>>      mm: add preempt points into __purge_vmap_area_lazy()
>>
>> commit f9e09977671b618aeb25ddc0d4c9a84d5b5cde9d
>> Author: Christoph Hellwig <[email protected]>
>> Date:   Mon Dec 12 16:44:23 2016 -0800
>>
>>      mm: turn vmap_purge_lock into a mutex
>>
>> commit 80c4bd7a5e4368b680e0aeb57050a1b06eb573d8
>> Author: Chris Wilson <[email protected]>
>> Date:   Fri May 20 16:57:38 2016 -0700
>>
>>      mm/vmalloc: keep a separate lazy-free list
>>
>> So the first thing I want to do is to confirm that you see this problem
>> on a modern kernel.  We've had trouble with NVidia before reporting
>> historical problems as if they were new.
>>