Reading /proc/meminfo is really slow, as it requires recomputing the
vmalloc data every time, which is a lot of work, when most (all?)
consumers of meminfo don't even care about those statistics.
Linus mentioned[*] that he wanted to remove these fields from this
file and break the ABI if no-one complains (and if someone does,
either do some caching here instead, or just print the vmalloc
fields as zero).
For people wanting the vmalloc stats, /proc/vmallocinfo should be more
interesting anyway.
As it turns out, Facebook also has workloads that were negatively
impacted by this (while not caring about the vmalloc info either).
The patch below removes all this unnecessary work.
[*] https://www.mail-archive.com/[email protected]/msg961636.html
Signed-off-by: Dave Jones <[email protected]>
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index d3ebf2e61853..63b990811ef3 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -11,7 +11,6 @@
#include <linux/swap.h>
#include <linux/vmstat.h>
#include <linux/atomic.h>
-#include <linux/vmalloc.h>
#ifdef CONFIG_CMA
#include <linux/cma.h>
#endif
@@ -27,7 +26,6 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
{
struct sysinfo i;
unsigned long committed;
- struct vmalloc_info vmi;
long cached;
long available;
unsigned long pagecache;
@@ -49,8 +47,6 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
if (cached < 0)
cached = 0;
- get_vmalloc_info(&vmi);
-
for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
pages[lru] = global_page_state(NR_LRU_BASE + lru);
@@ -132,9 +128,6 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
"WritebackTmp: %8lu kB\n"
"CommitLimit: %8lu kB\n"
"Committed_AS: %8lu kB\n"
- "VmallocTotal: %8lu kB\n"
- "VmallocUsed: %8lu kB\n"
- "VmallocChunk: %8lu kB\n"
#ifdef CONFIG_MEMORY_FAILURE
"HardwareCorrupted: %5lu kB\n"
#endif
@@ -189,10 +182,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
K(global_page_state(NR_BOUNCE)),
K(global_page_state(NR_WRITEBACK_TEMP)),
K(vm_commit_limit()),
- K(committed),
- (unsigned long)VMALLOC_TOTAL >> 10,
- vmi.used >> 10,
- vmi.largest_chunk >> 10
+ K(committed)
#ifdef CONFIG_MEMORY_FAILURE
, atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10)
#endif
On Mon, Nov 2, 2015 at 10:36 AM, Dave Jones <[email protected]> wrote:
> Reading /proc/meminfo is really slow, as it requires recomputing the
> vmalloc data every time, which is a lot of work, when most (all?)
> consumers of meminfo don't even care about those statistics.
Ahh. My version of this patch (which I actually committed yesterday,
since I remembered - will wonders never cease?) leaves the fields
around in the /proc/meminfo file, but just makes the values be zero.
It also removes the actual function to compute the data that nobody
uses any more.
I agree that we can eventually look at even removing the fields
entirely, but that's much more likely to break things. I can imagine
system tools that just root around for values, and break and complain
when they don't exist, even if all they do is report them (rather than
actually *use* them for anything).
I guess I should just push out my tree. I didn't want to keep people
from testing plain 4.3, so I didn't push out yesterday.
Can you test what is now (where "now" means "it might take a minute or
two to mirror out") in my git repo?
Linus
On Mon, Nov 02, 2015 at 11:29:15AM -0800, Linus Torvalds wrote:
> On Mon, Nov 2, 2015 at 10:36 AM, Dave Jones <[email protected]> wrote:
> > Reading /proc/meminfo is really slow, as it requires recomputing the
> > vmalloc data every time, which is a lot of work, when most (all?)
> > consumers of meminfo don't even care about those statistics.
>
> Ahh. My version of this patch (which I actually committed yesterday,
> since I remembered - will wonders never cease?) leaves the fields
> around in the /proc/meminfo file, but just makes the values be zero.
> It also removes the actual function to compute the data that nobody
> uses any more.
>
> I agree that we can eventually look at even removing the fields
> entirely, but that's much more likely to break things. I can imagine
> system tools that just root around for values, and break and complain
> when they don't exist, even if all they do is report them (rather than
> actually *use* them for anything).
>
> I guess I should just push out my tree. I didn't want to keep people
> from testing plain 4.3, so I didn't push out yesterday.
>
> Can you test what is now (where "now" means "it might take a minute or
> two to mirror out") in my git repo?
That looks like it'll do the job just as well, yeah, and I suppose it is
a touch more conservative than my "burn it all down" approach.
Dave
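
To illustrate the compatibility concern raised in the thread (tools that
"root around for values" and break when a field is missing), here is a
minimal userspace sketch of a parser that treats an absent field as a
normal condition rather than an error. The helper name meminfo_lookup()
and its return convention are hypothetical, made up for this example; it
is not a kernel or libc interface. It works on a text buffer so the same
code handles a file with the Vmalloc* fields zeroed or removed entirely.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up "key" (e.g. "MemTotal") in a buffer holding /proc/meminfo-style
 * text of the form "Key:   <number> kB\n".
 *
 * Returns 0 and stores the number in *out when the field is present,
 * or -1 when the field is absent or no number follows it.
 * Hypothetical helper for illustration only.
 */
static int meminfo_lookup(const char *text, const char *key,
                          unsigned long *out)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A match is the key at the start of a line, followed by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            if (sscanf(line + klen + 1, "%lu", out) == 1)
                return 0;
            return -1;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1; /* field not present: the caller decides what to do */
}
```

A caller would read /proc/meminfo into a buffer, call
meminfo_lookup(buf, "VmallocUsed", &val), and on -1 skip the field or
report it as unavailable instead of failing. A tool written this way keeps
working with either approach discussed above.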