2015-04-08 16:15:59

by Shawn Bohrer

Subject: HugePages_Rsvd leak

I've noticed on a number of my systems that after shutting down my
application, which uses huge pages, I'm left with some pages still
in HugePages_Rsvd. It is possible that I still have something using
huge pages that I'm not aware of but so far my attempts to find
anything using huge pages have failed. I've run some simple tests
using map_hugetlb.c from the kernel source and can see that pages that
have been reserved but not allocated still show up in
/proc/<pid>/smaps and /proc/<pid>/numa_maps. Are there any cases
where this is not true?

[root@dev106 ~]# grep HugePages /proc/meminfo
AnonHugePages: 241664 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 384
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@dev106 ~]# grep "KernelPageSize:.*2048" /proc/*/smaps
[root@dev106 ~]# grep "VmFlags:.*ht" /proc/*/smaps
[root@dev106 ~]# grep huge /proc/*/numa_maps
[root@dev106 ~]# grep Huge /proc/meminfo
AnonHugePages: 241664 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 384
HugePages_Surp: 0
Hugepagesize: 2048 kB

So here I have 384 pages reserved and I can't find anything that is
using them. This is on a machine running 3.14.33. I can possibly try
running a newer kernel if there is a belief that this has been fixed.
I'm also happy to provide more information or try some debug patches
if there are ideas on how to track this down. I'm not entirely sure
how hard this is to reproduce but nearly every machine I've looked at
is in this state so it must not be too hard.

Thanks,
Shawn


2015-04-08 19:29:14

by Davidlohr Bueso

Subject: Re: HugePages_Rsvd leak

On Wed, 2015-04-08 at 11:15 -0500, Shawn Bohrer wrote:
> AnonHugePages: 241664 kB
> HugePages_Total: 512
> HugePages_Free: 512
> HugePages_Rsvd: 384
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> So here I have 384 pages reserved and I can't find anything that is
> using them.

The output clearly shows all available hugepages are free. Why are you
assuming that reserved implies allocated/in use? This is not true,
please read one of the millions of docs out there -- you can start with:
https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

2015-04-08 20:19:33

by Shawn Bohrer

Subject: Re: HugePages_Rsvd leak

On Wed, Apr 08, 2015 at 12:29:03PM -0700, Davidlohr Bueso wrote:
> On Wed, 2015-04-08 at 11:15 -0500, Shawn Bohrer wrote:
> > AnonHugePages: 241664 kB
> > HugePages_Total: 512
> > HugePages_Free: 512
> > HugePages_Rsvd: 384
> > HugePages_Surp: 0
> > Hugepagesize: 2048 kB
> >
> > So here I have 384 pages reserved and I can't find anything that is
> > using them.
>
> The output clearly shows all available hugepages are free, Why are you
> assuming that reserved implies allocated/in use? This is not true,
> please read one of the millions of docs out there -- you can start with:
> https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

As that fine document states:

HugePages_Rsvd is short for "reserved," and is the number of huge pages for
which a commitment to allocate from the pool has been made,
but no allocation has yet been made. Reserved huge pages
guarantee that an application will be able to allocate a
huge page from the pool of huge pages at fault time.

Thus in my example above, while I have 512 pages free, 384 are reserved,
and therefore if a new application comes along it can only reserve/use
the remaining 128 pages.
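
The arithmetic above (free pages minus reserved pages) can be sketched as a small shell helper. This is only an illustration: the function name hugepages_unreserved and its optional file argument are made up here, and it assumes the field names shown in the /proc/meminfo output earlier in the thread.

```shell
# Print the number of huge pages still open to new reservations,
# computed as HugePages_Free minus HugePages_Rsvd.
# hugepages_unreserved is a hypothetical helper name; the optional
# argument lets it read a saved copy instead of /proc/meminfo.
hugepages_unreserved() {
    awk '/^HugePages_Free:/ { free = $2 }
         /^HugePages_Rsvd:/ { rsvd = $2 }
         END { print free - rsvd }' "${1:-/proc/meminfo}"
}

# Usage:
#   hugepages_unreserved              # live values from /proc/meminfo
#   hugepages_unreserved saved.txt    # values from a saved copy
```

With the numbers from dev106 (512 free, 384 reserved) this would print 128.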

For example:

[scratch]$ grep Huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

[scratch]$ cat map_hugetlb.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#define LENGTH (2UL*1024*1024)
#define PROTECTION (PROT_READ | PROT_WRITE)
#define ADDR (void *)(0x0UL)
#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)

int main(void)
{
	void *addr;

	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}

	getchar();

	munmap(addr, LENGTH);
	return 0;
}

[scratch]$ make map_hugetlb
cc map_hugetlb.c -o map_hugetlb

[scratch]$ ./map_hugetlb &
[1] 7359
[1]+ Stopped ./map_hugetlb

[scratch]$ grep Huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 1
HugePages_Surp: 0
Hugepagesize: 2048 kB

[scratch]$ ./map_hugetlb
mmap: Cannot allocate memory


As you can see, I still have 1 huge page free, but that one huge page is
reserved by PID 7359. If I then try to run a new map_hugetlb process,
the mmap fails because even though I have 1 page free, it is reserved.

Furthermore, we can find that 7359 has that page in the following ways:

[scratch]$ sudo grep "KernelPageSize:.*2048" /proc/*/smaps
/proc/7359/smaps:KernelPageSize: 2048 kB
[scratch]$ sudo grep "VmFlags:.*ht" /proc/*/smaps
/proc/7359/smaps:VmFlags: rd wr mr mw me de ht sd
[scratch]$ sudo grep -w huge /proc/*/numa_maps
/proc/7359/numa_maps:7f3233000000 default file=/anon_hugepage\040(deleted) huge

Which leads back to my original question. I have machines that have a
non-zero HugePages_Rsvd count but I cannot find any processes that
seem to have those pages reserved using the three methods shown above.
Is there some other way to identify which process has those pages
reserved? Or is there possibly a leak which is failing to decrement
the reserve count?
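
For repeated checks across many machines, the VmFlags search above could be wrapped in a helper that prints only the matching PIDs. This is a rough sketch, not an established tool: the name hugetlb_pids and the optional proc-root argument are invented for illustration, and it relies only on the "ht" flag seen in the smaps output above. Reading other users' smaps generally requires root.

```shell
# Print the PID of every process whose smaps contains a mapping with
# the "ht" (hugetlb) VmFlags entry.
# hugetlb_pids is a hypothetical helper name; the optional argument
# is the proc root to scan and defaults to /proc.
hugetlb_pids() {
    root="${1:-/proc}"
    for f in "$root"/[0-9]*/smaps; do
        # Skip processes that exited or whose smaps we cannot read.
        [ -r "$f" ] || continue
        if grep -Eq '^VmFlags:.* ht( |$)' "$f" 2>/dev/null; then
            pid=${f%/smaps}
            echo "${pid##*/}"
        fi
    done
}

# Usage (as root):
#   hugetlb_pids
```

An empty result alongside a non-zero HugePages_Rsvd is the situation described in this message: no live process appears to map the reserved pages.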

Thanks,
Shawn

2015-04-08 21:16:17

by Mike Kravetz

Subject: Re: HugePages_Rsvd leak

On 04/08/2015 09:15 AM, Shawn Bohrer wrote:
> I've noticed on a number of my systems that after shutting down my
> application that uses huge pages that I'm left with some pages still
> in HugePages_Rsvd. It is possible that I still have something using
> huge pages that I'm not aware of but so far my attempts to find
> anything using huge pages have failed. I've run some simple tests
> using map_hugetlb.c from the kernel source and can see that pages that
> have been reserved but not allocated still show up in
> /proc/<pid>/smaps and /proc/<pid>/numa_maps. Are there any cases
> where this is not true?

Just a quick question. Are you using hugetlb filesystem(s)?

If so, you might want to take a look at files residing in the
filesystem(s). As an experiment, I had a program do a simple
mmap() of a file in a hugetlb filesystem. The program just
created the mapping, and did not actually fault/allocate any
huge pages. The result was the reservation (HugePages_Rsvd)
of sufficient huge pages to cover the mapping. When the program
exited, the reservations remained. If I remove (unlink) the
file the reservations will be removed.
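
One way to look for such leftover files is to list every hugetlbfs mount and then the regular files under it. The sketch below is an assumption-laden illustration: the helper names hugetlbfs_mounts and hugetlbfs_files are hypothetical, and mount points containing spaces (escaped as \040 in /proc/mounts) are not handled.

```shell
# Print the mount point of every hugetlbfs filesystem listed in a
# mounts file (default /proc/mounts). Hypothetical helper name.
hugetlbfs_mounts() {
    awk '$3 == "hugetlbfs" { print $2 }' "${1:-/proc/mounts}"
}

# List regular files under each hugetlbfs mount. Per the experiment
# described above, such a file may keep reservations after the
# program that mapped it has exited. Hypothetical helper name.
hugetlbfs_files() {
    hugetlbfs_mounts "$@" | while IFS= read -r mnt; do
        find "$mnt" -xdev -type f 2>/dev/null
    done
}

# Usage (as root):
#   hugetlbfs_files
```

If this prints nothing, no files remain in any mounted hugetlbfs, so leftover files there would not explain the reserved pages.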

--
Mike Kravetz

2015-04-08 21:42:46

by Shawn Bohrer

Subject: Re: HugePages_Rsvd leak

On Wed, Apr 08, 2015 at 02:16:05PM -0700, Mike Kravetz wrote:
> On 04/08/2015 09:15 AM, Shawn Bohrer wrote:
> >I've noticed on a number of my systems that after shutting down my
> >application that uses huge pages that I'm left with some pages still
> >in HugePages_Rsvd. It is possible that I still have something using
> >huge pages that I'm not aware of but so far my attempts to find
> >anything using huge pages have failed. I've run some simple tests
> >using map_hugetlb.c from the kernel source and can see that pages that
> >have been reserved but not allocated still show up in
> >/proc/<pid>/smaps and /proc/<pid>/numa_maps. Are there any cases
> >where this is not true?
>
> Just a quick question. Are you using hugetlb filesystem(s)?

I can't say for sure that nothing is using hugetlbfs. It is mounted
but as far as I can tell on the affected system(s) it is empty.

[root@dev106 ~]# grep hugetlbfs /proc/mounts
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
[root@dev106 ~]# ls -al /dev/hugepages/
total 0
drwxr-xr-x 2 root root 0 Apr 8 16:22 .
drwxr-xr-x 16 root root 4360 Apr 8 03:53 ..
[root@dev106 ~]# lsof | grep hugepages

> If so, you might want to take a look at files residing in the
> filesystem(s). As an experiment, I had a program do a simple
> mmap() of a file in a hugetlb filesystem. The program just
> created the mapping, and did not actually fault/allocate any
> huge pages. The result was the reservation (HugePages_Rsvd)
> of sufficient huge pages to cover the mapping. When the program
> exited, the reservations remained. If I remove (unlink) the
> file the reservations will be removed.

That makes sense but I don't think it is the issue here.

Thanks,
Shawn