2011-03-07 13:07:16

by Petr Holasek

Subject: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

Make /proc/meminfo show data for all hugepage sizes in use on the
system, not only for the default hugepage size.

Signed-off-by: Petr Holasek <[email protected]>
---
mm/hugetlb.c | 26 ++++++++++++++------------
1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bb0b7c1..7919849 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1963,18 +1963,20 @@ out:

 void hugetlb_report_meminfo(struct seq_file *m)
 {
-	struct hstate *h = &default_hstate;
-	seq_printf(m,
-			"HugePages_Total: %5lu\n"
-			"HugePages_Free: %5lu\n"
-			"HugePages_Rsvd: %5lu\n"
-			"HugePages_Surp: %5lu\n"
-			"Hugepagesize: %8lu kB\n",
-			h->nr_huge_pages,
-			h->free_huge_pages,
-			h->resv_huge_pages,
-			h->surplus_huge_pages,
-			1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
+	struct hstate *h;
+
+	for_each_hstate(h)
+		seq_printf(m,
+				"HugePages_Total: %5lu\n"
+				"HugePages_Free: %5lu\n"
+				"HugePages_Rsvd: %5lu\n"
+				"HugePages_Surp: %5lu\n"
+				"Hugepagesize: %8lu kB\n",
+				h->nr_huge_pages,
+				h->free_huge_pages,
+				h->resv_huge_pages,
+				h->surplus_huge_pages,
+				1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
 }

 int hugetlb_report_node_meminfo(int nid, char *buf)
--
1.7.1
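
For readers following the arithmetic: the last seq_printf argument in the
patch, 1UL << (huge_page_order(h) + PAGE_SHIFT - 10), turns a page order into
a size in kB. The sketch below reproduces that shift in userspace. It assumes
4 kB base pages (a shift of 12, as on x86-64); EXAMPLE_PAGE_SHIFT and
hugepage_size_kb() are names made up for this illustration, not kernel symbols.

```c
#include <assert.h>

/* Assumption for illustration: 4 kB base pages, i.e. a page shift of 12. */
#define EXAMPLE_PAGE_SHIFT 12

/* Size in kB of a huge page made of 2^order base pages. */
static unsigned long hugepage_size_kb(unsigned int order)
{
	/* Refuse shifts that would exceed the width of unsigned long. */
	assert(order + EXAMPLE_PAGE_SHIFT - 10 < sizeof(unsigned long) * 8);

	/* Bytes would be 1UL << (order + shift); subtracting 10 gives kB. */
	return 1UL << (order + EXAMPLE_PAGE_SHIFT - 10);
}
```

Under that assumption, order 9 gives 2048 kB and order 18 gives 1048576 kB.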


2011-03-07 19:47:04

by Dave Hansen

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> + for_each_hstate(h)
> + seq_printf(m,
> + "HugePages_Total: %5lu\n"
> + "HugePages_Free: %5lu\n"
> + "HugePages_Rsvd: %5lu\n"
> + "HugePages_Surp: %5lu\n"
> + "Hugepagesize: %8lu kB\n",
> + h->nr_huge_pages,
> + h->free_huge_pages,
> + h->resv_huge_pages,
> + h->surplus_huge_pages,
> + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> }

It sounds like now we'll get a meminfo that looks like:

...
AnonHugePages:    491520 kB
HugePages_Total:       5
HugePages_Free:        2
HugePages_Rsvd:        3
HugePages_Surp:        1
Hugepagesize:       2048 kB
HugePages_Total:       2
HugePages_Free:        1
HugePages_Rsvd:        1
HugePages_Surp:        1
Hugepagesize:    1048576 kB
DirectMap4k:       12160 kB
DirectMap2M:     2082816 kB
DirectMap1G:     2097152 kB

At best, that's a bit confusing. There aren't any other entries in
meminfo that occur more than once. Plus, this information is available
in the sysfs interface. Why isn't that sufficient?
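
To make the ambiguity concrete, here is a minimal sketch of a key lookup over
a meminfo-style buffer. It is not taken from any existing tool;
meminfo_lookup() and its take_last flag are hypothetical. It only shows that
once a key repeats, the answer depends on whether a parser keeps the first or
the last match.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up "key" (including its trailing colon) at the start of each line of
 * a meminfo-style buffer and store the number that follows it. Returns the
 * number of matching lines so a caller can notice repeated keys. With
 * take_last == 0 the first match is kept, otherwise the last one.
 */
static int meminfo_lookup(const char *buf, const char *key, int take_last,
			  unsigned long *out)
{
	size_t klen;
	int matches = 0;
	const char *line = buf;

	assert(buf && key && out);
	klen = strlen(key);

	while (*line) {
		unsigned long val;
		const char *eol = strchr(line, '\n');

		if (strncmp(line, key, klen) == 0 &&
		    sscanf(line + klen, "%lu", &val) == 1) {
			if (matches == 0 || take_last)
				*out = val;
			matches++;
		}
		if (!eol)
			break;
		line = eol + 1;
	}
	return matches;
}
```

Run against the sample output above, a first-match lookup of "Hugepagesize:"
yields 2048 and a last-match lookup yields 1048576, and nothing in the file
format says which of the two the reader was meant to get.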

Could we do something where we keep the default hpage_size looking like
it does now, but append the size explicitly for the new entries?

HugePages_Total(1G):   2
HugePages_Free(1G):    1
HugePages_Rsvd(1G):    1
HugePages_Surp(1G):    1

-- Dave

2011-03-07 20:13:20

by Eric B Munson

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, 07 Mar 2011, Dave Hansen wrote:

> On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > + for_each_hstate(h)
> > + seq_printf(m,
> > + "HugePages_Total: %5lu\n"
> > + "HugePages_Free: %5lu\n"
> > + "HugePages_Rsvd: %5lu\n"
> > + "HugePages_Surp: %5lu\n"
> > + "Hugepagesize: %8lu kB\n",
> > + h->nr_huge_pages,
> > + h->free_huge_pages,
> > + h->resv_huge_pages,
> > + h->surplus_huge_pages,
> > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > }
>
> It sounds like now we'll get a meminfo that looks like:
>
> ...
> AnonHugePages: 491520 kB
> HugePages_Total: 5
> HugePages_Free: 2
> HugePages_Rsvd: 3
> HugePages_Surp: 1
> Hugepagesize: 2048 kB
> HugePages_Total: 2
> HugePages_Free: 1
> HugePages_Rsvd: 1
> HugePages_Surp: 1
> Hugepagesize: 1048576 kB
> DirectMap4k: 12160 kB
> DirectMap2M: 2082816 kB
> DirectMap1G: 2097152 kB
>
> At best, that's a bit confusing. There aren't any other entries in
> meminfo that occur more than once. Plus, this information is available
> in the sysfs interface. Why isn't that sufficient?
>
> Could we do something where we keep the default hpage_size looking like
> it does now, but append the size explicitly for the new entries?
>
> HugePages_Total(1G): 2
> HugePages_Free(1G): 1
> HugePages_Rsvd(1G): 1
> HugePages_Surp(1G): 1
>
> -- Dave

I second that; this will help minimize changes to userspace tools that
currently read meminfo for huge page information.

Eric



2011-03-07 22:52:35

by Andrew Morton

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, 07 Mar 2011 11:46:54 -0800
Dave Hansen <[email protected]> wrote:

> On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > + for_each_hstate(h)
> > + seq_printf(m,
> > + "HugePages_Total: %5lu\n"
> > + "HugePages_Free: %5lu\n"
> > + "HugePages_Rsvd: %5lu\n"
> > + "HugePages_Surp: %5lu\n"
> > + "Hugepagesize: %8lu kB\n",
> > + h->nr_huge_pages,
> > + h->free_huge_pages,
> > + h->resv_huge_pages,
> > + h->surplus_huge_pages,
> > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > }
>
> It sounds like now we'll get a meminfo that looks like:
>
> ...
> AnonHugePages: 491520 kB
> HugePages_Total: 5
> HugePages_Free: 2
> HugePages_Rsvd: 3
> HugePages_Surp: 1
> Hugepagesize: 2048 kB
> HugePages_Total: 2
> HugePages_Free: 1
> HugePages_Rsvd: 1
> HugePages_Surp: 1
> Hugepagesize: 1048576 kB
> DirectMap4k: 12160 kB
> DirectMap2M: 2082816 kB
> DirectMap1G: 2097152 kB
>
> At best, that's a bit confusing. There aren't any other entries in
> meminfo that occur more than once. Plus, this information is available
> in the sysfs interface. Why isn't that sufficient?
>
> Could we do something where we keep the default hpage_size looking like
> it does now, but append the size explicitly for the new entries?
>
> HugePages_Total(1G): 2
> HugePages_Free(1G): 1
> HugePages_Rsvd(1G): 1
> HugePages_Surp(1G): 1
>

Let's not change the existing interface, please.

Adding new fields: OK.
Changing the way in which existing fields are calculated: OKish.
Renaming existing fields: not OK.

2011-03-07 23:16:54

by Naoya Horiguchi

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, Mar 07, 2011 at 02:51:49PM -0800, Andrew Morton wrote:
> On Mon, 07 Mar 2011 11:46:54 -0800
> Dave Hansen <[email protected]> wrote:
>
> > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > + for_each_hstate(h)
> > > + seq_printf(m,
> > > + "HugePages_Total: %5lu\n"
> > > + "HugePages_Free: %5lu\n"
> > > + "HugePages_Rsvd: %5lu\n"
> > > + "HugePages_Surp: %5lu\n"
> > > + "Hugepagesize: %8lu kB\n",
> > > + h->nr_huge_pages,
> > > + h->free_huge_pages,
> > > + h->resv_huge_pages,
> > > + h->surplus_huge_pages,
> > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > }
> >
> > It sounds like now we'll get a meminfo that looks like:
> >
> > ...
> > AnonHugePages: 491520 kB
> > HugePages_Total: 5
> > HugePages_Free: 2
> > HugePages_Rsvd: 3
> > HugePages_Surp: 1
> > Hugepagesize: 2048 kB
> > HugePages_Total: 2
> > HugePages_Free: 1
> > HugePages_Rsvd: 1
> > HugePages_Surp: 1
> > Hugepagesize: 1048576 kB
> > DirectMap4k: 12160 kB
> > DirectMap2M: 2082816 kB
> > DirectMap1G: 2097152 kB
> >
> > At best, that's a bit confusing. There aren't any other entries in
> > meminfo that occur more than once. Plus, this information is available
> > in the sysfs interface. Why isn't that sufficient?
> >
> > Could we do something where we keep the default hpage_size looking like
> > it does now, but append the size explicitly for the new entries?
> >
> > HugePages_Total(1G): 2
> > HugePages_Free(1G): 1
> > HugePages_Rsvd(1G): 1
> > HugePages_Surp(1G): 1
> >
>
> Let's not change the existing interface, please.
>
> Adding new fields: OK.
> Changing the way in which existing fields are calculated: OKish.
> Renaming existing fields: not OK.

How about lining up multiple values in each field like this?

HugePages_Total:       5       2
HugePages_Free:        2       1
HugePages_Rsvd:        3       1
HugePages_Surp:        1       1
Hugepagesize:       2048 1048576 kB
...

This doesn't change the field names and the impact for user space
is still small?

Thanks,
Naoya

2011-03-07 23:25:01

by Eric B Munson

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

2011/3/7 Naoya Horiguchi <[email protected]>:
>> >
>> > Could we do something where we keep the default hpage_size looking like
>> > it does now, but append the size explicitly for the new entries?
>> >
>> > HugePages_Total(1G): 2
>> > HugePages_Free(1G): 1
>> > HugePages_Rsvd(1G): 1
>> > HugePages_Surp(1G): 1
>> >
>>
>> Let's not change the existing interface, please.
>>
>> Adding new fields: OK.
>> Changing the way in which existing fields are calculated: OKish.
>> Renaming existing fields: not OK.
>
> How about lining up multiple values in each field like this?
>
> HugePages_Total: 5 2
> HugePages_Free: 2 1
> HugePages_Rsvd: 3 1
> HugePages_Surp: 1 1
> Hugepagesize: 2048 1048576 kB
> ...
>
> This doesn't change the field names and the impact for user space
> is still small?
>
> Thanks,
> Naoya
>

I don't like this either. Dave's suggestion impacts userspace the
least: code that looks for the default huge page size pool info will
still find it, but it won't match the sized entries. Your suggestion
means that I need to change how libhugetlbfs, for instance, parses
meminfo.
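
A sketch of why the two proposals differ for a strict parser follows. This is
not libhugetlbfs code; meminfo_field_strict() is a hypothetical helper that
assumes a parser which matches the exact field name followed by a colon and
expects a single value on the line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Strict lookup: "name" must be followed immediately by ':' and the rest of
 * the line must be one unsigned number with an optional "kB" unit.
 * Returns 0 on success, -1 if the field is missing or malformed.
 */
static int meminfo_field_strict(const char *buf, const char *name,
				unsigned long *out)
{
	size_t nlen;
	const char *line = buf;

	assert(buf && name && out);
	nlen = strlen(name);

	while (*line) {
		const char *eol = strchr(line, '\n');

		if (strncmp(line, name, nlen) == 0 && line[nlen] == ':') {
			char *end;
			unsigned long val = strtoul(line + nlen + 1, &end, 10);

			if (end == line + nlen + 1)
				return -1;	/* no digits after the colon */
			while (*end == ' ')
				end++;
			if (strncmp(end, "kB", 2) == 0)
				end += 2;
			if (*end != '\n' && *end != '\0')
				return -1;	/* trailing data, e.g. a second value */
			*out = val;
			return 0;
		}
		if (!eol)
			break;
		line = eol + 1;
	}
	return -1;
}
```

With suffixed names, a lookup of "HugePages_Total" still finds the default
pool and skips "HugePages_Total(1G)" because '(' is not ':'. With two values
on one line, the same lookup fails on the trailing value.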

2011-03-07 23:26:28

by Andrew Morton

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Tue, 8 Mar 2011 08:14:49 +0900
Naoya Horiguchi <[email protected]> wrote:

> On Mon, Mar 07, 2011 at 02:51:49PM -0800, Andrew Morton wrote:
> > On Mon, 07 Mar 2011 11:46:54 -0800
> > Dave Hansen <[email protected]> wrote:
> >
> > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > + for_each_hstate(h)
> > > > + seq_printf(m,
> > > > + "HugePages_Total: %5lu\n"
> > > > + "HugePages_Free: %5lu\n"
> > > > + "HugePages_Rsvd: %5lu\n"
> > > > + "HugePages_Surp: %5lu\n"
> > > > + "Hugepagesize: %8lu kB\n",
> > > > + h->nr_huge_pages,
> > > > + h->free_huge_pages,
> > > > + h->resv_huge_pages,
> > > > + h->surplus_huge_pages,
> > > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > }
> > >
> > > It sounds like now we'll get a meminfo that looks like:
> > >
> > > ...
> > > AnonHugePages: 491520 kB
> > > HugePages_Total: 5
> > > HugePages_Free: 2
> > > HugePages_Rsvd: 3
> > > HugePages_Surp: 1
> > > Hugepagesize: 2048 kB
> > > HugePages_Total: 2
> > > HugePages_Free: 1
> > > HugePages_Rsvd: 1
> > > HugePages_Surp: 1
> > > Hugepagesize: 1048576 kB
> > > DirectMap4k: 12160 kB
> > > DirectMap2M: 2082816 kB
> > > DirectMap1G: 2097152 kB
> > >
> > > At best, that's a bit confusing. There aren't any other entries in
> > > meminfo that occur more than once. Plus, this information is available
> > > in the sysfs interface. Why isn't that sufficient?
> > >
> > > Could we do something where we keep the default hpage_size looking like
> > > it does now, but append the size explicitly for the new entries?
> > >
> > > HugePages_Total(1G): 2
> > > HugePages_Free(1G): 1
> > > HugePages_Rsvd(1G): 1
> > > HugePages_Surp(1G): 1
> > >
> >
> > Let's not change the existing interface, please.
> >
> > Adding new fields: OK.
> > Changing the way in which existing fields are calculated: OKish.
> > Renaming existing fields: not OK.
>
> How about lining up multiple values in each field like this?
>
> HugePages_Total: 5 2
> HugePages_Free: 2 1
> HugePages_Rsvd: 3 1
> HugePages_Surp: 1 1
> Hugepagesize: 2048 1048576 kB
> ...
>
> This doesn't change the field names and the impact for user space
> is still small?

It might break some existing parsers, dunno.

It was a mistake to assume that all hugepages will have the same size
for all time, and we just have to live with that mistake.

I'd suggest that we leave meminfo alone, just ensuring that its output
makes some sense. Instead create a new interface which presents all
the required info in a sensible fashion and migrate userspace reporting
tools over to that interface. Just let the meminfo field die a slow
death.

It's tempting to remove the meminfo hugepage fields altogether - most
parsers _should_ be able to cope with a CONFIG_HUGETLB=n kernel. But
that's breakage as well - some applications may be using meminfo to
detect whether the kernel supports huge pages!
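
The kind of detection meant here might look like the sketch below. It is an
assumption about how such applications behave, not code from any of them;
meminfo_reports_hugepages() is a hypothetical name. It simply checks whether
any line of a meminfo-style buffer starts with "HugePages_Total:".

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if a meminfo-style buffer contains a line that starts with
 * "HugePages_Total:", 0 otherwise. Only the start of each line is checked.
 */
static int meminfo_reports_hugepages(const char *buf)
{
	static const char key[] = "HugePages_Total:";
	const char *line = buf;

	assert(buf != NULL);
	while (*line) {
		const char *eol;

		if (strncmp(line, key, sizeof(key) - 1) == 0)
			return 1;
		eol = strchr(line, '\n');
		if (!eol)
			break;
		line = eol + 1;
	}
	return 0;
}
```

A program written this way would conclude "no huge page support" if the
fields were removed, even on a kernel that has it.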

2011-03-07 23:47:31

by David Rientjes

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, 7 Mar 2011, Andrew Morton wrote:

> > > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > > + for_each_hstate(h)
> > > > > + seq_printf(m,
> > > > > + "HugePages_Total: %5lu\n"
> > > > > + "HugePages_Free: %5lu\n"
> > > > > + "HugePages_Rsvd: %5lu\n"
> > > > > + "HugePages_Surp: %5lu\n"
> > > > > + "Hugepagesize: %8lu kB\n",
> > > > > + h->nr_huge_pages,
> > > > > + h->free_huge_pages,
> > > > > + h->resv_huge_pages,
> > > > > + h->surplus_huge_pages,
> > > > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > > }
> > > >
> > > > It sounds like now we'll get a meminfo that looks like:
> > > >
> > > > ...
> > > > AnonHugePages: 491520 kB
> > > > HugePages_Total: 5
> > > > HugePages_Free: 2
> > > > HugePages_Rsvd: 3
> > > > HugePages_Surp: 1
> > > > Hugepagesize: 2048 kB
> > > > HugePages_Total: 2
> > > > HugePages_Free: 1
> > > > HugePages_Rsvd: 1
> > > > HugePages_Surp: 1
> > > > Hugepagesize: 1048576 kB
> > > > DirectMap4k: 12160 kB
> > > > DirectMap2M: 2082816 kB
> > > > DirectMap1G: 2097152 kB
> > > >
> > > > At best, that's a bit confusing. There aren't any other entries in
> > > > meminfo that occur more than once. Plus, this information is available
> > > > in the sysfs interface. Why isn't that sufficient?
> > > >
> > > > Could we do something where we keep the default hpage_size looking like
> > > > it does now, but append the size explicitly for the new entries?
> > > >
> > > > HugePages_Total(1G): 2
> > > > HugePages_Free(1G): 1
> > > > HugePages_Rsvd(1G): 1
> > > > HugePages_Surp(1G): 1
> > > >
> > >
> > > Let's not change the existing interface, please.
> > >
> > > Adding new fields: OK.
> > > Changing the way in which existing fields are calculated: OKish.
> > > Renaming existing fields: not OK.
> >
> > How about lining up multiple values in each field like this?
> >
> > HugePages_Total: 5 2
> > HugePages_Free: 2 1
> > HugePages_Rsvd: 3 1
> > HugePages_Surp: 1 1
> > Hugepagesize: 2048 1048576 kB
> > ...
> >
> > This doesn't change the field names and the impact for user space
> > is still small?
>
> It might break some existing parsers, dunno.
>
> It was a mistake to assume that all hugepages will have the same size
> for all time, and we just have to live with that mistake.
>

I'm not sure it was a mistake: the kernel has a default hugepage size and
that's what the global /proc/sys/vm/nr_hugepages tunable uses, so it seems
appropriate that its statistics are exported in the global /proc/meminfo.

> I'd suggest that we leave meminfo alone, just ensuring that its output
> makes some sense. Instead create a new interface which presents all
> the required info in a sensible fashion and migrate userspace reporting
> tools over to that interface. Just let the meminfo field die a slow
> death.
>

(Adding Nishanth to the cc)

It's already there, all this data is available for all the configured
hugepage sizes via /sys/kernel/mm/hugepages/hugepages-<size>kB/ as
described by Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
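
As a rough userspace sketch of reading that interface: the helpers below build
the path of one per-size attribute and read a single counter from it. The
directory layout and the attribute names nr_hugepages, free_hugepages,
resv_hugepages and surplus_hugepages come from the ABI document named above;
hugepage_attr_path() and read_counter() are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Directory named in the message above; one subdirectory per page size. */
#define HUGEPAGES_SYSFS_DIR "/sys/kernel/mm/hugepages"

/*
 * Build the path of one per-size attribute, e.g. "free_hugepages" for the
 * 2048 kB pool. Returns 0 on success, -1 if the attribute name contains a
 * '/' or the buffer is too small.
 */
static int hugepage_attr_path(char *dst, size_t len, unsigned long size_kb,
			      const char *attr)
{
	int n;

	assert(dst && attr && len > 0);
	if (strchr(attr, '/'))
		return -1;
	n = snprintf(dst, len, "%s/hugepages-%lukB/%s",
		     HUGEPAGES_SYSFS_DIR, size_kb, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one unsigned counter from the file at "path". Returns -1 on failure. */
static int read_counter(const char *path, unsigned long *out)
{
	FILE *f;
	int ok;

	assert(path && out);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%lu", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would list the hugepages-<size>kB directories, then call these two
helpers once per size and attribute.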

It looks like Nishanth and others put quite a bit of effort into making as
stable an API as possible for this information.

2011-03-08 00:57:19

by Nishanth Aravamudan

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

Hi David,

On 07.03.2011 [15:47:23 -0800], David Rientjes wrote:
> On Mon, 7 Mar 2011, Andrew Morton wrote:
>
> > > > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > > > + for_each_hstate(h)
> > > > > > + seq_printf(m,
> > > > > > + "HugePages_Total: %5lu\n"
> > > > > > + "HugePages_Free: %5lu\n"
> > > > > > + "HugePages_Rsvd: %5lu\n"
> > > > > > + "HugePages_Surp: %5lu\n"
> > > > > > + "Hugepagesize: %8lu kB\n",
> > > > > > + h->nr_huge_pages,
> > > > > > + h->free_huge_pages,
> > > > > > + h->resv_huge_pages,
> > > > > > + h->surplus_huge_pages,
> > > > > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > > > }
> > > > >
> > > > > It sounds like now we'll get a meminfo that looks like:
> > > > >
> > > > > ...
> > > > > AnonHugePages: 491520 kB
> > > > > HugePages_Total: 5
> > > > > HugePages_Free: 2
> > > > > HugePages_Rsvd: 3
> > > > > HugePages_Surp: 1
> > > > > Hugepagesize: 2048 kB
> > > > > HugePages_Total: 2
> > > > > HugePages_Free: 1
> > > > > HugePages_Rsvd: 1
> > > > > HugePages_Surp: 1
> > > > > Hugepagesize: 1048576 kB
> > > > > DirectMap4k: 12160 kB
> > > > > DirectMap2M: 2082816 kB
> > > > > DirectMap1G: 2097152 kB
> > > > >
> > > > > At best, that's a bit confusing. There aren't any other entries in
> > > > > meminfo that occur more than once. Plus, this information is available
> > > > > in the sysfs interface. Why isn't that sufficient?
> > > > >
> > > > > Could we do something where we keep the default hpage_size looking like
> > > > > it does now, but append the size explicitly for the new entries?
> > > > >
> > > > > HugePages_Total(1G): 2
> > > > > HugePages_Free(1G): 1
> > > > > HugePages_Rsvd(1G): 1
> > > > > HugePages_Surp(1G): 1
> > > > >
> > > >
> > > > Let's not change the existing interface, please.
> > > >
> > > > Adding new fields: OK.
> > > > Changing the way in which existing fields are calculated: OKish.
> > > > Renaming existing fields: not OK.
> > >
> > > How about lining up multiple values in each field like this?
> > >
> > > HugePages_Total: 5 2
> > > HugePages_Free: 2 1
> > > HugePages_Rsvd: 3 1
> > > HugePages_Surp: 1 1
> > > Hugepagesize: 2048 1048576 kB
> > > ...
> > >
> > > This doesn't change the field names and the impact for user space
> > > is still small?
> >
> > It might break some existing parsers, dunno.
> >
> > It was a mistake to assume that all hugepages will have the same size
> > for all time, and we just have to live with that mistake.
> >
>
> I'm not sure it was a mistake: the kernel has a default hugepage size and
> that's what the global /proc/sys/vm/nr_hugepages tunable uses, so it seems
> appropriate that its statistics are exported in the global /proc/meminfo.

Yep, the intent was for meminfo to (continue to) document the default
hugepage size's usage, and for any other size's statistics to be
accessed by the appropriate sysfs entries.

> > I'd suggest that we leave meminfo alone, just ensuring that its output
> > makes some sense. Instead create a new interface which presents all
> > the required info in a sensible fashion and migrate userspace reporting
> > tools over to that interface. Just let the meminfo field die a slow
> > death.
> >
>
> (Adding Nishanth to the cc)
>
> It's already there, all this data is available for all the configured
> hugepage sizes via /sys/kernel/mm/hugepages/hugepages-<size>kB/ as
> described by Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
>
> It looks like Nishanth and others put quite a bit of effort into
> making as stable of an API as possible for this information.

I'm not sure if libhugetlbfs already has a tool for parsing the values
there (i.e., to give an end-user a quick'n'dirty snapshot of overall
current hugepage usage). Eric? If not, probably something worth having.
I believe we also have the per-node information in sysfs too, in case
that's relevant to tooling.

Thanks,
Nish

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

2011-03-08 01:28:03

by Andi Kleen

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, Mar 07, 2011 at 02:05:55PM +0100, Petr Holasek wrote:
> /proc/meminfo file shows data for all used sizes of hugepages
> on system, not only for default hugepage size.

When I wrote that, it was intentional to report only the
default page size here. The other page sizes are reported
in sysfs instead.

The reason was to avoid breaking any applications that
read /proc/meminfo today.

I suspect your patch will break them.

-Andi

2011-03-08 09:37:56

by Mel Gorman

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Mon, Mar 07, 2011 at 04:57:06PM -0800, Nishanth Aravamudan wrote:
> > > > > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > > > > + for_each_hstate(h)
> > > > > > > + seq_printf(m,
> > > > > > > + "HugePages_Total: %5lu\n"
> > > > > > > + "HugePages_Free: %5lu\n"
> > > > > > > + "HugePages_Rsvd: %5lu\n"
> > > > > > > + "HugePages_Surp: %5lu\n"
> > > > > > > + "Hugepagesize: %8lu kB\n",
> > > > > > > + h->nr_huge_pages,
> > > > > > > + h->free_huge_pages,
> > > > > > > + h->resv_huge_pages,
> > > > > > > + h->surplus_huge_pages,
> > > > > > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > > > > }
> > > > > >
> > > > > > It sounds like now we'll get a meminfo that looks like:
> > > > > >
> > > > > > ...
> > > > > > AnonHugePages: 491520 kB
> > > > > > HugePages_Total: 5
> > > > > > HugePages_Free: 2
> > > > > > HugePages_Rsvd: 3
> > > > > > HugePages_Surp: 1
> > > > > > Hugepagesize: 2048 kB
> > > > > > HugePages_Total: 2
> > > > > > HugePages_Free: 1
> > > > > > HugePages_Rsvd: 1
> > > > > > HugePages_Surp: 1
> > > > > > Hugepagesize: 1048576 kB
> > > > > > DirectMap4k: 12160 kB
> > > > > > DirectMap2M: 2082816 kB
> > > > > > DirectMap1G: 2097152 kB
> > > > > >
> > > > > > At best, that's a bit confusing. There aren't any other entries in
> > > > > > meminfo that occur more than once. Plus, this information is available
> > > > > > in the sysfs interface. Why isn't that sufficient?
> > > > > >
> > > > > > Could we do something where we keep the default hpage_size looking like
> > > > > > it does now, but append the size explicitly for the new entries?
> > > > > >
> > > > > > HugePages_Total(1G): 2
> > > > > > HugePages_Free(1G): 1
> > > > > > HugePages_Rsvd(1G): 1
> > > > > > HugePages_Surp(1G): 1
> > > > > >
> > > > >
> > > > > Let's not change the existing interface, please.
> > > > >
> > > > > Adding new fields: OK.
> > > > > Changing the way in which existing fields are calculated: OKish.
> > > > > Renaming existing fields: not OK.
> > > >
> > > > How about lining up multiple values in each field like this?
> > > >
> > > > HugePages_Total: 5 2
> > > > HugePages_Free: 2 1
> > > > HugePages_Rsvd: 3 1
> > > > HugePages_Surp: 1 1
> > > > Hugepagesize: 2048 1048576 kB
> > > > ...
> > > >
> > > > This doesn't change the field names and the impact for user space
> > > > is still small?
> > >
> > > It might break some existing parsers, dunno.
> > >
> > > It was a mistake to assume that all hugepages will have the same size
> > > for all time, and we just have to live with that mistake.
> > >
> >
> > I'm not sure it was a mistake: the kernel has a default hugepage size and
> > that's what the global /proc/sys/vm/nr_hugepages tunable uses, so it seems
> > appropriate that its statistics are exported in the global /proc/meminfo.
>
> Yep, the intent was for meminfo to (continue to) document the default
> hugepage size's usage, and for any other size's statistics to be
> accessed by the appropriate sysfs entries.
>

Agreed. The suggested changes to the interface here are very likely to
break libhugetlbfs.

> > > I'd suggest that we leave meminfo alone, just ensuring that its output
> > > makes some sense. Instead create a new interface which presents all
> > > the required info in a sensible fashion and migrate userspace reporting
> > > tools over to that interface. Just let the meminfo field die a slow
> > > death.
> > >
> >
> > (Adding Nishanth to the cc)
> >
> > It's already there, all this data is available for all the configured
> > hugepage sizes via /sys/kernel/mm/hugepages/hugepages-<size>kB/ as
> > described by Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
> >
> > It looks like Nishanth and others put quite a bit of effort into
> > making as stable of an API as possible for this information.
>
> I'm not sure if libhugetlbfs already has a tool for parsing the values
> there (i.e., to give an end-user a quick'n'dirty snapshot of overall
> current hugepage usage). Eric?

I'm not Eric, but it does. It's called hugeadm and here is an example of
its output:

hydra:~# hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152       16       16       16        *
1073741824        2        2        2

> If not, probably something worth having.
> I believe we also have the per-node information in sysfs too, in case
> that's relevant to tooling.
>

The kernel interfaces are sufficient at the moment for exporting all the
information. If hugeadm is providing insufficient information, I'd
prefer to see it enhanced rather than the sysfs or meminfo interfaces changed.

--
Mel Gorman
SUSE Labs

2011-03-08 11:22:09

by Petr Holasek

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On 03/08/2011 12:25 AM, Andrew Morton wrote:
> On Tue, 8 Mar 2011 08:14:49 +0900
> Naoya Horiguchi<[email protected]> wrote:
>
>> On Mon, Mar 07, 2011 at 02:51:49PM -0800, Andrew Morton wrote:
>>> On Mon, 07 Mar 2011 11:46:54 -0800
>>> Dave Hansen<[email protected]> wrote:
>>>
>>>> On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
>>>>> + for_each_hstate(h)
>>>>> + seq_printf(m,
>>>>> + "HugePages_Total: %5lu\n"
>>>>> + "HugePages_Free: %5lu\n"
>>>>> + "HugePages_Rsvd: %5lu\n"
>>>>> + "HugePages_Surp: %5lu\n"
>>>>> + "Hugepagesize: %8lu kB\n",
>>>>> + h->nr_huge_pages,
>>>>> + h->free_huge_pages,
>>>>> + h->resv_huge_pages,
>>>>> + h->surplus_huge_pages,
>>>>> + 1UL<< (huge_page_order(h) + PAGE_SHIFT - 10));
>>>>> }
>>>>
>>>> It sounds like now we'll get a meminfo that looks like:
>>>>
>>>> ...
>>>> AnonHugePages: 491520 kB
>>>> HugePages_Total: 5
>>>> HugePages_Free: 2
>>>> HugePages_Rsvd: 3
>>>> HugePages_Surp: 1
>>>> Hugepagesize: 2048 kB
>>>> HugePages_Total: 2
>>>> HugePages_Free: 1
>>>> HugePages_Rsvd: 1
>>>> HugePages_Surp: 1
>>>> Hugepagesize: 1048576 kB
>>>> DirectMap4k: 12160 kB
>>>> DirectMap2M: 2082816 kB
>>>> DirectMap1G: 2097152 kB
>>>>
>>>> At best, that's a bit confusing. There aren't any other entries in
>>>> meminfo that occur more than once. Plus, this information is available
>>>> in the sysfs interface. Why isn't that sufficient?
>>>>
>>>> Could we do something where we keep the default hpage_size looking like
>>>> it does now, but append the size explicitly for the new entries?
>>>>
>>>> HugePages_Total(1G): 2
>>>> HugePages_Free(1G): 1
>>>> HugePages_Rsvd(1G): 1
>>>> HugePages_Surp(1G): 1
>>>>
>>>
>>> Let's not change the existing interface, please.
>>>
>>> Adding new fields: OK.
>>> Changing the way in which existing fields are calculated: OKish.
>>> Renaming existing fields: not OK.
>>
>> How about lining up multiple values in each field like this?
>>
>> HugePages_Total: 5 2
>> HugePages_Free: 2 1
>> HugePages_Rsvd: 3 1
>> HugePages_Surp: 1 1
>> Hugepagesize: 2048 1048576 kB
>> ...
>>
>> This doesn't change the field names and the impact for user space
>> is still small?
>
> It might break some existing parsers, dunno.
>
> It was a mistake to assume that all hugepages will have the same size
> for all time, and we just have to live with that mistake.
>
> I'd suggest that we leave meminfo alone, just ensuring that its output
> makes some sense. Instead create a new interface which presents all
> the required info in a sensible fashion and migrate userspace reporting
> tools over to that interface. Just let the meminfo field die a slow
> death.

The main idea behind this patch is to unify the hugetlb interfaces in
/proc/meminfo and sysfs. When somebody wants to find out all the
important information about hugepage pools (as hugeadm from libhugetlbfs
does), he has to determine the default hugepage size from /proc/meminfo
and then go into /sys/kernel/mm/hugepages/hugepages-<size>kB/ for
information about the next nodes.

I agree with the idea of throwing away the meminfo hugepage fields in the
future, but before doing this, the sysfs part of the interface should
indicate the default hugepage size. And meminfo could possibly show data
for all hugepage sizes on the system. So once these parts are independent,
it is no problem to let the meminfo fields die.
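
The two-step lookup described above can be sketched as follows. This is only
an illustration of the dependency between the two interfaces, not hugeadm's
code; default_pool_dir() is a hypothetical helper that takes the meminfo text
as a buffer, reads the "Hugepagesize:" value from it, and formats the sysfs
directory for the default pool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find "Hugepagesize:" in a meminfo-style buffer and format the matching
 * sysfs directory name into dst. Returns 0 on success, -1 if the field is
 * missing, has no number, or dst is too small.
 */
static int default_pool_dir(const char *meminfo, char *dst, size_t len)
{
	static const char key[] = "Hugepagesize:";
	const char *line = meminfo;
	unsigned long kb;
	int n;

	assert(meminfo && dst && len > 0);
	while (*line) {
		const char *eol = strchr(line, '\n');

		if (strncmp(line, key, sizeof(key) - 1) == 0) {
			if (sscanf(line + sizeof(key) - 1, "%lu kB", &kb) != 1)
				return -1;
			n = snprintf(dst, len,
				     "/sys/kernel/mm/hugepages/hugepages-%lukB",
				     kb);
			return (n < 0 || (size_t)n >= len) ? -1 : 0;
		}
		if (!eol)
			break;
		line = eol + 1;
	}
	return -1;
}
```

The sysfs directory alone does not say which size is the default, so the
meminfo read cannot be dropped from this sequence.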

>
> It's tempting to remove the meminfo hugepage fields altogether - most
> parsers _should_ be able to cope with a CONFIG_HUGETLB=n kernel. But
> that's breakage as well - some applications may be using meminfo to
> detect whether the kernel supports huge pages!

2011-03-08 13:52:06

by Eric B Munson

Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages

On Tue, 08 Mar 2011, Petr Holasek wrote:

> On 03/08/2011 12:25 AM, Andrew Morton wrote:
> >On Tue, 8 Mar 2011 08:14:49 +0900
> >Naoya Horiguchi<[email protected]> wrote:
> >
> >>On Mon, Mar 07, 2011 at 02:51:49PM -0800, Andrew Morton wrote:
> >>>On Mon, 07 Mar 2011 11:46:54 -0800
> >>>Dave Hansen<[email protected]> wrote:
> >>>
> >>>>On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> >>>>>+ for_each_hstate(h)
> >>>>>+ seq_printf(m,
> >>>>>+ "HugePages_Total: %5lu\n"
> >>>>>+ "HugePages_Free: %5lu\n"
> >>>>>+ "HugePages_Rsvd: %5lu\n"
> >>>>>+ "HugePages_Surp: %5lu\n"
> >>>>>+ "Hugepagesize: %8lu kB\n",
> >>>>>+ h->nr_huge_pages,
> >>>>>+ h->free_huge_pages,
> >>>>>+ h->resv_huge_pages,
> >>>>>+ h->surplus_huge_pages,
> >>>>>+ 1UL<< (huge_page_order(h) + PAGE_SHIFT - 10));
> >>>>> }
> >>>>
> >>>>It sounds like now we'll get a meminfo that looks like:
> >>>>
> >>>>...
> >>>>AnonHugePages: 491520 kB
> >>>>HugePages_Total: 5
> >>>>HugePages_Free: 2
> >>>>HugePages_Rsvd: 3
> >>>>HugePages_Surp: 1
> >>>>Hugepagesize: 2048 kB
> >>>>HugePages_Total: 2
> >>>>HugePages_Free: 1
> >>>>HugePages_Rsvd: 1
> >>>>HugePages_Surp: 1
> >>>>Hugepagesize: 1048576 kB
> >>>>DirectMap4k: 12160 kB
> >>>>DirectMap2M: 2082816 kB
> >>>>DirectMap1G: 2097152 kB
> >>>>
> >>>>At best, that's a bit confusing. There aren't any other entries in
> >>>>meminfo that occur more than once. Plus, this information is available
> >>>>in the sysfs interface. Why isn't that sufficient?
> >>>>
> >>>>Could we do something where we keep the default hpage_size looking like
> >>>>it does now, but append the size explicitly for the new entries?
> >>>>
> >>>>HugePages_Total(1G): 2
> >>>>HugePages_Free(1G): 1
> >>>>HugePages_Rsvd(1G): 1
> >>>>HugePages_Surp(1G): 1
> >>>>
> >>>
> >>>Let's not change the existing interface, please.
> >>>
> >>>Adding new fields: OK.
> >>>Changing the way in which existing fields are calculated: OKish.
> >>>Renaming existing fields: not OK.
> >>
> >>How about lining up multiple values in each field like this?
> >>
> >> HugePages_Total: 5 2
> >> HugePages_Free: 2 1
> >> HugePages_Rsvd: 3 1
> >> HugePages_Surp: 1 1
> >> Hugepagesize: 2048 1048576 kB
> >> ...
> >>
> >>This doesn't change the field names and the impact for user space
> >>is still small?
> >
> >It might break some existing parsers, dunno.
> >
> >It was a mistake to assume that all hugepages will have the same size
> >for all time, and we just have to live with that mistake.
> >
> >I'd suggest that we leave meminfo alone, just ensuring that its output
> >makes some sense. Instead create a new interface which presents all
> >the required info in a sensible fashion and migrate userspace reporting
> >tools over to that interface. Just let the meminfo field die a slow
> >death.
>
> The main idea behind this patch is to unify hugetlb interfaces in
> /proc/meminfo
> and sysfs. When somebody wants to find out all important
> informations about hugepage
> pools (as hugeadm from libhugetlbfs does), he has to determine
> default hugepage size
> from /proc/meminfo and then go into
> /sys/kernel/mm/hugepages/hugepages-<size>kB/
> for informations about next nodes.
>
> I agree with idea of throwing away of meminfo hugepage fields in the future,
> but before doing this, sysfs part of interface should indicate
> default hugepage
> size. And meminfo could possibly show data for all hugepage sizes on
> system. So when
> these parts will be independent, it is no problem to let meminfo
> fields die.

I think the two best options here are:

1. Use hugeadm (packaged with libhugetlbfs) to indicate the kernel default huge
page size to any userspace tool that needs to know.

2. Add a marker for the kernel default huge page size in sysfs.

I don't have a strong opinion about which is "right".

>
> >
> >It's tempting to remove the meminfo hugepage fields altogether - most
> >parsers _should_ be able to cope with a CONFIG_HUGETLB=n kernel. But
> >that's breakage as well - some applications may be using meminfo to
> >detect whether the kernel supports huge pages!
>

