This patch leverages the addition of explicit accounting for pages used by
shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat" -- in order to
make the users of sysinfo(2) and si_meminfo*() friends aware of that
vmstat entry consistently across the interfaces.
Signed-off-by: Rafael Aquini <[email protected]>
---
drivers/base/node.c | 2 +-
fs/proc/meminfo.c | 2 +-
mm/page_alloc.c | 3 ++-
3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 8f7ed99..c6d3ae0 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -126,7 +126,7 @@ static ssize_t node_read_meminfo(struct device *dev,
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(node_page_state(nid, NR_ANON_PAGES)),
- nid, K(node_page_state(nid, NR_SHMEM)),
+ nid, K(i.sharedram),
nid, node_page_state(nid, NR_KERNEL_STACK) *
THREAD_SIZE / 1024,
nid, K(node_page_state(nid, NR_PAGETABLE)),
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 7445af0..aa1eee0 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -168,7 +168,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
K(global_page_state(NR_WRITEBACK)),
K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
- K(global_page_state(NR_SHMEM)),
+ K(i.sharedram),
K(global_page_state(NR_SLAB_RECLAIMABLE) +
global_page_state(NR_SLAB_UNRECLAIMABLE)),
K(global_page_state(NR_SLAB_RECLAIMABLE)),
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 20d17f8..f72ea38 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3040,7 +3040,7 @@ static inline void show_node(struct zone *zone)
void si_meminfo(struct sysinfo *val)
{
val->totalram = totalram_pages;
- val->sharedram = 0;
+ val->sharedram = global_page_state(NR_SHMEM);
val->freeram = global_page_state(NR_FREE_PAGES);
val->bufferram = nr_blockdev_pages();
val->totalhigh = totalhigh_pages;
@@ -3060,6 +3060,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
managed_pages += pgdat->node_zones[zone_type].managed_pages;
val->totalram = managed_pages;
+ val->sharedram = node_page_state(nid, NR_SHMEM);
val->freeram = node_page_state(nid, NR_FREE_PAGES);
#ifdef CONFIG_HIGHMEM
val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
--
1.9.3
> -----Original Message-----
> From: Rafael Aquini [mailto:[email protected]]
> Sent: Wednesday, June 25, 2014 2:40 PM
> To: [email protected]
> Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki JP; [email protected]
> Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
>
> This patch leverages the addition of explicit accounting for pages used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem
> vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry consistently across the
> interfaces.
Why?
Traditionally sysinfo.sharedram was not used for shmem. It was totally strange semantics and completely outdated feature.
So, we may reuse it for another purpose. But I'm not sure its benefit.
Why don't you use /proc/meminfo?
I'm afraid userland programs get a confusion.
>
> Signed-off-by: Rafael Aquini <[email protected]>
> ---
> drivers/base/node.c | 2 +-
> fs/proc/meminfo.c | 2 +-
> mm/page_alloc.c | 3 ++-
> 3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c index 8f7ed99..c6d3ae0 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -126,7 +126,7 @@ static ssize_t node_read_meminfo(struct device *dev,
> nid, K(node_page_state(nid, NR_FILE_PAGES)),
> nid, K(node_page_state(nid, NR_FILE_MAPPED)),
> nid, K(node_page_state(nid, NR_ANON_PAGES)),
> - nid, K(node_page_state(nid, NR_SHMEM)),
> + nid, K(i.sharedram),
> nid, node_page_state(nid, NR_KERNEL_STACK) *
> THREAD_SIZE / 1024,
> nid, K(node_page_state(nid, NR_PAGETABLE)), diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index
> 7445af0..aa1eee0 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -168,7 +168,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> K(global_page_state(NR_WRITEBACK)),
> K(global_page_state(NR_ANON_PAGES)),
> K(global_page_state(NR_FILE_MAPPED)),
> - K(global_page_state(NR_SHMEM)),
> + K(i.sharedram),
> K(global_page_state(NR_SLAB_RECLAIMABLE) +
> global_page_state(NR_SLAB_UNRECLAIMABLE)),
> K(global_page_state(NR_SLAB_RECLAIMABLE)),
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 20d17f8..f72ea38 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3040,7 +3040,7 @@ static inline void show_node(struct zone *zone) void si_meminfo(struct sysinfo *val) {
> val->totalram = totalram_pages;
> - val->sharedram = 0;
> + val->sharedram = global_page_state(NR_SHMEM);
> val->freeram = global_page_state(NR_FREE_PAGES);
> val->bufferram = nr_blockdev_pages();
> val->totalhigh = totalhigh_pages;
> @@ -3060,6 +3060,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
> for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
> managed_pages += pgdat->node_zones[zone_type].managed_pages;
> val->totalram = managed_pages;
> + val->sharedram = node_page_state(nid, NR_SHMEM);
> val->freeram = node_page_state(nid, NR_FREE_PAGES); #ifdef CONFIG_HIGHMEM
> val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
> --
> 1.9.3
On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
>
>
> > -----Original Message-----
> > From: Rafael Aquini [mailto:[email protected]]
> > Sent: Wednesday, June 25, 2014 2:40 PM
> > To: [email protected]
> > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki JP; [email protected]
> > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
> >
> > This patch leverages the addition of explicit accounting for pages used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem
> > vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry consistently across the
> > interfaces.
>
> Why?
Because we do not report consistently across the interfaces we declare
exporting that data. Check sysinfo(2) manpage, for instance:
[...]
struct sysinfo {
long uptime; /* Seconds since boot */
unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
unsigned long totalram; /* Total usable main memory size */
unsigned long freeram; /* Available memory size */
unsigned long sharedram; /* Amount of shared memory */ <<<<<
[...]
userspace tools resorting to sysinfo() syscall will get a hardcoded 0
for shared memory which is reported differently from /proc/meminfo.
Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to
gather statistics for /proc/meminfo & friends, and so we can leverage collecting
sharedmem from those calls as well, just as we do for totalram, freeram & bufferram.
Regards,
-- Rafael
> Traditionally sysinfo.sharedram was not used for shmem. It was totally strange semantics and completely outdated feature.
> So, we may reuse it for another purpose. But I'm not sure its benefit.
>
> Why don't you use /proc/meminfo?
> I'm afraid userland programs get a confusion.
>
>
> >
> > Signed-off-by: Rafael Aquini <[email protected]>
> > ---
> > drivers/base/node.c | 2 +-
> > fs/proc/meminfo.c | 2 +-
> > mm/page_alloc.c | 3 ++-
> > 3 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/base/node.c b/drivers/base/node.c index 8f7ed99..c6d3ae0 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -126,7 +126,7 @@ static ssize_t node_read_meminfo(struct device *dev,
> > nid, K(node_page_state(nid, NR_FILE_PAGES)),
> > nid, K(node_page_state(nid, NR_FILE_MAPPED)),
> > nid, K(node_page_state(nid, NR_ANON_PAGES)),
> > - nid, K(node_page_state(nid, NR_SHMEM)),
> > + nid, K(i.sharedram),
> > nid, node_page_state(nid, NR_KERNEL_STACK) *
> > THREAD_SIZE / 1024,
> > nid, K(node_page_state(nid, NR_PAGETABLE)), diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index
> > 7445af0..aa1eee0 100644
> > --- a/fs/proc/meminfo.c
> > +++ b/fs/proc/meminfo.c
> > @@ -168,7 +168,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> > K(global_page_state(NR_WRITEBACK)),
> > K(global_page_state(NR_ANON_PAGES)),
> > K(global_page_state(NR_FILE_MAPPED)),
> > - K(global_page_state(NR_SHMEM)),
> > + K(i.sharedram),
> > K(global_page_state(NR_SLAB_RECLAIMABLE) +
> > global_page_state(NR_SLAB_UNRECLAIMABLE)),
> > K(global_page_state(NR_SLAB_RECLAIMABLE)),
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 20d17f8..f72ea38 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3040,7 +3040,7 @@ static inline void show_node(struct zone *zone) void si_meminfo(struct sysinfo *val) {
> > val->totalram = totalram_pages;
> > - val->sharedram = 0;
> > + val->sharedram = global_page_state(NR_SHMEM);
> > val->freeram = global_page_state(NR_FREE_PAGES);
> > val->bufferram = nr_blockdev_pages();
> > val->totalhigh = totalhigh_pages;
> > @@ -3060,6 +3060,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
> > for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
> > managed_pages += pgdat->node_zones[zone_type].managed_pages;
> > val->totalram = managed_pages;
> > + val->sharedram = node_page_state(nid, NR_SHMEM);
> > val->freeram = node_page_state(nid, NR_FREE_PAGES); #ifdef CONFIG_HIGHMEM
> > val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
> > --
> > 1.9.3
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Wednesday, June 25, 2014 3:41 PM
> To: Rafael Aquini; [email protected]
> Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki JP; [email protected]
> Subject: RE: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
>
>
>
> > -----Original Message-----
> > From: Rafael Aquini [mailto:[email protected]]
> > Sent: Wednesday, June 25, 2014 2:40 PM
> > To: [email protected]
> > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro
> > Kosaki JP; [email protected]
> > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > interfaces
> >
> > This patch leverages the addition of explicit accounting for pages
> > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat" --
> > in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry consistently across the interfaces.
>
> Why?
> Traditionally sysinfo.sharedram was not used for shmem. It was totally strange semantics and completely outdated feature.
> So, we may reuse it for another purpose. But I'm not sure its benefit.
>
> Why don't you use /proc/meminfo?
> I'm afraid userland programs get a confusion.
For the record. This is historical implementation at linux-2.3.12. I.e. account sum of page count.
void si_meminfo(struct sysinfo *val)
{
int i;
i = max_mapnr;
val->totalram = 0;
val->sharedram = 0;
val->freeram = nr_free_pages << PAGE_SHIFT;
val->bufferram = atomic_read(&buffermem);
while (i-- > 0) {
if (PageReserved(mem_map+i))
continue;
val->totalram++;
if (!page_count(mem_map+i))
continue;
val->sharedram += page_count(mem_map+i) - 1;
}
val->totalram <<= PAGE_SHIFT;
val->sharedram <<= PAGE_SHIFT;
return;
}
> -----Original Message-----
> From: Rafael Aquini [mailto:[email protected]]
> Sent: Wednesday, June 25, 2014 4:16 PM
> To: Motohiro Kosaki
> Cc: [email protected]; Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki JP; linux-
> [email protected]
> Subject: Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
>
> On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
> >
> >
> > > -----Original Message-----
> > > From: Rafael Aquini [mailto:[email protected]]
> > > Sent: Wednesday, June 25, 2014 2:40 PM
> > > To: [email protected]
> > > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner;
> > > Motohiro Kosaki JP; [email protected]
> > > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > > interfaces
> > >
> > > This patch leverages the addition of explicit accounting for pages
> > > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat"
> > > -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry consistently across the interfaces.
> >
> > Why?
>
> Because we do not report consistently across the interfaces we declare exporting that data. Check sysinfo(2) manpage, for instance:
> [...]
> struct sysinfo {
> long uptime; /* Seconds since boot */
> unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
> unsigned long totalram; /* Total usable main memory size */
> unsigned long freeram; /* Available memory size */
> unsigned long sharedram; /* Amount of shared memory */ <<<<< [...]
>
> userspace tools resorting to sysinfo() syscall will get a hardcoded 0 for shared memory which is reported differently from
> /proc/meminfo.
>
> Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to gather statistics for /proc/meminfo & friends, and so we
> can leverage collecting sharedmem from those calls as well, just as we do for totalram, freeram & bufferram.
But "Amount of shared memory" didn't mean amout of shmem. It actually meant amout of page of page-count>=2.
Again, there is a possibility to change the semantics. But I don't have enough userland knowledge to do. Please investigate
and explain why your change don't break any userland.
On Wed, Jun 25, 2014 at 01:27:53PM -0700, Motohiro Kosaki wrote:
>
>
> > -----Original Message-----
> > From: Rafael Aquini [mailto:[email protected]]
> > Sent: Wednesday, June 25, 2014 4:16 PM
> > To: Motohiro Kosaki
> > Cc: [email protected]; Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki JP; linux-
> > [email protected]
> > Subject: Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
> >
> > On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Rafael Aquini [mailto:[email protected]]
> > > > Sent: Wednesday, June 25, 2014 2:40 PM
> > > > To: [email protected]
> > > > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner;
> > > > Motohiro Kosaki JP; [email protected]
> > > > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > > > interfaces
> > > >
> > > > This patch leverages the addition of explicit accounting for pages
> > > > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat"
> > > > -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry consistently across the interfaces.
> > >
> > > Why?
> >
> > Because we do not report consistently across the interfaces we declare exporting that data. Check sysinfo(2) manpage, for instance:
> > [...]
> > struct sysinfo {
> > long uptime; /* Seconds since boot */
> > unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
> > unsigned long totalram; /* Total usable main memory size */
> > unsigned long freeram; /* Available memory size */
> > unsigned long sharedram; /* Amount of shared memory */ <<<<< [...]
> >
> > userspace tools resorting to sysinfo() syscall will get a hardcoded 0 for shared memory which is reported differently from
> > /proc/meminfo.
> >
> > Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to gather statistics for /proc/meminfo & friends, and so we
> > can leverage collecting sharedmem from those calls as well, just as we do for totalram, freeram & bufferram.
>
> But "Amount of shared memory" didn't mean amout of shmem. It actually meant amout of page of page-count>=2.
> Again, there is a possibility to change the semantics. But I don't have enough userland knowledge to do. Please investigate
> and explain why your change don't break any userland.
I agree that reporting the amount of shared pages in that historically fashion
might not be interesting for userspace tools resorting to sysinfo(2),
nowadays.
OTOH, our documentation implies we do return shared memory there, and FWIW,
considering the other places we do export the "shared memory" concept to
userspace nowadays, we are suggesting it's the amount of tmpfs/shmem, and not the
amount of shared mapped pages it historiacally represented once. What is really
confusing is having a field that supposedely/expectedely would return the amount
of shmem to userspace queries, but instead returns a hard-coded zero (0).
I could easily find out that there were some user complaint/confusion on this
semantic inconsistency in the past, as in:
https://groups.google.com/forum/#!topic/comp.os.linux.development.system/ogWVn6XdvGA
or in:
http://marc.info/?l=net-snmp-cvs&m=132148788500667
which suggests users seem to always have understood it as being shmem/tmpfs
usage, as the /proc/meminfo field "MemShared" was tied direclty to
sysinfo.sharedram. Historically we reported shared memory that way, and
when it wasn't accurately meaning that anymore a 0 was hardcoded there to
potentially not break compatibility with older tools (older than 2.4).
In 2.6 we got rid of meminfo's "MemShared" until 2009, when you sort of
re-introduced it re-branded as Shmem. IMO, we should leverage what we
have in kernel now and take this change to make the exposed data consistent
across the interfaces that export it today -- sysinfo(2) & /proc/meminfo.
This is not a hard requirement, though, but rather a simple maintenance
nitpick from code review.
Regards,
-- Rafael
> I agree that reporting the amount of shared pages in that historically fashion
> might not be interesting for userspace tools resorting to sysinfo(2),
> nowadays.
>
> OTOH, our documentation implies we do return shared memory there, and FWIW,
> considering the other places we do export the "shared memory" concept to
> userspace nowadays, we are suggesting it's the amount of tmpfs/shmem, and not the
> amount of shared mapped pages it historiacally represented once. What is really
> confusing is having a field that supposedely/expectedely would return the amount
> of shmem to userspace queries, but instead returns a hard-coded zero (0).
>
> I could easily find out that there were some user complaint/confusion on this
> semantic inconsistency in the past, as in:
> https://groups.google.com/forum/#!topic/comp.os.linux.development.system/ogWVn6XdvGA
>
> or in:
> http://marc.info/?l=net-snmp-cvs&m=132148788500667
>
> which suggests users seem to always have understood it as being shmem/tmpfs
> usage, as the /proc/meminfo field "MemShared" was tied direclty to
> sysinfo.sharedram. Historically we reported shared memory that way, and
> when it wasn't accurately meaning that anymore a 0 was hardcoded there to
> potentially not break compatibility with older tools (older than 2.4).
> In 2.6 we got rid of meminfo's "MemShared" until 2009, when you sort of
> re-introduced it re-branded as Shmem. IMO, we should leverage what we
> have in kernel now and take this change to make the exposed data consistent
> across the interfaces that export it today -- sysinfo(2) & /proc/meminfo.
>
> This is not a hard requirement, though, but rather a simple maintenance
> nitpick from code review.
Ok, ack then. But please update a patch description and repost w/
ccing [email protected]. Someone might have a specific concern
about a compatibility.