2014-07-01 13:02:11

by Jerome Marchand

Subject: [PATCH 0/5] mm, shmem: Enhance per-process accounting of shared memory

There are several shortcomings in the accounting of shared memory
(SysV shm, shared anonymous mappings, mappings of tmpfs files). The
values in /proc/<pid>/status and statm don't distinguish between shmem
memory and a shared mapping of a regular file, even though their
implications on memory usage are quite different: at reclaim, a file
mapping can be dropped or written back to disk, while shmem needs a
place in swap. As for shmem pages that are swapped out or in the swap
cache, they aren't accounted at all.

This series addresses these issues by adding new fields to the status
and smaps files in /proc/<pid>/. Resident shared memory is accounted
the same way as resident memory and general swap currently are (with a
counter in mm_rss_stat), but this approach proved impractical for
paged-out shared memory (it would require an rmap walk each time a
page is paged in).

/proc/<pid>/smaps also lacks proper accounting of shared memory, since
the shmem subsystem hides all implementation details from generic mm
code. This series adds a shmem_locate() function that returns the
location of a particular page (resident, in swap or in the swap
cache). Called from the smaps code, it allows showing a more detailed
accounting of shmem mappings in smaps.

Patch 1 adds a counter to keep track of resident shmem memory.
Patch 2 adds a function to allow generic code to know the physical
location of a shmem page.
Patch 3 adds a simple helper function.
Patch 4 accounts swapped-out shmem in /proc/<pid>/status.
Patch 5 adds shmem specific fields to /proc/<pid>/smaps.
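
For illustration, a minimal userspace sketch (not part of the series)
that reads back the VmShm and VmShSw lines added by patches 1 and 4
from /proc/self/status:

#include <stdio.h>
#include <string.h>

/* Dump the shmem accounting lines added by this series. */
int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/self/status", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* VmShm (patch 1) and VmShSw (patch 4) */
		if (!strncmp(line, "VmShm:", 6) ||
		    !strncmp(line, "VmShSw:", 7))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}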

Thanks,
Jerome

Jerome Marchand (5):
mm, shmem: Add shmem resident memory accounting
mm, shmem: add shmem_locate function
mm, shmem: add shmem_vma() helper
mm, shmem: Add shmem swap memory accounting
mm, shmem: show location of non-resident shmem pages in smaps

Documentation/filesystems/proc.txt | 15 ++++
arch/s390/mm/pgtable.c | 2 +-
fs/proc/task_mmu.c | 139 +++++++++++++++++++++++++++++++++++--
include/linux/mm.h | 20 ++++++
include/linux/mm_types.h | 7 +-
kernel/events/uprobes.c | 2 +-
mm/filemap_xip.c | 2 +-
mm/memory.c | 37 ++++++++--
mm/rmap.c | 8 +--
mm/shmem.c | 37 ++++++++++
10 files changed, 249 insertions(+), 20 deletions(-)

--
1.9.3


2014-07-01 13:02:16

by Jerome Marchand

Subject: [PATCH 1/5] mm, shmem: Add shmem resident memory accounting

Currently, looking at /proc/<pid>/status or statm, there is no way to
distinguish shmem pages from pages mapped to a regular file (shmem
pages are mapped to /dev/zero), even though their implications on
actual memory use are quite different.
This patch adds an MM_SHMEMPAGES counter to mm_rss_stat. It keeps
track of the resident shmem memory size. Its value is exposed in the
new VmShm line of /proc/<pid>/status.
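
As a sketch of the expected behaviour (the demo program below is
illustrative, not part of the patch), a shared anonymous mapping
should show up in VmShm once its pages are faulted in:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define SZ (16 << 20)	/* 16 MB shared anonymous (shmem) mapping */

static void show_vmshm(void)
{
	char line[256];
	FILE *f = fopen("/proc/self/status", "r");

	while (f && fgets(line, sizeof(line), f))
		if (!strncmp(line, "VmShm:", 6))
			fputs(line, stdout);
	if (f)
		fclose(f);
}

int main(void)
{
	char *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	show_vmshm();		/* VmShm:        0 kB */
	memset(p, 1, SZ);	/* fault all pages in */
	show_vmshm();		/* VmShm:    16384 kB */
	return 0;
}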

Signed-off-by: Jerome Marchand <[email protected]>
---
Documentation/filesystems/proc.txt | 2 ++
arch/s390/mm/pgtable.c | 2 +-
fs/proc/task_mmu.c | 9 ++++++---
include/linux/mm.h | 7 +++++++
include/linux/mm_types.h | 7 ++++---
kernel/events/uprobes.c | 2 +-
mm/filemap_xip.c | 2 +-
mm/memory.c | 37 +++++++++++++++++++++++++++++++------
mm/rmap.c | 8 ++++----
9 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index ddc531a..1c49957 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -171,6 +171,7 @@ read the file /proc/PID/status:
VmLib: 1412 kB
VmPTE: 20 kb
VmSwap: 0 kB
+ VmShm: 0 kB
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
@@ -228,6 +229,7 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
VmLib size of shared library code
VmPTE size of page table entries
VmSwap size of swap usage (the number of referred swapents)
+ VmShm size of resident shmem memory
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 37b8241..9fe31b0 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -612,7 +612,7 @@ static void gmap_zap_swap_entry(swp_entry_t entry, struct mm_struct *mm)
if (PageAnon(page))
dec_mm_counter(mm, MM_ANONPAGES);
else
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
}
free_swap_and_cache(entry);
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee..4e60751 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -21,7 +21,7 @@

void task_mem(struct seq_file *m, struct mm_struct *mm)
{
- unsigned long data, text, lib, swap;
+ unsigned long data, text, lib, swap, shmem;
unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;

/*
@@ -42,6 +42,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK)) >> 10;
lib = (mm->exec_vm << (PAGE_SHIFT-10)) - text;
swap = get_mm_counter(mm, MM_SWAPENTS);
+ shmem = get_mm_counter(mm, MM_SHMEMPAGES);
seq_printf(m,
"VmPeak:\t%8lu kB\n"
"VmSize:\t%8lu kB\n"
@@ -54,7 +55,8 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
"VmExe:\t%8lu kB\n"
"VmLib:\t%8lu kB\n"
"VmPTE:\t%8lu kB\n"
- "VmSwap:\t%8lu kB\n",
+ "VmSwap:\t%8lu kB\n"
+ "VmShm:\t%8lu kB\n",
hiwater_vm << (PAGE_SHIFT-10),
total_vm << (PAGE_SHIFT-10),
mm->locked_vm << (PAGE_SHIFT-10),
@@ -65,7 +67,8 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
mm->stack_vm << (PAGE_SHIFT-10), text, lib,
(PTRS_PER_PTE * sizeof(pte_t) *
atomic_long_read(&mm->nr_ptes)) >> 10,
- swap << (PAGE_SHIFT-10));
+ swap << (PAGE_SHIFT-10),
+ shmem << (PAGE_SHIFT-10));
}

unsigned long task_vsize(struct mm_struct *mm)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e03dd29..e69ee9d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1299,6 +1299,13 @@ static inline void dec_mm_counter(struct mm_struct *mm, int member)
atomic_long_dec(&mm->rss_stat.count[member]);
}

+static inline void dec_mm_file_counters(struct mm_struct *mm, struct page *page)
+{
+ dec_mm_counter(mm, MM_FILEPAGES);
+ if (PageSwapBacked(page))
+ dec_mm_counter(mm, MM_SHMEMPAGES);
+}
+
static inline unsigned long get_mm_rss(struct mm_struct *mm)
{
return get_mm_counter(mm, MM_FILEPAGES) +
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 21bff4b..e0307c8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -322,9 +322,10 @@ struct core_state {
};

enum {
- MM_FILEPAGES,
- MM_ANONPAGES,
- MM_SWAPENTS,
+ MM_FILEPAGES, /* Resident file mapping pages (includes /dev/zero) */
+ MM_ANONPAGES, /* Resident anonymous pages */
+ MM_SWAPENTS, /* Anonymous swap entries */
+ MM_SHMEMPAGES, /* Resident shared memory pages */
NR_MM_COUNTERS
};

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 1d0af8a..6c28c72 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -188,7 +188,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
lru_cache_add_active_or_unevictable(kpage, vma);

if (!PageAnon(page)) {
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
inc_mm_counter(mm, MM_ANONPAGES);
}

diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index d8d9fe3..4bd4836 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -194,7 +194,7 @@ retry:
flush_cache_page(vma, address, pte_pfn(*pte));
pteval = ptep_clear_flush(vma, address, pte);
page_remove_rmap(page);
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
BUG_ON(pte_dirty(pteval));
pte_unmap_unlock(pte, ptl);
/* must invalidate_page _before_ freeing the page */
diff --git a/mm/memory.c b/mm/memory.c
index 09e2cd0..c394fc7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -176,6 +176,20 @@ static void check_sync_rss_stat(struct task_struct *task)

#endif /* SPLIT_RSS_COUNTING */

+static void inc_mm_file_counters_fast(struct mm_struct *mm, struct page *page)
+{
+ inc_mm_counter_fast(mm, MM_FILEPAGES);
+ if (PageSwapBacked(page))
+ inc_mm_counter_fast(mm, MM_SHMEMPAGES);
+}
+
+static void dec_mm_file_counters_fast(struct mm_struct *mm, struct page *page)
+{
+ dec_mm_counter_fast(mm, MM_FILEPAGES);
+ if (PageSwapBacked(page))
+ dec_mm_counter_fast(mm, MM_SHMEMPAGES);
+}
+
#ifdef HAVE_GENERIC_MMU_GATHER

static int tlb_next_batch(struct mmu_gather *tlb)
@@ -832,8 +846,11 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,

if (PageAnon(page))
rss[MM_ANONPAGES]++;
- else
+ else {
rss[MM_FILEPAGES]++;
+ if (PageSwapBacked(page))
+ rss[MM_SHMEMPAGES]++;
+ }

if (is_write_migration_entry(entry) &&
is_cow_mapping(vm_flags)) {
@@ -875,8 +892,11 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
page_dup_rmap(page);
if (PageAnon(page))
rss[MM_ANONPAGES]++;
- else
+ else {
rss[MM_FILEPAGES]++;
+ if (PageSwapBacked(page))
+ rss[MM_SHMEMPAGES]++;
+ }
}

out_set_pte:
@@ -1140,6 +1160,8 @@ again:
likely(!(vma->vm_flags & VM_SEQ_READ)))
mark_page_accessed(page);
rss[MM_FILEPAGES]--;
+ if (PageSwapBacked(page))
+ rss[MM_SHMEMPAGES]--;
}
page_remove_rmap(page);
if (unlikely(page_mapcount(page) < 0))
@@ -1171,8 +1193,11 @@ again:

if (PageAnon(page))
rss[MM_ANONPAGES]--;
- else
+ else {
rss[MM_FILEPAGES]--;
+ if (PageSwapBacked(page))
+ rss[MM_SHMEMPAGES]--;
+ }
}
if (unlikely(!free_swap_and_cache(entry)))
print_bad_pte(vma, addr, ptent, NULL);
@@ -1495,7 +1520,7 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,

/* Ok, finally just insert the thing.. */
get_page(page);
- inc_mm_counter_fast(mm, MM_FILEPAGES);
+ inc_mm_file_counters_fast(mm, page);
page_add_file_rmap(page);
set_pte_at(mm, addr, pte, mk_pte(page, prot));

@@ -2217,7 +2242,7 @@ gotten:
if (likely(pte_same(*page_table, orig_pte))) {
if (old_page) {
if (!PageAnon(old_page)) {
- dec_mm_counter_fast(mm, MM_FILEPAGES);
+ dec_mm_file_counters_fast(mm, old_page);
inc_mm_counter_fast(mm, MM_ANONPAGES);
}
} else
@@ -2751,7 +2776,7 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
page_add_new_anon_rmap(page, vma, address);
} else {
- inc_mm_counter_fast(vma->vm_mm, MM_FILEPAGES);
+ inc_mm_file_counters_fast(vma->vm_mm, page);
page_add_file_rmap(page);
}
set_pte_at(vma->vm_mm, address, pte, entry);
diff --git a/mm/rmap.c b/mm/rmap.c
index 7928ddd..d40a65b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1168,7 +1168,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
if (PageAnon(page))
dec_mm_counter(mm, MM_ANONPAGES);
else
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
}
set_pte_at(mm, address, pte,
swp_entry_to_pte(make_hwpoison_entry(page)));
@@ -1181,7 +1181,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
if (PageAnon(page))
dec_mm_counter(mm, MM_ANONPAGES);
else
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
} else if (PageAnon(page)) {
swp_entry_t entry = { .val = page_private(page) };
pte_t swp_pte;
@@ -1225,7 +1225,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
entry = make_migration_entry(page, pte_write(pteval));
set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
} else
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);

page_remove_rmap(page);
page_cache_release(page);
@@ -1376,7 +1376,7 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,

page_remove_rmap(page);
page_cache_release(page);
- dec_mm_counter(mm, MM_FILEPAGES);
+ dec_mm_file_counters(mm, page);
(*mapcount)--;
}
pte_unmap_unlock(pte - 1, ptl);
--
1.9.3

2014-07-01 13:02:23

by Jerome Marchand

Subject: [PATCH 4/5] mm, shmem: Add shmem swap memory accounting

Add get_mm_shswap(), which computes the size of swapped-out shmem. It
does so by walking the page tables of the mm (mm_walk) and using the
new shmem_locate() function to get the physical location of shmem
pages. The result is displayed in the new VmShSw line of
/proc/<pid>/status.

This significantly slows down access to /proc/<pid>/status when there
is a big shmem mapping. If that is an issue, we can drop this patch
and only display the counter in the inherently slower
/proc/<pid>/smaps file (cf. next patch).
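
The slowdown comes from the walk doing one shmem_locate() lookup per
page of every shmem VMA, holes included. A crude way to measure it
(sizes and iteration count below are arbitrary):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define SZ (1UL << 30)	/* 1 GB shmem VMA: ~256k shmem_locate() calls */

int main(void)
{
	char buf[4096];
	struct timespec t0, t1;
	void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	int i, fd;

	if (p == MAP_FAILED)
		return 1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < 100; i++) {
		fd = open("/proc/self/status", O_RDONLY);
		while (read(fd, buf, sizeof(buf)) > 0)
			;
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("100 status reads: %ld ms\n",
	       (t1.tv_sec - t0.tv_sec) * 1000 +
	       (t1.tv_nsec - t0.tv_nsec) / 1000000);
	return 0;
}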

Signed-off-by: Jerome Marchand <[email protected]>
---
Documentation/filesystems/proc.txt | 2 +
fs/proc/task_mmu.c | 80 ++++++++++++++++++++++++++++++++++++--
2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 1c49957..1a15c56 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -172,6 +172,7 @@ read the file /proc/PID/status:
VmPTE: 20 kb
VmSwap: 0 kB
VmShm: 0 kB
+ VmShSw: 0 kB
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
@@ -230,6 +231,7 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
VmPTE size of page table entries
VmSwap size of swap usage (the number of referred swapents)
VmShm size of resident shmem memory
+ VmShSw size of paged out shmem memory
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4e60751..73f0ce4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -19,9 +19,80 @@
#include <asm/tlbflush.h>
#include "internal.h"

+struct shswap_stats {
+ struct vm_area_struct *vma;
+ unsigned long shswap;
+};
+
+#ifdef CONFIG_SHMEM
+static int shswap_pte(pte_t *pte, unsigned long addr, unsigned long end,
+ struct mm_walk *walk)
+{
+ struct shswap_stats *shss = walk->private;
+ struct vm_area_struct *vma = shss->vma;
+ pgoff_t pgoff = linear_page_index(vma, addr);
+ pte_t ptent = *pte;
+
+ if (pte_none(ptent) &&
+ shmem_locate(vma, pgoff, NULL) == SHMEM_SWAP)
+ shss->shswap += end - addr;
+
+ return 0;
+}
+
+static int shswap_pte_hole(unsigned long addr, unsigned long end,
+ struct mm_walk *walk)
+{
+ struct shswap_stats *shss = walk->private;
+ struct vm_area_struct *vma = shss->vma;
+ pgoff_t pgoff;
+
+ for (; addr != end; addr += PAGE_SIZE) {
+ pgoff = linear_page_index(vma, addr);
+
+ if (shmem_locate(vma, pgoff, NULL) == SHMEM_SWAP)
+ shss->shswap += PAGE_SIZE;
+ }
+
+ return 0;
+}
+
+static unsigned long get_mm_shswap(struct mm_struct *mm)
+{
+ struct vm_area_struct *vma;
+ struct shswap_stats shss;
+ struct mm_walk shswap_walk = {
+ .pte_entry = shswap_pte,
+ .pte_hole = shswap_pte_hole,
+ .mm = mm,
+ .private = &shss,
+ };
+
+ memset(&shss, 0, sizeof(shss));
+
+ down_read(&mm->mmap_sem);
+ for (vma = mm->mmap; vma; vma = vma->vm_next)
+ if (shmem_vma(vma)) {
+ shss.vma = vma;
+ walk_page_range(vma->vm_start, vma->vm_end,
+ &shswap_walk);
+ }
+ up_read(&mm->mmap_sem);
+
+ return shss.shswap;
+}
+
+#else
+
+static unsigned long get_mm_shswap(struct mm_struct *mm)
+{
+ return 0;
+}
+#endif
+
void task_mem(struct seq_file *m, struct mm_struct *mm)
{
- unsigned long data, text, lib, swap, shmem;
+ unsigned long data, text, lib, swap, shmem, shswap;
unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;

/*
@@ -43,6 +114,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
lib = (mm->exec_vm << (PAGE_SHIFT-10)) - text;
swap = get_mm_counter(mm, MM_SWAPENTS);
shmem = get_mm_counter(mm, MM_SHMEMPAGES);
+ shswap = get_mm_shswap(mm);
seq_printf(m,
"VmPeak:\t%8lu kB\n"
"VmSize:\t%8lu kB\n"
@@ -56,7 +128,8 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
"VmLib:\t%8lu kB\n"
"VmPTE:\t%8lu kB\n"
"VmSwap:\t%8lu kB\n"
- "VmShm:\t%8lu kB\n",
+ "VmShm:\t%8lu kB\n"
+ "VmShSw:\t%8lu kB\n",
hiwater_vm << (PAGE_SHIFT-10),
total_vm << (PAGE_SHIFT-10),
mm->locked_vm << (PAGE_SHIFT-10),
@@ -68,7 +141,8 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
(PTRS_PER_PTE * sizeof(pte_t) *
atomic_long_read(&mm->nr_ptes)) >> 10,
swap << (PAGE_SHIFT-10),
- shmem << (PAGE_SHIFT-10));
+ shmem << (PAGE_SHIFT-10),
+ shswap >> 10);
}

unsigned long task_vsize(struct mm_struct *mm)
--
1.9.3

2014-07-01 13:02:19

by Jerome Marchand

Subject: [PATCH 3/5] mm, shmem: Add shmem_vma() helper

Add a simple helper to check whether a VMA belongs to shmem.
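
Patches 4 and 5 use it to decide whether a page walk needs shmem-aware
handling of PTE holes, along the lines of:

	/* from the smaps walker set-up in patch 5 */
	if (shmem_vma(vma))
		smaps_walk.pte_hole = smaps_pte_hole;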

Signed-off-by: Jerome Marchand <[email protected]>
---
include/linux/mm.h | 6 ++++++
mm/shmem.c | 8 ++++++++
2 files changed, 14 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 34099fa..04a58d1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1074,11 +1074,17 @@ int shmem_zero_setup(struct vm_area_struct *);

extern int shmem_locate(struct vm_area_struct *vma, pgoff_t pgoff, int *count);
bool shmem_mapping(struct address_space *mapping);
+bool shmem_vma(struct vm_area_struct *vma);
+
#else
static inline bool shmem_mapping(struct address_space *mapping)
{
return false;
}
+static inline bool shmem_vma(struct vm_area_struct *vma)
+{
+ return false;
+}
#endif

extern int can_do_mlock(void);
diff --git a/mm/shmem.c b/mm/shmem.c
index 11b37a7..be87a20 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1447,6 +1447,14 @@ bool shmem_mapping(struct address_space *mapping)
return mapping->backing_dev_info == &shmem_backing_dev_info;
}

+bool shmem_vma(struct vm_area_struct *vma)
+{
+ return (vma->vm_file &&
+ vma->vm_file->f_dentry->d_inode->i_mapping->backing_dev_info
+ == &shmem_backing_dev_info);
+
+}
+
#ifdef CONFIG_TMPFS
static const struct inode_operations shmem_symlink_inode_operations;
static const struct inode_operations shmem_short_symlink_operations;
--
1.9.3

2014-07-01 13:02:31

by Jerome Marchand

Subject: [PATCH 5/5] mm, shmem: Show location of non-resident shmem pages in smaps

Add ShmOther, ShmOrphan, ShmSwapCache and ShmSwap lines to
/proc/<pid>/smaps for shmem mappings.

ShmOther: amount of memory that is currently resident in RAM and not
present in the page table of this process, but present in the page
table of another process.
ShmOrphan: amount of memory that is currently resident in RAM but not
present in any process page table. This can happen when a process that
has accessed a shared mapping unmaps it or exits. Despite being
resident, this memory is not currently accounted to any process.
ShmSwapCache: amount of memory currently in the swap cache.
ShmSwap: amount of memory that is paged out to disk.
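
A sketch of the semantics (illustrative only; it relies on crude
sleep-based synchronization): a parent that maps shared memory but
lets only the child fault it in should see the pages as ShmOther while
the child lives, and as ShmOrphan once the child is gone:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define SZ (4 << 20)

int main(void)
{
	char cmd[64];
	char *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	pid_t pid;

	if (p == MAP_FAILED)
		return 1;
	pid = fork();
	if (pid == 0) {		/* child faults the pages in... */
		memset(p, 1, SZ);
		pause();	/* ...and keeps them mapped */
		_exit(0);
	}
	sleep(1);
	snprintf(cmd, sizeof(cmd), "grep '^Shm' /proc/%d/smaps", getpid());
	system(cmd);		/* the 4 MB appear as ShmOther */
	kill(pid, SIGKILL);
	wait(NULL);
	system(cmd);		/* ...and as ShmOrphan afterwards */
	return 0;
}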

Signed-off-by: Jerome Marchand <[email protected]>
---
Documentation/filesystems/proc.txt | 11 ++++++++
fs/proc/task_mmu.c | 56 +++++++++++++++++++++++++++++++++++++-
2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 1a15c56..a65ab59 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -422,6 +422,10 @@ Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 374 kB
+ShmOther: 124 kB
+ShmOrphan: 0 kB
+ShmSwapCache: 12 kB
+ShmSwap: 36 kB
VmFlags: rd ex mr mw me de

the first of these lines shows the same information as is displayed for the
@@ -437,6 +441,13 @@ a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
and a page is modified, the file page is replaced by a private anonymous copy.
"Swap" shows how much would-be-anonymous memory is also used, but out on
swap.
+The ShmXXX lines only appear for shmem mappings. They show the amount of memory
+from the mapping that is currently:
+ - resident in RAM, not present in the page table of this process but present
+ in the page table of another process (ShmOther)
+ - resident in RAM but not present in the page table of any process (ShmOrphan)
+ - in the swap cache (ShmSwapCache)
+ - paged out to swap (ShmSwap).

"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 73f0ce4..9b1de55 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -518,9 +518,33 @@ struct mem_size_stats {
unsigned long anonymous_thp;
unsigned long swap;
unsigned long nonlinear;
+ unsigned long shmem_resident_other;
+ unsigned long shmem_swapcache;
+ unsigned long shmem_swap;
+ unsigned long shmem_orphan;
u64 pss;
};

+void update_shmem_stats(struct mem_size_stats *mss, struct vm_area_struct *vma,
+ pgoff_t pgoff, unsigned long size)
+{
+ int count = 0;
+
+ switch (shmem_locate(vma, pgoff, &count)) {
+ case SHMEM_RESIDENT:
+ if (count)
+ mss->shmem_resident_other += size;
+ else
+ mss->shmem_orphan += size;
+ break;
+ case SHMEM_SWAPCACHE:
+ mss->shmem_swapcache += size;
+ break;
+ case SHMEM_SWAP:
+ mss->shmem_swap += size;
+ break;
+ }
+}

static void smaps_pte_entry(pte_t ptent, unsigned long addr,
unsigned long ptent_size, struct mm_walk *walk)
@@ -543,7 +567,8 @@ static void smaps_pte_entry(pte_t ptent, unsigned long addr,
} else if (pte_file(ptent)) {
if (pte_to_pgoff(ptent) != pgoff)
mss->nonlinear += ptent_size;
- }
+ } else if (pte_none(ptent) && shmem_vma(vma))
+ update_shmem_stats(mss, vma, pgoff, ptent_size);

if (!page)
return;
@@ -604,6 +629,21 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
return 0;
}

+static int smaps_pte_hole(unsigned long addr, unsigned long end,
+ struct mm_walk *walk)
+{
+ struct mem_size_stats *mss = walk->private;
+ struct vm_area_struct *vma = mss->vma;
+ pgoff_t pgoff;
+
+ for (; addr != end; addr += PAGE_SIZE) {
+ pgoff = linear_page_index(vma, addr);
+ update_shmem_stats(mss, vma, pgoff, PAGE_SIZE);
+ }
+
+ return 0;
+}
+
static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
{
/*
@@ -670,6 +710,10 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
.private = &mss,
};

+ /* Only walk the holes when it's a shmem mapping */
+ if (shmem_vma(vma))
+ smaps_walk.pte_hole = smaps_pte_hole;
+
memset(&mss, 0, sizeof mss);
mss.vma = vma;
/* mmap_sem is held in m_start */
@@ -712,6 +756,16 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
if (vma->vm_flags & VM_NONLINEAR)
seq_printf(m, "Nonlinear: %8lu kB\n",
mss.nonlinear >> 10);
+ if (shmem_vma(vma))
+ seq_printf(m,
+ "ShmOther: %8lu kB\n"
+ "ShmOrphan: %8lu kB\n"
+ "ShmSwapCache: %8lu kB\n"
+ "ShmSwap: %8lu kB\n",
+ mss.shmem_resident_other >> 10,
+ mss.shmem_orphan >> 10,
+ mss.shmem_swapcache >> 10,
+ mss.shmem_swap >> 10);

show_smap_vma_flags(m, vma);

--
1.9.3

2014-07-01 13:03:09

by Jerome Marchand

Subject: [PATCH 2/5] mm, shmem: Add shmem_locate function

The shmem subsystem is kind of a black box: generic mm code can't
always know where a specific page physically is. This patch adds a
shmem_locate() function to find out the physical location of shmem
pages (resident, in swap or in the swap cache). If the optional count
argument isn't NULL and the page is resident, it also returns the
mapcount value of that page.
This is intended to allow finer accounting of shmem/tmpfs pages.
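
A caller that classifies every page of a shmem mapping follows this
pattern (a kernel-side fragment sketching what patches 4 and 5 do):

	pgoff_t pgoff = linear_page_index(vma, addr);
	int count = 0;

	switch (shmem_locate(vma, pgoff, &count)) {
	case SHMEM_RESIDENT:	/* in RAM; count holds the mapcount */
	case SHMEM_SWAPCACHE:	/* in RAM and in the swap cache */
	case SHMEM_SWAP:	/* paged out to swap */
	case SHMEM_NOTPRESENT:	/* never instantiated */
		break;
	}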

Signed-off-by: Jerome Marchand <[email protected]>
---
include/linux/mm.h | 7 +++++++
mm/shmem.c | 29 +++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e69ee9d..34099fa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1066,6 +1066,13 @@ extern bool skip_free_areas_node(unsigned int flags, int nid);

int shmem_zero_setup(struct vm_area_struct *);
#ifdef CONFIG_SHMEM
+
+#define SHMEM_NOTPRESENT 1 /* page is not present in memory */
+#define SHMEM_RESIDENT 2 /* page is resident in RAM */
+#define SHMEM_SWAPCACHE 3 /* page is in swap cache */
+#define SHMEM_SWAP 4 /* page is paged out */
+
+extern int shmem_locate(struct vm_area_struct *vma, pgoff_t pgoff, int *count);
bool shmem_mapping(struct address_space *mapping);
#else
static inline bool shmem_mapping(struct address_space *mapping)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5e5d860..11b37a7 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1305,6 +1305,35 @@ static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
return ret;
}

+int shmem_locate(struct vm_area_struct *vma, pgoff_t pgoff, int *count)
+{
+ struct address_space *mapping = file_inode(vma->vm_file)->i_mapping;
+ struct page *page;
+ swp_entry_t swap;
+ int ret;
+
+ page = find_get_entry(mapping, pgoff);
+ if (!page) /* Not yet initialised? */
+ return SHMEM_NOTPRESENT;
+
+ if (!radix_tree_exceptional_entry(page)) {
+ ret = SHMEM_RESIDENT;
+ if (count)
+ *count = page_mapcount(page);
+ goto out;
+ }
+
+ swap = radix_to_swp_entry(page);
+ page = find_get_page(swap_address_space(swap), swap.val);
+ if (!page)
+ return SHMEM_SWAP;
+ ret = SHMEM_SWAPCACHE;
+
+out:
+ page_cache_release(page);
+ return ret;
+}
+
#ifdef CONFIG_NUMA
static int shmem_set_policy(struct vm_area_struct *vma, struct mempolicy *mpol)
{
--
1.9.3