2022-06-07 02:24:53

by Yosry Ahmed

Subject: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
KVM mmu. This provides more insights on the kernel memory used
by a workload.

This stat will be used by subsequent patches to count KVM mmu
memory usage.

Signed-off-by: Yosry Ahmed <[email protected]>
---
Documentation/admin-guide/cgroup-v2.rst | 5 +++++
Documentation/filesystems/proc.rst | 4 ++++
drivers/base/node.c | 2 ++
fs/proc/meminfo.c | 2 ++
include/linux/mmzone.h | 1 +
mm/memcontrol.c | 1 +
mm/page_alloc.c | 6 +++++-
mm/vmstat.c | 1 +
8 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 69d7a6983f781..307a284b99189 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back.
pagetables
Amount of memory allocated for page tables.

+ sec_pagetables
+ Amount of memory allocated for secondary page tables,
+ this currently includes KVM mmu allocations on x86
+ and arm64.
+
percpu (npn)
Amount of memory used for storing per-cpu kernel
data structures.
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 061744c436d99..894d6317f3bdc 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -973,6 +973,7 @@ You may not have all of these fields.
SReclaimable: 159856 kB
SUnreclaim: 124508 kB
PageTables: 24448 kB
+ SecPageTables: 0 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
@@ -1067,6 +1068,9 @@ SUnreclaim
PageTables
amount of memory dedicated to the lowest level of page
tables.
+SecPageTables
+ amount of memory dedicated to secondary page tables, this
+ currently includes KVM mmu allocations on x86 and arm64.
NFS_Unstable
Always zero. Previous counted pages which had been written to
the server, but has not been committed to stable storage.
diff --git a/drivers/base/node.c b/drivers/base/node.c
index ec8bb24a5a227..9fe716832546f 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -433,6 +433,7 @@ static ssize_t node_read_meminfo(struct device *dev,
"Node %d ShadowCallStack:%8lu kB\n"
#endif
"Node %d PageTables: %8lu kB\n"
+ "Node %d SecPageTables: %8lu kB\n"
"Node %d NFS_Unstable: %8lu kB\n"
"Node %d Bounce: %8lu kB\n"
"Node %d WritebackTmp: %8lu kB\n"
@@ -459,6 +460,7 @@ static ssize_t node_read_meminfo(struct device *dev,
nid, node_page_state(pgdat, NR_KERNEL_SCS_KB),
#endif
nid, K(node_page_state(pgdat, NR_PAGETABLE)),
+ nid, K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)),
nid, 0UL,
nid, K(sum_zone_node_page_state(nid, NR_BOUNCE)),
nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 6fa761c9cc78e..fad29024eb2e0 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -108,6 +108,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
#endif
show_val_kb(m, "PageTables: ",
global_node_page_state(NR_PAGETABLE));
+ show_val_kb(m, "SecPageTables: ",
+ global_node_page_state(NR_SECONDARY_PAGETABLE));

show_val_kb(m, "NFS_Unstable: ", 0);
show_val_kb(m, "Bounce: ",
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 46ffab808f037..81d109e6c623a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -219,6 +219,7 @@ enum node_stat_item {
NR_KERNEL_SCS_KB, /* measured in KiB */
#endif
NR_PAGETABLE, /* used for pagetables */
+ NR_SECONDARY_PAGETABLE, /* secondary pagetables, e.g. kvm shadow pagetables */
#ifdef CONFIG_SWAP
NR_SWAPCACHE,
#endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 598fece89e2b7..ee1c3d464857c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1398,6 +1398,7 @@ static const struct memory_stat memory_stats[] = {
{ "kernel", MEMCG_KMEM },
{ "kernel_stack", NR_KERNEL_STACK_KB },
{ "pagetables", NR_PAGETABLE },
+ { "sec_pagetables", NR_SECONDARY_PAGETABLE },
{ "percpu", MEMCG_PERCPU_B },
{ "sock", MEMCG_SOCK },
{ "vmalloc", MEMCG_VMALLOC },
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e42038382c12..29a7e9cd28c74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5932,7 +5932,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
" active_file:%lu inactive_file:%lu isolated_file:%lu\n"
" unevictable:%lu dirty:%lu writeback:%lu\n"
" slab_reclaimable:%lu slab_unreclaimable:%lu\n"
- " mapped:%lu shmem:%lu pagetables:%lu bounce:%lu\n"
+ " mapped:%lu shmem:%lu pagetables:%lu\n"
+ " sec_pagetables:%lu bounce:%lu\n"
" kernel_misc_reclaimable:%lu\n"
" free:%lu free_pcp:%lu free_cma:%lu\n",
global_node_page_state(NR_ACTIVE_ANON),
@@ -5949,6 +5950,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
global_node_page_state(NR_FILE_MAPPED),
global_node_page_state(NR_SHMEM),
global_node_page_state(NR_PAGETABLE),
+ global_node_page_state(NR_SECONDARY_PAGETABLE),
global_zone_page_state(NR_BOUNCE),
global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE),
global_zone_page_state(NR_FREE_PAGES),
@@ -5982,6 +5984,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
" shadow_call_stack:%lukB"
#endif
" pagetables:%lukB"
+ " sec_pagetables:%lukB"
" all_unreclaimable? %s"
"\n",
pgdat->node_id,
@@ -6007,6 +6010,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
node_page_state(pgdat, NR_KERNEL_SCS_KB),
#endif
K(node_page_state(pgdat, NR_PAGETABLE)),
+ K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)),
pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES ?
"yes" : "no");
}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index b75b1a64b54cb..06eb52fe5be94 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1240,6 +1240,7 @@ const char * const vmstat_text[] = {
"nr_shadow_call_stack",
#endif
"nr_page_table_pages",
+ "nr_sec_page_table_pages",
#ifdef CONFIG_SWAP
"nr_swapcached",
#endif
--
2.36.1.255.ge46751e96f-goog
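
Once the series is applied, the new counter appears in /proc/meminfo as a
"SecPageTables:" line, as the proc.rst hunk above shows. The following is a
minimal userspace sketch of reading it, not part of the patch. The helper
names meminfo_field_kb() and meminfo_file_field_kb() are hypothetical, and the
sketch assumes the plain "<Field>: <value> kB" layout of /proc/meminfo. The
per-node files written by node_read_meminfo() prefix each line with
"Node <N> " and would need separate handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): scan meminfo-formatted text
 * for a line starting with "<field>:" and store its numeric value in *kb.
 * Returns 0 on success, -1 if the field is absent or malformed.
 */
int meminfo_field_kb(const char *text, const char *field, unsigned long *kb)
{
	const char *line = text;
	size_t flen;

	assert(text && field && kb);
	flen = strlen(field);

	while (line && *line) {
		/* Match only at the start of a line, so that looking up
		 * "PageTables" does not match "SecPageTables". */
		if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
			/* %lu skips the padding spaces before the value. */
			if (sscanf(line + flen + 1, "%lu", kb) == 1)
				return 0;
			return -1;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Hypothetical wrapper: read a meminfo-style file (e.g. "/proc/meminfo")
 * into a fixed buffer and look up one field. The 8 KiB buffer is an
 * assumption that comfortably fits the file on typical systems.
 */
int meminfo_file_field_kb(const char *path, const char *field,
			  unsigned long *kb)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return meminfo_field_kb(buf, field, kb);
}
```

A caller would use meminfo_file_field_kb("/proc/meminfo", "SecPageTables", &kb)
and treat a return value of -1 as "field not present", which is the expected
result on kernels without this patch.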


2022-06-10 20:04:23

by Shakeel Butt

Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

On Mon, Jun 6, 2022 at 3:21 PM Yosry Ahmed <[email protected]> wrote:
>
> Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
> KVM mmu. This provides more insights on the kernel memory used
> by a workload.
>
> This stat will be used by subsequent patches to count KVM mmu
> memory usage.
>
> Signed-off-by: Yosry Ahmed <[email protected]>

Acked-by: Shakeel Butt <[email protected]>

2022-06-13 03:27:42

by Huang, Shaoqin

Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.



On 6/7/2022 6:20 AM, Yosry Ahmed wrote:
> Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
> KVM mmu. This provides more insights on the kernel memory used
> by a workload.
>
> This stat will be used by subsequent patches to count KVM mmu
> memory usage.
>
> Signed-off-by: Yosry Ahmed <[email protected]>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 5 +++++
> Documentation/filesystems/proc.rst | 4 ++++
> drivers/base/node.c | 2 ++
> fs/proc/meminfo.c | 2 ++
> include/linux/mmzone.h | 1 +
> mm/memcontrol.c | 1 +
> mm/page_alloc.c | 6 +++++-
> mm/vmstat.c | 1 +
> 8 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 69d7a6983f781..307a284b99189 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back.
> pagetables
> Amount of memory allocated for page tables.
>
> + sec_pagetables
> + Amount of memory allocated for secondary page tables,
> + this currently includes KVM mmu allocations on x86
> + and arm64.
> +
> percpu (npn)
> Amount of memory used for storing per-cpu kernel
> data structures.
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d99..894d6317f3bdc 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -973,6 +973,7 @@ You may not have all of these fields.
> SReclaimable: 159856 kB
> SUnreclaim: 124508 kB
> PageTables: 24448 kB
> + SecPageTables: 0 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> @@ -1067,6 +1068,9 @@ SUnreclaim
> PageTables
> amount of memory dedicated to the lowest level of page
> tables.
> +SecPageTables
> + amount of memory dedicated to secondary page tables, this
> + currently includes KVM mmu allocations on x86 and arm64.

Just a note: on the latest 5.19.0-rc2+, this patch has a conflict in the
Documentation/filesystems/proc.rst file. But that's not a problem.

> NFS_Unstable
> Always zero. Previous counted pages which had been written to
> the server, but has not been committed to stable storage.
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index ec8bb24a5a227..9fe716832546f 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -433,6 +433,7 @@ static ssize_t node_read_meminfo(struct device *dev,
> "Node %d ShadowCallStack:%8lu kB\n"
> #endif
> "Node %d PageTables: %8lu kB\n"
> + "Node %d SecPageTables: %8lu kB\n"
> "Node %d NFS_Unstable: %8lu kB\n"
> "Node %d Bounce: %8lu kB\n"
> "Node %d WritebackTmp: %8lu kB\n"
> @@ -459,6 +460,7 @@ static ssize_t node_read_meminfo(struct device *dev,
> nid, node_page_state(pgdat, NR_KERNEL_SCS_KB),
> #endif
> nid, K(node_page_state(pgdat, NR_PAGETABLE)),
> + nid, K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)),
> nid, 0UL,
> nid, K(sum_zone_node_page_state(nid, NR_BOUNCE)),
> nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 6fa761c9cc78e..fad29024eb2e0 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -108,6 +108,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> #endif
> show_val_kb(m, "PageTables: ",
> global_node_page_state(NR_PAGETABLE));
> + show_val_kb(m, "SecPageTables: ",
> + global_node_page_state(NR_SECONDARY_PAGETABLE));
>
> show_val_kb(m, "NFS_Unstable: ", 0);
> show_val_kb(m, "Bounce: ",
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 46ffab808f037..81d109e6c623a 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -219,6 +219,7 @@ enum node_stat_item {
> NR_KERNEL_SCS_KB, /* measured in KiB */
> #endif
> NR_PAGETABLE, /* used for pagetables */
> + NR_SECONDARY_PAGETABLE, /* secondary pagetables, e.g. kvm shadow pagetables */
> #ifdef CONFIG_SWAP
> NR_SWAPCACHE,
> #endif
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 598fece89e2b7..ee1c3d464857c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1398,6 +1398,7 @@ static const struct memory_stat memory_stats[] = {
> { "kernel", MEMCG_KMEM },
> { "kernel_stack", NR_KERNEL_STACK_KB },
> { "pagetables", NR_PAGETABLE },
> + { "sec_pagetables", NR_SECONDARY_PAGETABLE },
> { "percpu", MEMCG_PERCPU_B },
> { "sock", MEMCG_SOCK },
> { "vmalloc", MEMCG_VMALLOC },
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0e42038382c12..29a7e9cd28c74 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5932,7 +5932,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
> " active_file:%lu inactive_file:%lu isolated_file:%lu\n"
> " unevictable:%lu dirty:%lu writeback:%lu\n"
> " slab_reclaimable:%lu slab_unreclaimable:%lu\n"
> - " mapped:%lu shmem:%lu pagetables:%lu bounce:%lu\n"
> + " mapped:%lu shmem:%lu pagetables:%lu\n"
> + " sec_pagetables:%lu bounce:%lu\n"
> " kernel_misc_reclaimable:%lu\n"
> " free:%lu free_pcp:%lu free_cma:%lu\n",
> global_node_page_state(NR_ACTIVE_ANON),
> @@ -5949,6 +5950,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
> global_node_page_state(NR_FILE_MAPPED),
> global_node_page_state(NR_SHMEM),
> global_node_page_state(NR_PAGETABLE),
> + global_node_page_state(NR_SECONDARY_PAGETABLE),
> global_zone_page_state(NR_BOUNCE),
> global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE),
> global_zone_page_state(NR_FREE_PAGES),
> @@ -5982,6 +5984,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
> " shadow_call_stack:%lukB"
> #endif
> " pagetables:%lukB"
> + " sec_pagetables:%lukB"
> " all_unreclaimable? %s"
> "\n",
> pgdat->node_id,
> @@ -6007,6 +6010,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
> node_page_state(pgdat, NR_KERNEL_SCS_KB),
> #endif
> K(node_page_state(pgdat, NR_PAGETABLE)),
> + K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)),
> pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES ?
> "yes" : "no");
> }
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index b75b1a64b54cb..06eb52fe5be94 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1240,6 +1240,7 @@ const char * const vmstat_text[] = {
> "nr_shadow_call_stack",
> #endif
> "nr_page_table_pages",
> + "nr_sec_page_table_pages",
> #ifdef CONFIG_SWAP
> "nr_swapcached",
> #endif

2022-06-13 19:44:56

by Yosry Ahmed

[permalink] [raw]
Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

On Sun, Jun 12, 2022 at 8:18 PM Huang, Shaoqin <[email protected]> wrote:
>
>
>
> On 6/7/2022 6:20 AM, Yosry Ahmed wrote:
> > Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
> > KVM mmu. This provides more insights on the kernel memory used
> > by a workload.
> >
> > This stat will be used by subsequent patches to count KVM mmu
> > memory usage.
> >
> > Signed-off-by: Yosry Ahmed <[email protected]>
> > ---
> > Documentation/admin-guide/cgroup-v2.rst | 5 +++++
> > Documentation/filesystems/proc.rst | 4 ++++
> > drivers/base/node.c | 2 ++
> > fs/proc/meminfo.c | 2 ++
> > include/linux/mmzone.h | 1 +
> > mm/memcontrol.c | 1 +
> > mm/page_alloc.c | 6 +++++-
> > mm/vmstat.c | 1 +
> > 8 files changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 69d7a6983f781..307a284b99189 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1312,6 +1312,11 @@ PAGE_SIZE multiple when read back.
> > pagetables
> > Amount of memory allocated for page tables.
> >
> > + sec_pagetables
> > + Amount of memory allocated for secondary page tables,
> > + this currently includes KVM mmu allocations on x86
> > + and arm64.
> > +
> > percpu (npn)
> > Amount of memory used for storing per-cpu kernel
> > data structures.
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d99..894d6317f3bdc 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -973,6 +973,7 @@ You may not have all of these fields.
> > SReclaimable: 159856 kB
> > SUnreclaim: 124508 kB
> > PageTables: 24448 kB
> > + SecPageTables: 0 kB
> > NFS_Unstable: 0 kB
> > Bounce: 0 kB
> > WritebackTmp: 0 kB
> > @@ -1067,6 +1068,9 @@ SUnreclaim
> > PageTables
> > amount of memory dedicated to the lowest level of page
> > tables.
> > +SecPageTables
> > + amount of memory dedicated to secondary page tables, this
> > + currently includes KVM mmu allocations on x86 and arm64.
>
> Just a note: on the latest 5.19.0-rc2+, this patch has a conflict in the
> Documentation/filesystems/proc.rst file. But that's not a problem.

Thanks for pointing this out. Let me know if a rebase and resend is necessary.

<snip>

2022-06-27 16:09:16

by Sean Christopherson

Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

On Mon, Jun 06, 2022, Yosry Ahmed wrote:
> Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
> KVM mmu. This provides more insights on the kernel memory used
> by a workload.

Please provide more justification for NR_SECONDARY_PAGETABLE in the changelog.
Specifically, answer the questions that were asked in the previous version:

1. Why not piggyback NR_PAGETABLE?
2. Why a "generic" NR_SECONDARY_PAGETABLE instead of NR_VIRT_PAGETABLE?

It doesn't have to be super long, but provide enough info so that reviewers and
future readers don't need to go spelunking to understand the motivation for the
new counter type.

And it's probably worth an explicit Link to Marc's question that prompted the long
discussion in the previous version, that way if someone does want the gory details
they have a link readily available.

Link: https://lore.kernel.org/all/[email protected]

2022-06-27 16:52:02

by Sean Christopherson

Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

On Mon, Jun 06, 2022, Yosry Ahmed wrote:
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d99..894d6317f3bdc 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -973,6 +973,7 @@ You may not have all of these fields.
> SReclaimable: 159856 kB
> SUnreclaim: 124508 kB
> PageTables: 24448 kB
> + SecPageTables: 0 kB

If/when you rebase, this should probably use all spaces and no tabs to match the
other fields. Given that it's documentation, I'm guessing the use of spaces is
deliberate.

2022-06-27 17:06:08

by Yosry Ahmed

Subject: Re: [PATCH v5 1/4] mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.

On Mon, Jun 27, 2022 at 9:07 AM Sean Christopherson <[email protected]> wrote:
>
> On Mon, Jun 06, 2022, Yosry Ahmed wrote:
> > Add NR_SECONDARY_PAGETABLE stat to count secondary page table uses, e.g.
> > KVM mmu. This provides more insights on the kernel memory used
> > by a workload.
>
> Please provide more justification for NR_SECONDARY_PAGETABLE in the changelog.
> Specifically, answer the questions that were asked in the previous version:
>
> 1. Why not piggyback NR_PAGETABLE?
> 2. Why a "generic" NR_SECONDARY_PAGETABLE instead of NR_VIRT_PAGETABLE?
>
> It doesn't have to be super long, but provide enough info so that reviewers and
> future readers don't need to go spelunking to understand the motivation for the
> new counter type.

I added such justification in the cover letter. Would it be better to
include it here as well, or do you think the description in the cover
letter is lacking?

>
> And it's probably worth an explicit Link to Marc's question that prompted the long
> discussion in the previous version, that way if someone does want the gory details
> they have a link readily available.
>
> Link: https://lore.kernel.org/all/[email protected]

I will include the link in the next version.
Thanks!