2023-06-15 14:54:11

by Liam Ni

Subject: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

In the previous implementation, the number of RAM pages is calculated
by counting the number of holes.
However, the number of RAM pages is already computed while the holes are counted.
Therefore, we can calculate the number of RAM pages directly.

Signed-off-by: Liam Ni <[email protected]>
---
arch/x86/mm/numa.c | 4 ++--
include/linux/mm.h | 4 ++++
mm/mm_init.c | 33 +++++++++++++++++++++++++++++++++
3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 2aadb2019b4f..8ea0e956e3d7 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -461,12 +461,12 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
u64 s = mi->blk[i].start >> PAGE_SHIFT;
u64 e = mi->blk[i].end >> PAGE_SHIFT;
numaram += e - s;
- numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+ numaram += __available_pages_in_range(mi->blk[i].nid, s, e);
if ((s64)numaram < 0)
numaram = 0;
}

- e820ram = max_pfn - absent_pages_in_range(0, max_pfn);
+ e820ram = available_pages_in_range(0, max_pfn);

/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
if ((s64)(e820ram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0daef3f2f029..f7f8c3476d1e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3041,8 +3041,12 @@ void free_area_init(unsigned long *max_zone_pfn);
unsigned long node_map_pfn_alignment(void);
unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
unsigned long end_pfn);
+unsigned long __available_pages_in_range(int nid, unsigned long start_pfn,
+ unsigned long end_pfn);
extern unsigned long absent_pages_in_range(unsigned long start_pfn,
unsigned long end_pfn);
+extern unsigned long available_pages_in_range(unsigned long start_pfn,
+ unsigned long end_pfn);
extern void get_pfn_range_for_nid(unsigned int nid,
unsigned long *start_pfn, unsigned long *end_pfn);

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 3ddd18a89b66..33451afbc040 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1165,6 +1165,39 @@ unsigned long __init absent_pages_in_range(unsigned long start_pfn,
return __absent_pages_in_range(MAX_NUMNODES, start_pfn, end_pfn);
}

+/*
+ * Return the number of available pages in a range on a node. If nid is MAX_NUMNODES,
+ * then all available pages in the requested range will be accounted for.
+ */
+unsigned long __init __available_pages_in_range(int nid,
+ unsigned long range_start_pfn,
+ unsigned long range_end_pfn)
+{
+ unsigned long nr_avalible = 0;
+ unsigned long start_pfn, end_pfn;
+ int i;
+
+ for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
+ start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
+ end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
+ nr_avalible += end_pfn - start_pfn;
+ }
+ return nr_avalible;
+}
+
+/**
+ * available_pages_in_range - Return the number of available page frames within a range
+ * @start_pfn: The start PFN of the range to count available pages in
+ * @end_pfn: The end PFN of the range to count available pages in
+ *
+ * Return: the number of available page frames within the range.
+ */
+unsigned long __init available_pages_in_range(unsigned long start_pfn,
+ unsigned long end_pfn)
+{
+ return __available_pages_in_range(MAX_NUMNODES, start_pfn, end_pfn);
+}
+
/* Return the number of page frames in holes in a zone on a node */
static unsigned long __init zone_absent_pages_in_node(int nid,
unsigned long zone_type,
--
2.25.1



2023-06-15 17:28:58

by Mike Rapoport

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

Hi,

On Fri, Jun 16, 2023 at 12:20:16AM +1000, Liam Ni wrote:
> In a previous implementation,The ram page is calculated
> by counting the number of holes,
> however,the number of ram pages is calculated during hole calculation.
> Therefore,we can directly calculate the amount of ram pages.
>
> Signed-off-by: Liam Ni <[email protected]>
> ---
> arch/x86/mm/numa.c | 4 ++--
> include/linux/mm.h | 4 ++++
> mm/mm_init.c | 33 +++++++++++++++++++++++++++++++++
> 3 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 2aadb2019b4f..8ea0e956e3d7 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -461,12 +461,12 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> u64 s = mi->blk[i].start >> PAGE_SHIFT;
> u64 e = mi->blk[i].end >> PAGE_SHIFT;
> numaram += e - s;
> - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> + numaram += __available_pages_in_range(mi->blk[i].nid, s, e);

This is wrong. You add the number of pages in the range that core MM sees
to the number of pages covered by the numa_meminfo.
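To illustrate the double counting, here is a minimal userspace sketch, not kernel code. The struct pfn_range type and every helper name below are hypothetical stand-ins for memblock.memory and the kernel helpers, and the model ignores node ids. It only contrasts "span minus holes" with "span plus available pages" for a single block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one memblock.memory region, in page frame numbers. */
struct pfn_range {
	uint64_t start;	/* first PFN of the region */
	uint64_t end;	/* one past the last PFN */
};

static uint64_t clamp_pfn(uint64_t v, uint64_t lo, uint64_t hi)
{
	if (v < lo)
		return lo;
	return v > hi ? hi : v;
}

/* Pages of [s, e) that are backed by one of the (non-overlapping) regions. */
static uint64_t available_pages(const struct pfn_range *mem, int n,
				uint64_t s, uint64_t e)
{
	uint64_t total = 0;	/* must start at zero before accumulating */

	assert(s <= e);
	for (int i = 0; i < n; i++)
		total += clamp_pfn(mem[i].end, s, e) - clamp_pfn(mem[i].start, s, e);
	return total;
}

/* Pages of [s, e) that fall into holes between the regions. */
static uint64_t absent_pages(const struct pfn_range *mem, int n,
			     uint64_t s, uint64_t e)
{
	return (e - s) - available_pages(mem, n, s, e);
}

/* Accounting before the patch: block span minus the holes inside it. */
static uint64_t numaram_old(const struct pfn_range *mem, int n,
			    uint64_t s, uint64_t e)
{
	uint64_t numaram = 0;

	numaram += e - s;
	numaram -= absent_pages(mem, n, s, e);
	return numaram;
}

/* Accounting as posted: block span plus the available pages inside it. */
static uint64_t numaram_patched(const struct pfn_range *mem, int n,
				uint64_t s, uint64_t e)
{
	uint64_t numaram = 0;

	numaram += e - s;
	numaram += available_pages(mem, n, s, e);
	return numaram;
}
```

With regions [0, 100) and [150, 200) and a block spanning [0, 200), the old accounting yields 150 pages while the posted one yields 350. In this model, dropping the `numaram += e - s` line from the patched variant would give 150 again.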

More generally, rather than traverse all the numa_meminfo's and for each of
them traverse all the regions in memblock.memory it's enough to count
memory in memblock that doesn't have the node assigned and compare it to
memblock_phys_mem_size().

> if ((s64)numaram < 0)
> numaram = 0;
> }
>
> - e820ram = max_pfn - absent_pages_in_range(0, max_pfn);
> + e820ram = available_pages_in_range(0, max_pfn);
>
> /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> if ((s64)(e820ram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0daef3f2f029..f7f8c3476d1e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3041,8 +3041,12 @@ void free_area_init(unsigned long *max_zone_pfn);
> unsigned long node_map_pfn_alignment(void);
> unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
> unsigned long end_pfn);
> +unsigned long __available_pages_in_range(int nid, unsigned long start_pfn,
> + unsigned long end_pfn);
> extern unsigned long absent_pages_in_range(unsigned long start_pfn,
> unsigned long end_pfn);
> +extern unsigned long available_pages_in_range(unsigned long start_pfn,
> + unsigned long end_pfn);
> extern void get_pfn_range_for_nid(unsigned int nid,
> unsigned long *start_pfn, unsigned long *end_pfn);
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 3ddd18a89b66..33451afbc040 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1165,6 +1165,39 @@ unsigned long __init absent_pages_in_range(unsigned long start_pfn,
> return __absent_pages_in_range(MAX_NUMNODES, start_pfn, end_pfn);
> }
>
> +/*
> + * Return the number of avalible pages in a range on a node. If nid is MAX_NUMNODES,
> + * then all avalible pages in the requested range will be accounted for.
> + */
> +unsigned long __init __available_pages_in_range(int nid,
> + unsigned long range_start_pfn,
> + unsigned long range_end_pfn)
> +{
> + unsigned long nr_avalible;
> + unsigned long start_pfn, end_pfn;
> + int i;
> +
> + for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
> + start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
> + end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> + nr_avalible += end_pfn - start_pfn;
> + }
> + return nr_avalible;
> +}
> +
> +/**
> + * available_pages_in_range - Return the number of available page frames within a range
> + * @start_pfn: The start PFN to start searching for holes
> + * @end_pfn: The end PFN to stop searching for holes
> + *
> + * Return: the number of available pages frames within a range.
> + */
> +unsigned long __init available_pages_in_range(unsigned long start_pfn,
> + unsigned long end_pfn)
> +{
> + return __available_pages_in_range(MAX_NUMNODES, start_pfn, end_pfn);
> +}
> +
> /* Return the number of page frames in holes in a zone on a node */
> static unsigned long __init zone_absent_pages_in_node(int nid,
> unsigned long zone_type,
> --
> 2.25.1
>

--
Sincerely yours,
Mike.

2023-06-17 11:11:01

by Liam Ni

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

Hi,

On Fri, 16 Jun 2023 at 01:00, Mike Rapoport <[email protected]> wrote:
>
> Hi,
>
> On Fri, Jun 16, 2023 at 12:20:16AM +1000, Liam Ni wrote:
> > In a previous implementation,The ram page is calculated
> > by counting the number of holes,
> > however,the number of ram pages is calculated during hole calculation.
> > Therefore,we can directly calculate the amount of ram pages.
> >
> > Signed-off-by: Liam Ni <[email protected]>
> > ---
> > arch/x86/mm/numa.c | 4 ++--
> > include/linux/mm.h | 4 ++++
> > mm/mm_init.c | 33 +++++++++++++++++++++++++++++++++
> > 3 files changed, 39 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 2aadb2019b4f..8ea0e956e3d7 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -461,12 +461,12 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > u64 s = mi->blk[i].start >> PAGE_SHIFT;
> > u64 e = mi->blk[i].end >> PAGE_SHIFT;
> > numaram += e - s;
> > - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> > + numaram += __available_pages_in_range(mi->blk[i].nid, s, e);
>
> This is wrong. You add number of pages in range core MM sees to the number
> of pages covered by the numa_meminfo.
>
> More generally, rather than traverse all the numa_meminfo's and for each of
> them traverse all the regions in memblock.memory it's enough to count
> memory in memblock that doesn't have the node assigned and compare it to
> memblock_phys_mem_size().

Logic like below?

static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
{
	u64 error_value;

	error_value = pages_without_nid_in_range(0, max_pfn);
	if (error_value >= (1 << (20 - PAGE_SHIFT))) {
		/* print error information */
		return false;
	}
	return true;
}

I can't figure out why we should compare it with memblock_phys_mem_size().
The pages in memblock that don't have a node assigned are exactly
the pages that are not covered by numa_meminfo.
So the count of these pages already represents the number of lost pages.

Thanks
Liam Ni

2023-06-18 07:50:50

by Mike Rapoport

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

On Sat, Jun 17, 2023 at 06:47:11PM +0800, Liam Ni wrote:
> On Fri, 16 Jun 2023 at 01:00, Mike Rapoport <[email protected]> wrote:
> >
> > On Fri, Jun 16, 2023 at 12:20:16AM +1000, Liam Ni wrote:
> > > In a previous implementation,The ram page is calculated
> > > by counting the number of holes,
> > > however,the number of ram pages is calculated during hole calculation.
> > > Therefore,we can directly calculate the amount of ram pages.
> > >
> > > Signed-off-by: Liam Ni <[email protected]>
> > > ---
> > > arch/x86/mm/numa.c | 4 ++--
> > > include/linux/mm.h | 4 ++++
> > > mm/mm_init.c | 33 +++++++++++++++++++++++++++++++++
> > > 3 files changed, 39 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > > index 2aadb2019b4f..8ea0e956e3d7 100644
> > > --- a/arch/x86/mm/numa.c
> > > +++ b/arch/x86/mm/numa.c
> > > @@ -461,12 +461,12 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > > u64 s = mi->blk[i].start >> PAGE_SHIFT;
> > > u64 e = mi->blk[i].end >> PAGE_SHIFT;
> > > numaram += e - s;
> > > - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> > > + numaram += __available_pages_in_range(mi->blk[i].nid, s, e);
> >
> > This is wrong. You add number of pages in range core MM sees to the number
> > of pages covered by the numa_meminfo.
> >
> > More generally, rather than traverse all the numa_meminfo's and for each of
> > them traverse all the regions in memblock.memory it's enough to count
> > memory in memblock that doesn't have the node assigned and compare it to
> > memblock_phys_mem_size().
>
> Logic like below?
> static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> {
> u64 error_value;
> error_value = pages_without_nid_in_range(0,max_pfn);
> if ((error_value ) >= (1 << (20 - PAGE_SHIFT))) {
> //print error information
> return false;
> }
> return true;
> }
>
> I can't figure out why compare it with memblock_phys_mem_size().
> The number of pages in memblock that doesn't have the node
> assigned,which also means that these pages are not in numa_info.
> So these pages can represent the number of lose pages.

Yes, there is no need to compare with memblock_phys_mem_size() if you count
pages that don't have nid set in memblock.
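As a rough illustration of that check, here is a userspace sketch under stated assumptions. The struct mem_region type, MODEL_NO_NODE, MODEL_PAGE_SHIFT and both function names are hypothetical; the real code would walk memblock.memory, and the marker used for "no node assigned" depends on the kernel version. Only the 1M slack threshold is taken from the existing numa_meminfo_cover_memory().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4K pages */
#define MODEL_NO_NODE (-1)	/* assumption: marker for "no node assigned" */

/* Hypothetical stand-in for one memblock.memory region. */
struct mem_region {
	uint64_t start_pfn;	/* first PFN of the region */
	uint64_t end_pfn;	/* one past the last PFN */
	int nid;		/* node id, or MODEL_NO_NODE */
};

/* Single pass over the regions: sum the pages that have no node assigned. */
static uint64_t pages_without_nid(const struct mem_region *regions, int count)
{
	uint64_t missing = 0;

	for (int i = 0; i < count; i++) {
		assert(regions[i].start_pfn <= regions[i].end_pfn);
		if (regions[i].nid == MODEL_NO_NODE)
			missing += regions[i].end_pfn - regions[i].start_pfn;
	}
	return missing;
}

/* NUMA info is accepted while fewer than 1M worth of pages lack a node. */
static bool numa_covers_memory(const struct mem_region *regions, int count)
{
	const uint64_t slack = 1ULL << (20 - MODEL_PAGE_SHIFT);

	return pages_without_nid(regions, count) < slack;
}
```

No total such as memblock_phys_mem_size() appears here, because the unassigned pages are counted directly rather than derived as a difference of two totals.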

> Thanks
> Liam Ni

--
Sincerely yours,
Mike.

2023-06-18 07:58:25

by Liam Ni

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

Hi,

On Sun, 18 Jun 2023 at 15:09, Mike Rapoport <[email protected]> wrote:
>
> On Sat, Jun 17, 2023 at 06:47:11PM +0800, Liam Ni wrote:
> > On Fri, 16 Jun 2023 at 01:00, Mike Rapoport <[email protected]> wrote:
> > >
> > > On Fri, Jun 16, 2023 at 12:20:16AM +1000, Liam Ni wrote:
> > > > In a previous implementation,The ram page is calculated
> > > > by counting the number of holes,
> > > > however,the number of ram pages is calculated during hole calculation.
> > > > Therefore,we can directly calculate the amount of ram pages.
> > > >
> > > > Signed-off-by: Liam Ni <[email protected]>
> > > > ---
> > > > arch/x86/mm/numa.c | 4 ++--
> > > > include/linux/mm.h | 4 ++++
> > > > mm/mm_init.c | 33 +++++++++++++++++++++++++++++++++
> > > > 3 files changed, 39 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > > > index 2aadb2019b4f..8ea0e956e3d7 100644
> > > > --- a/arch/x86/mm/numa.c
> > > > +++ b/arch/x86/mm/numa.c
> > > > @@ -461,12 +461,12 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > > > u64 s = mi->blk[i].start >> PAGE_SHIFT;
> > > > u64 e = mi->blk[i].end >> PAGE_SHIFT;
> > > > numaram += e - s;
> > > > - numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> > > > + numaram += __available_pages_in_range(mi->blk[i].nid, s, e);
> > >
> > > This is wrong. You add number of pages in range core MM sees to the number
> > > of pages covered by the numa_meminfo.
> > >
> > > More generally, rather than traverse all the numa_meminfo's and for each of
> > > them traverse all the regions in memblock.memory it's enough to count
> > > memory in memblock that doesn't have the node assigned and compare it to
> > > memblock_phys_mem_size().
> >
> > Logic like below?
> > static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > {
> > u64 error_value;
> > error_value = pages_without_nid_in_range(0,max_pfn);
> > if ((error_value ) >= (1 << (20 - PAGE_SHIFT))) {
> > //print error information
> > return false;
> > }
> > return true;
> > }
> >
> > I can't figure out why compare it with memblock_phys_mem_size().
> > The number of pages in memblock that doesn't have the node
> > assigned,which also means that these pages are not in numa_info.
> > So these pages can represent the number of lose pages.
>
> Yes, there is no need to compare with memblock_phys_mem_size() if you count
> pages that don't have nid set in memblock.

Got it, I will send V2.

Liam.
>
> --
> Sincerely yours,
> Mike.

2023-06-21 18:04:25

by Dave Hansen

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

On 6/15/23 07:20, Liam Ni wrote:
> In a previous implementation,The ram page is calculated
> by counting the number of holes,
> however,the number of ram pages is calculated during hole calculation.
> Therefore,we can directly calculate the amount of ram pages.

What problem does this patch solve?

2023-06-22 06:24:43

by Liam Ni

Subject: Re: [PATCH] x86,NUMA:Get the number of ram pages directly in numa_meminfo_cover_memory()

On Thu, 22 Jun 2023 at 01:49, Dave Hansen <[email protected]> wrote:
>
> On 6/15/23 07:20, Liam Ni wrote:
> > In a previous implementation,The ram page is calculated
> > by counting the number of holes,
> > however,the number of ram pages is calculated during hole calculation.
> > Therefore,we can directly calculate the amount of ram pages.
>
> What problem does this patch solve?

I have sent patch V2, which improves the execution efficiency of
numa_meminfo_cover_memory()
by optimizing the way missing pages are calculated.

patch V2:
https://lore.kernel.org/all/[email protected]/

In the previous implementation, we calculate missing pages as follows:
1. Calculate numaram by traversing all the numa_meminfo blocks and, for each of
them, traversing all the regions in memblock.memory to prepare for
counting missing pages.

2. Traverse all the regions in memblock.memory again to get e820ram.

3. The number of missing pages is (e820ram - numaram).

But it's enough to count the memory in memblock.memory that doesn't have
a node assigned,
so we only need to traverse memblock.memory **once**.
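The three steps and the single traversal can be compared in a small userspace model. This is only a sketch: struct mem_region, struct numa_blk, NO_NODE, ANY_NODE and all function names are hypothetical, and it assumes non-overlapping regions below max_pfn whose node ids were already copied from the numa_meminfo blocks.

```c
#include <assert.h>
#include <stdint.h>

#define NO_NODE  (-1)	/* assumption: region has no node assigned */
#define ANY_NODE (-2)	/* assumption: wildcard that matches every region */

struct mem_region { uint64_t start_pfn, end_pfn; int nid; };	/* models memblock.memory */
struct numa_blk { uint64_t start_pfn, end_pfn; int nid; };	/* models a numa_meminfo block */

static uint64_t clamp_pfn(uint64_t v, uint64_t lo, uint64_t hi)
{
	if (v < lo)
		return lo;
	return v > hi ? hi : v;
}

/* Pages of [s, e) backed by regions on node nid (ANY_NODE matches all). */
static uint64_t present_pages(const struct mem_region *r, int n, int nid,
			      uint64_t s, uint64_t e)
{
	uint64_t present = 0;

	assert(s <= e);
	for (int i = 0; i < n; i++) {
		if (nid != ANY_NODE && r[i].nid != nid)
			continue;
		present += clamp_pfn(r[i].end_pfn, s, e) - clamp_pfn(r[i].start_pfn, s, e);
	}
	return present;
}

/* Steps 1 to 3: nested traversal for numaram, another traversal for e820ram. */
static uint64_t missing_old(const struct mem_region *r, int n,
			    const struct numa_blk *blk, int nr_blks,
			    uint64_t max_pfn)
{
	uint64_t numaram = 0, e820ram;

	/* Step 1: span minus holes equals the present pages of each block. */
	for (int i = 0; i < nr_blks; i++)
		numaram += present_pages(r, n, blk[i].nid,
					 blk[i].start_pfn, blk[i].end_pfn);
	/* Step 2: all pages below max_pfn, regardless of node. */
	e820ram = present_pages(r, n, ANY_NODE, 0, max_pfn);
	/* Step 3: the difference is the number of missing pages. */
	return e820ram > numaram ? e820ram - numaram : 0;
}

/* Single traversal: sum the regions that have no node assigned. */
static uint64_t missing_once(const struct mem_region *r, int n)
{
	uint64_t missing = 0;

	for (int i = 0; i < n; i++)
		if (r[i].nid == NO_NODE)
			missing += r[i].end_pfn - r[i].start_pfn;
	return missing;
}
```

In this model both functions report the same number of missing pages, but missing_old() visits every region once per block plus once more, while missing_once() visits each region exactly once.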