2023-06-14 11:28:31

by Yajun Deng

[permalink] [raw]
Subject: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

When the system boots, only one cpu is enabled before smp_init().
So the spinlock is not needed in most cases, remove it.

Add spinlock in get_nid_for_pfn() because it is after smp_init().

Signed-off-by: Yajun Deng <[email protected]>
---
drivers/base/node.c | 11 +++++++++--
mm/mm_init.c | 18 +++---------------
2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 9de524e56307..844102570ff2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
static int __ref get_nid_for_pfn(unsigned long pfn)
{
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
- if (system_state < SYSTEM_RUNNING)
- return early_pfn_to_nid(pfn);
+ static DEFINE_SPINLOCK(early_pfn_lock);
+ int nid;
+
+ if (system_state < SYSTEM_RUNNING) {
+ spin_lock(&early_pfn_lock);
+ nid = early_pfn_to_nid(pfn);
+ spin_unlock(&early_pfn_lock);
+ return nid;
+ }
#endif
return pfn_to_nid(pfn);
}
diff --git a/mm/mm_init.c b/mm/mm_init.c
index d393631599a7..5895a30435ee 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -584,11 +584,11 @@ struct mminit_pfnnid_cache {
static struct mminit_pfnnid_cache early_pfnnid_cache __meminitdata;

/*
- * Required by SPARSEMEM. Given a PFN, return what node the PFN is on.
+ * Given a PFN, return what node the PFN is on.
*/
-static int __meminit __early_pfn_to_nid(unsigned long pfn,
- struct mminit_pfnnid_cache *state)
+int __meminit early_pfn_to_nid(unsigned long pfn)
{
+ struct mminit_pfnnid_cache *state = &early_pfnnid_cache;
unsigned long start_pfn, end_pfn;
int nid;

@@ -601,20 +601,8 @@ static int __meminit __early_pfn_to_nid(unsigned long pfn,
state->last_end = end_pfn;
state->last_nid = nid;
}
-
- return nid;
-}
-
-int __meminit early_pfn_to_nid(unsigned long pfn)
-{
- static DEFINE_SPINLOCK(early_pfn_lock);
- int nid;
-
- spin_lock(&early_pfn_lock);
- nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
if (nid < 0)
nid = first_online_node;
- spin_unlock(&early_pfn_lock);

return nid;
}
--
2.25.1



2023-06-14 11:30:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
> When the system boots, only one cpu is enabled before smp_init().
> So the spinlock is not needed in most cases, remove it.
>
> Add spinlock in get_nid_for_pfn() because it is after smp_init().

So this is two different things at once in the same patch?

Or are they the same problem and both need to go in to solve it?

And if a spinlock is not needed at early boot, is it really causing any
problems?

>
> Signed-off-by: Yajun Deng <[email protected]>
> ---
> drivers/base/node.c | 11 +++++++++--
> mm/mm_init.c | 18 +++---------------
> 2 files changed, 12 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 9de524e56307..844102570ff2 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
> static int __ref get_nid_for_pfn(unsigned long pfn)
> {
> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
> - if (system_state < SYSTEM_RUNNING)
> - return early_pfn_to_nid(pfn);
> + static DEFINE_SPINLOCK(early_pfn_lock);
> + int nid;
> +
> + if (system_state < SYSTEM_RUNNING) {
> + spin_lock(&early_pfn_lock);
> + nid = early_pfn_to_nid(pfn);
> + spin_unlock(&early_pfn_lock);

Adding an external lock for when you call a function is VERY dangerous
as you did not document this anywhere, and there's no way to enforce it
properly at all.

Does your change actually result in any boot time changes? How was this
tested?

thanks,

greg k-h

2023-06-14 11:37:55

by Yajun Deng

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

June 14, 2023 7:09 PM, "Greg KH" <[email protected]> wrote:

> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
>
>> When the system boots, only one cpu is enabled before smp_init().
>> So the spinlock is not needed in most cases, remove it.
>>
>> Add spinlock in get_nid_for_pfn() because it is after smp_init().
>
> So this is two different things at once in the same patch?
>
> Or are they the same problem and both need to go in to solve it?
>
> And if a spinlock is not needed at early boot, is it really causing any
> problems?
>

They are the same problem.
I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
case need to add spinlock.
This patch tested on my x86 system.


>> Signed-off-by: Yajun Deng <[email protected]>
>> ---
>> drivers/base/node.c | 11 +++++++++--
>> mm/mm_init.c | 18 +++---------------
>> 2 files changed, 12 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 9de524e56307..844102570ff2 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
>> static int __ref get_nid_for_pfn(unsigned long pfn)
>> {
>> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> - if (system_state < SYSTEM_RUNNING)
>> - return early_pfn_to_nid(pfn);
>> + static DEFINE_SPINLOCK(early_pfn_lock);
>> + int nid;
>> +
>> + if (system_state < SYSTEM_RUNNING) {
>> + spin_lock(&early_pfn_lock);
>> + nid = early_pfn_to_nid(pfn);
>> + spin_unlock(&early_pfn_lock);
>
> Adding an external lock for when you call a function is VERY dangerous
> as you did not document this anywhere, and there's no way to enforce it
> properly at all.
>

I should add a comment before early_pfn_to_nid().

> Does your change actually result in any boot time changes? How was this
> tested?
>

Just a bit.

> thanks,
>
> greg k-h

2023-06-14 12:00:08

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

Hi,

On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
> June 14, 2023 7:09 PM, "Greg KH" <[email protected]> wrote:
>
> > On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
> >
> >> When the system boots, only one cpu is enabled before smp_init().
> >> So the spinlock is not needed in most cases, remove it.
> >>
> >> Add spinlock in get_nid_for_pfn() because it is after smp_init().
> >
> > So this is two different things at once in the same patch?
> >
> > Or are they the same problem and both need to go in to solve it?
> >
> > And if a spinlock is not needed at early boot, is it really causing any
> > problems?
> >
>
> They are the same problem.
> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
> case need to add spinlock.
> This patch tested on my x86 system.

Are you sure it'll work on !x86?

> >> Signed-off-by: Yajun Deng <[email protected]>
> >> ---
> >> drivers/base/node.c | 11 +++++++++--
> >> mm/mm_init.c | 18 +++---------------
> >> 2 files changed, 12 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/base/node.c b/drivers/base/node.c
> >> index 9de524e56307..844102570ff2 100644
> >> --- a/drivers/base/node.c
> >> +++ b/drivers/base/node.c
> >> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
> >> static int __ref get_nid_for_pfn(unsigned long pfn)
> >> {
> >> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
> >> - if (system_state < SYSTEM_RUNNING)
> >> - return early_pfn_to_nid(pfn);
> >> + static DEFINE_SPINLOCK(early_pfn_lock);
> >> + int nid;
> >> +
> >> + if (system_state < SYSTEM_RUNNING) {
> >> + spin_lock(&early_pfn_lock);
> >> + nid = early_pfn_to_nid(pfn);
> >> + spin_unlock(&early_pfn_lock);
> >
> > Adding an external lock for when you call a function is VERY dangerous
> > as you did not document this anywhere, and there's no way to enforce it
> > properly at all.
> >
>
> I should add a comment before early_pfn_to_nid().
>
> > Does your change actually result in any boot time changes? How was this
> > tested?
> >
>
> Just a bit.

Just a bit tested? Or just a bit of boot time changes?
For the latter, do you have numbers?

--
Sincerely yours,
Mike.

2023-06-15 03:27:20

by Yajun Deng

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

June 14, 2023 7:53 PM, "Mike Rapoport" <[email protected]> wrote:

> Hi,
>
> On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
>
>> June 14, 2023 7:09 PM, "Greg KH" <[email protected]> wrote:
>>
>> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
>>
>> When the system boots, only one cpu is enabled before smp_init().
>> So the spinlock is not needed in most cases, remove it.
>>
>> Add spinlock in get_nid_for_pfn() because it is after smp_init().
>>
>> So this is two different things at once in the same patch?
>>
>> Or are they the same problem and both need to go in to solve it?
>>
>> And if a spinlock is not needed at early boot, is it really causing any
>> problems?
>>
>> They are the same problem.
>> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
>> case need to add spinlock.
>> This patch tested on my x86 system.
>
> Are you sure it'll work on !x86?
>

I'm probably sure of that, although I don't have a !x86 machine.

early_pfn_to_nid() is called in smp_init() and kasan_init() on
different architectures. If it works well on x86, it'll work on
!x86.


>> Signed-off-by: Yajun Deng <[email protected]>
>> ---
>> drivers/base/node.c | 11 +++++++++--
>> mm/mm_init.c | 18 +++---------------
>> 2 files changed, 12 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 9de524e56307..844102570ff2 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
>> static int __ref get_nid_for_pfn(unsigned long pfn)
>> {
>> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> - if (system_state < SYSTEM_RUNNING)
>> - return early_pfn_to_nid(pfn);
>> + static DEFINE_SPINLOCK(early_pfn_lock);
>> + int nid;
>> +
>> + if (system_state < SYSTEM_RUNNING) {
>> + spin_lock(&early_pfn_lock);
>> + nid = early_pfn_to_nid(pfn);
>> + spin_unlock(&early_pfn_lock);
>>
>> Adding an external lock for when you call a function is VERY dangerous
>> as you did not document this anywhere, and there's no way to enforce it
>> properly at all.
>>
>> I should add a comment before early_pfn_to_nid().
>>
>> Does your change actually result in any boot time changes? How was this
>> tested?
>>
>> Just a bit.
>
> Just a bit tested? Or just a bit of boot time changes?
> For the latter, do you have numbers?
>

For the latter, the most beneficial function is memmap_init_reserved_pages(),
the boot time changes depending on whether DEFERRED_STRUCT_PAGE_INIT
is defined or not.

-->memmap_init_reserved_pages()
-->for_each_reserved_mem_range()
reserve_bootmem_region()
-->for()
init_reserved_page()
--> early_pfn_to_nid()


If define CONFIG_DEFERRED_STRUCT_PAGE_INIT:

before:
memmap_init_reserved_pages() 1.87 seconds
after:
memmap_init_reserved_pages() 1.27 seconds

32% time reduction.


If not define CONFIG_DEFERRED_STRUCT_PAGE_INIT:

early_pfn_to_nid() is called by few,
boot time didn't change.


By the way, this machine has 190GB RAM.

> --
> Sincerely yours,
> Mike.

2023-06-15 07:07:28

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

On Thu, Jun 15, 2023 at 03:02:58AM +0000, Yajun Deng wrote:
> June 14, 2023 7:53 PM, "Mike Rapoport" <[email protected]> wrote:
>
> > Hi,
> >
> > On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
> >
> >> June 14, 2023 7:09 PM, "Greg KH" <[email protected]> wrote:
> >>
> >> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
> >>
> >> When the system boots, only one cpu is enabled before smp_init().
> >> So the spinlock is not needed in most cases, remove it.
> >>
> >> Add spinlock in get_nid_for_pfn() because it is after smp_init().
> >>
> >> So this is two different things at once in the same patch?
> >>
> >> Or are they the same problem and both need to go in to solve it?
> >>
> >> And if a spinlock is not needed at early boot, is it really causing any
> >> problems?
> >>
> >> They are the same problem.
> >> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
> >> case need to add spinlock.
> >> This patch tested on my x86 system.
> >
> > Are you sure it'll work on !x86?
> >
>
> I'm probably sure of that, although I don't have a !x86 machine.
>
> early_pfn_to_nid() is called in smp_init() and kasan_init() on
> different architectures. If it works well on x86, it'll work on
> !x86.

This is often not true. Please verify that other architectures do not call
early_pfn_to_nid() after smp_init(). The explanation why it is safe should
be a part of the changelog.

> >> Signed-off-by: Yajun Deng <[email protected]>
> >> ---
> >> drivers/base/node.c | 11 +++++++++--
> >> mm/mm_init.c | 18 +++---------------
> >> 2 files changed, 12 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/base/node.c b/drivers/base/node.c
> >> index 9de524e56307..844102570ff2 100644
> >> --- a/drivers/base/node.c
> >> +++ b/drivers/base/node.c
> >> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
> >> static int __ref get_nid_for_pfn(unsigned long pfn)
> >> {
> >> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
> >> - if (system_state < SYSTEM_RUNNING)
> >> - return early_pfn_to_nid(pfn);
> >> + static DEFINE_SPINLOCK(early_pfn_lock);
> >> + int nid;
> >> +
> >> + if (system_state < SYSTEM_RUNNING) {
> >> + spin_lock(&early_pfn_lock);
> >> + nid = early_pfn_to_nid(pfn);
> >> + spin_unlock(&early_pfn_lock);
> >>
> >> Adding an external lock for when you call a function is VERY dangerous
> >> as you did not document this anywhere, and there's no way to enforce it
> >> properly at all.
> >>
> >> I should add a comment before early_pfn_to_nid().
> >>
> >> Does your change actually result in any boot time changes? How was this
> >> tested?
> >>
> >> Just a bit.
> >
> > Just a bit tested? Or just a bit of boot time changes?
> > For the latter, do you have numbers?
> >
>
> For the latter, the most beneficial function is memmap_init_reserved_pages(),
> the boot time changes depending on whether DEFERRED_STRUCT_PAGE_INIT
> is defined or not.
>
> -->memmap_init_reserved_pages()
> -->for_each_reserved_mem_range()
> reserve_bootmem_region()
> -->for()
> init_reserved_page()
> --> early_pfn_to_nid()

A better solution would be to pass nid to reserve_bootmem_range() and drop
the call to early_pfn_to_nid() in init_reserved_page().

Then there won't be lock contention and no need for fragile changes in the
locking.

> If define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>
> before:
> memmap_init_reserved_pages() 1.87 seconds
> after:
> memmap_init_reserved_pages() 1.27 seconds
>
> 32% time reduction.

These measurements should be part of the changelog.

> If not define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>
> early_pfn_to_nid() is called by few,
> boot time didn't change.
>
>
> By the way, this machine has 190GB RAM.
>
> > --
> > Sincerely yours,
> > Mike.

--
Sincerely yours,
Mike.

2023-06-15 07:12:29

by Yajun Deng

[permalink] [raw]
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()

June 15, 2023 2:20 PM, "Mike Rapoport" <[email protected]> wrote:

> On Thu, Jun 15, 2023 at 03:02:58AM +0000, Yajun Deng wrote:
>
>> June 14, 2023 7:53 PM, "Mike Rapoport" <[email protected]> wrote:
>>
>> Hi,
>>
>> On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
>>
>> June 14, 2023 7:09 PM, "Greg KH" <[email protected]> wrote:
>>
>> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
>>
>> When the system boots, only one cpu is enabled before smp_init().
>> So the spinlock is not needed in most cases, remove it.
>>
>> Add spinlock in get_nid_for_pfn() because it is after smp_init().
>>
>> So this is two different things at once in the same patch?
>>
>> Or are they the same problem and both need to go in to solve it?
>>
>> And if a spinlock is not needed at early boot, is it really causing any
>> problems?
>>
>> They are the same problem.
>> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
>> case need to add spinlock.
>> This patch tested on my x86 system.
>>
>> Are you sure it'll work on !x86?
>>
>> I'm probably sure of that, although I don't have a !x86 machine.
>>
>> early_pfn_to_nid() is called in smp_init() and kasan_init() on
>> different architectures. If it works well on x86, it'll work on
>> !x86.
>
> This is often not true. Please verify that other architectures do not call
> early_pfn_to_nid() after smp_init(). The explanation why it is safe should
> be a part of the changelog.
>
>> Signed-off-by: Yajun Deng <[email protected]>
>> ---
>> drivers/base/node.c | 11 +++++++++--
>> mm/mm_init.c | 18 +++---------------
>> 2 files changed, 12 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 9de524e56307..844102570ff2 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
>> static int __ref get_nid_for_pfn(unsigned long pfn)
>> {
>> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> - if (system_state < SYSTEM_RUNNING)
>> - return early_pfn_to_nid(pfn);
>> + static DEFINE_SPINLOCK(early_pfn_lock);
>> + int nid;
>> +
>> + if (system_state < SYSTEM_RUNNING) {
>> + spin_lock(&early_pfn_lock);
>> + nid = early_pfn_to_nid(pfn);
>> + spin_unlock(&early_pfn_lock);
>>
>> Adding an external lock for when you call a function is VERY dangerous
>> as you did not document this anywhere, and there's no way to enforce it
>> properly at all.
>>
>> I should add a comment before early_pfn_to_nid().
>>
>> Does your change actually result in any boot time changes? How was this
>> tested?
>>
>> Just a bit.
>>
>> Just a bit tested? Or just a bit of boot time changes?
>> For the latter, do you have numbers?
>>
>> For the latter, the most beneficial function is memmap_init_reserved_pages(),
>> the boot time changes depending on whether DEFERRED_STRUCT_PAGE_INIT
>> is defined or not.
>>
>> -->memmap_init_reserved_pages()
>> -->for_each_reserved_mem_range()
>> reserve_bootmem_region()
>> -->for()
>> init_reserved_page()
>> --> early_pfn_to_nid()
>
> A better solution would be to pass nid to reserve_bootmem_range() and drop
> the call to early_pfn_to_nid() in init_reserved_page().
>
> Then there won't be lock contention and no need for fragile changes in the
> locking.
>

Great, I will try it.


>> If define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>>
>> before:
>> memmap_init_reserved_pages() 1.87 seconds
>> after:
>> memmap_init_reserved_pages() 1.27 seconds
>>
>> 32% time reduction.
>
> These measurements should be part of the changelog.
>
>> If not define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>>
>> early_pfn_to_nid() is called by few,
>> boot time didn't change.
>>
>> By the way, this machine has 190GB RAM.
>>
>> --
>> Sincerely yours,
>> Mike.
>
> --
> Sincerely yours,
> Mike.