2023-11-16 17:18:55

by Catalin Marinas

Subject: Re: [PATCH] arm64: irq: set the correct node for VMAP stack

On Tue, Nov 14, 2023 at 05:16:43PM +0800, Huang Shijie wrote:
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index 6ad5c6ef5329..e62d3cb3f74c 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
> @@ -57,7 +57,7 @@ static void init_irq_stacks(void)
> unsigned long *p;
>
> for_each_possible_cpu(cpu) {
> - p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
> + p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
> per_cpu(irq_stack_ptr, cpu) = p;
> }
> }

This looks alright to me, I don't have a better suggestion. The generic
code already has the cpu_to_node_map[] array populated by
early_map_cpu_to_node(), so let's reuse it.
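
To illustrate why the early map can be reused here, a minimal userspace sketch in C follows. It only models the ordering problem described in this thread. The array size, the percpu_numa_node[] stand-in for the per-CPU "numa_node" variable, and the model_* helpers are assumptions made for illustration; this is not the kernel's implementation.

```c
#include <assert.h>

#define NR_CPUS		4	/* illustrative value, not the kernel's */
#define NUMA_NO_NODE	(-1)

/* Stand-in for the per-CPU "numa_node" variable: stays 0 until late init. */
static int percpu_numa_node[NR_CPUS];

/* Stand-in for cpu_to_node_map[] in drivers/base/arch_numa.c. */
static int cpu_to_node_map[NR_CPUS] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

/* Simplified: records the node for a CPU while firmware tables are parsed. */
static void early_map_cpu_to_node(unsigned int cpu, int nid)
{
	assert(cpu < NR_CPUS);
	cpu_to_node_map[cpu] = nid;
}

/* Reads the early map, so it is valid before the per-CPU copy exists. */
static int early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_to_node_map[cpu];
}

/* Reads the per-CPU copy, which is only meaningful after late init. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return percpu_numa_node[cpu];
}

/* Hypothetical helper: models early NUMA setup on a two-node machine. */
static void model_early_numa_init(void)
{
	early_map_cpu_to_node(0, 0);
	early_map_cpu_to_node(1, 0);
	early_map_cpu_to_node(2, 1);
	early_map_cpu_to_node(3, 1);
}

/* Hypothetical helper: models the late copy done under smp_prepare_cpus(). */
static void model_smp_prepare_cpus(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_numa_node[cpu] = cpu_to_node_map[cpu];
}
```

In this model, a caller that runs between model_early_numa_init() and model_smp_prepare_cpus() (as init_irq_stacks() does relative to the real functions) gets node 0 from cpu_to_node() for every CPU, while early_cpu_to_node() already returns the node recorded in the map.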

> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index eaa31e567d1e..90519d981471 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
> unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
> EXPORT_SYMBOL(__per_cpu_offset);
>
> -static int __init early_cpu_to_node(int cpu)
> +int early_cpu_to_node(int cpu)
> {
> return cpu_to_node_map[cpu];
> }
> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
> index 1a3ad6d29833..fc8a9bd6a444 100644
> --- a/include/asm-generic/numa.h
> +++ b/include/asm-generic/numa.h
> @@ -38,6 +38,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid);
> void numa_store_cpu_info(unsigned int cpu);
> void numa_add_cpu(unsigned int cpu);
> void numa_remove_cpu(unsigned int cpu);
> +int early_cpu_to_node(int cpu);

Here I'd move this just below early_map_cpu_to_node() and, for
completeness, also add the dummy static inline for the !NUMA case.
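
As a rough sketch of that header pattern (assuming a userspace build where CONFIG_NUMA is simply a macro passed with -DCONFIG_NUMA, and with a made-up map and a hypothetical irq_stack_node() caller for illustration), the declaration and the !NUMA stub could sit side by side like this:

```c
#include <assert.h>

#define NR_CPUS 4	/* illustrative value, not the kernel's */

#ifdef CONFIG_NUMA
/* NUMA build: a real lookup backed by a map (contents are made up here). */
static int cpu_to_node_map[NR_CPUS] = { 0, 0, 1, 1 };

int early_cpu_to_node(int cpu);

int early_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_to_node_map[cpu];
}
#else
/* !NUMA build: a dummy stub, so callers compile without any #ifdef. */
static inline int early_cpu_to_node(int cpu)
{
	(void)cpu;
	return 0;
}
#endif

/* Hypothetical caller: builds unchanged in both configurations. */
static int irq_stack_node(int cpu)
{
	return early_cpu_to_node(cpu);
}
```

Built without -DCONFIG_NUMA, irq_stack_node() returns 0 for every CPU, which is the behaviour a single-node system needs; the caller never has to know which variant it was compiled against.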

--
Catalin


2023-11-17 02:51:33

by Shijie Huang

Subject: Re: [PATCH] arm64: irq: set the correct node for VMAP stack

Hi Catalin,

On 2023/11/17 1:18, Catalin Marinas wrote:
> On Tue, Nov 14, 2023 at 05:16:43PM +0800, Huang Shijie wrote:
>> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
>> index 6ad5c6ef5329..e62d3cb3f74c 100644
>> --- a/arch/arm64/kernel/irq.c
>> +++ b/arch/arm64/kernel/irq.c
>> @@ -57,7 +57,7 @@ static void init_irq_stacks(void)
>> unsigned long *p;
>>
>> for_each_possible_cpu(cpu) {
>> - p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
>> + p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
>> per_cpu(irq_stack_ptr, cpu) = p;
>> }
>> }
> This looks alright to me, I don't have a better suggestion. The generic
> code already has the cpu_to_node_map[] array populated by
> early_map_cpu_to_node(), so let's reuse it.
>
>> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
>> index eaa31e567d1e..90519d981471 100644
>> --- a/drivers/base/arch_numa.c
>> +++ b/drivers/base/arch_numa.c
>> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
>> unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
>> EXPORT_SYMBOL(__per_cpu_offset);
>>
>> -static int __init early_cpu_to_node(int cpu)
>> +int early_cpu_to_node(int cpu)
>> {
>> return cpu_to_node_map[cpu];
>> }
>> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
>> index 1a3ad6d29833..fc8a9bd6a444 100644
>> --- a/include/asm-generic/numa.h
>> +++ b/include/asm-generic/numa.h
>> @@ -38,6 +38,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid);
>> void numa_store_cpu_info(unsigned int cpu);
>> void numa_add_cpu(unsigned int cpu);
>> void numa_remove_cpu(unsigned int cpu);
>> +int early_cpu_to_node(int cpu);
> Here I'd move this just below early_map_cpu_to_node() and, for
> completeness, also add the dummy static inline for the !NUMA case.

Thanks a lot.  It seems there is no need for me to send the V2 for this.


Thanks

Huang Shijie


2023-11-18 15:52:34

by Huang Shijie

Subject: [PATCH v2] arm64: irq: set the correct node for VMAP stack

In the current code, init_irq_stacks() calls cpu_to_node().
cpu_to_node() depends on the percpu "numa_node", which is initialized in:
arch_call_rest_init() --> rest_init() -- kernel_init()
--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work: it always
returns 0. On a NUMA system, this makes a node 1 CPU access an IRQ stack
that is allocated on node 0.

This patch fixes it by exporting early_cpu_to_node() and using it
in init_irq_stacks().

Signed-off-by: Huang Shijie <[email protected]>
---
v1 --> v2:
fix the !NUMA compiling error.

---
arch/arm64/kernel/irq.c | 3 ++-
drivers/base/arch_numa.c | 2 +-
include/asm-generic/numa.h | 2 ++
3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..5226030979ae 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -25,6 +25,7 @@
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
+#include <asm/numa.h>

 /* Only access this in an NMI enter/exit */
 DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
@@ -57,7 +58,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;

 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..90519d981471 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);

-static int __init early_cpu_to_node(int cpu)
+int early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..16073111bffc 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }

 #endif /* CONFIG_NUMA */

--
2.40.1

2023-11-18 16:03:10

by Huang Shijie

Subject: [PATCH v3] arm64: irq: set the correct node for VMAP stack

In the current code, init_irq_stacks() calls cpu_to_node().
cpu_to_node() depends on the percpu "numa_node", which is initialized in:
arch_call_rest_init() --> rest_init() -- kernel_init()
--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work: it always
returns 0. On a NUMA system, this makes a node 1 CPU access an IRQ stack
that is allocated on node 0.

This patch fixes it by exporting early_cpu_to_node() and using it
in init_irq_stacks().

Signed-off-by: Huang Shijie <[email protected]>
---
v2 --> v3:
move the "numa.h" to the right position.
---
arch/arm64/kernel/irq.c | 3 ++-
drivers/base/arch_numa.c | 2 +-
include/asm-generic/numa.h | 2 ++
3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..d9ee14723478 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 #include <asm/daifflags.h>
 #include <asm/exception.h>
+#include <asm/numa.h>
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
@@ -57,7 +58,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;

 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..90519d981471 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);

-static int __init early_cpu_to_node(int cpu)
+int early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..16073111bffc 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }

 #endif /* CONFIG_NUMA */

--
2.40.1

2023-11-23 16:55:43

by Catalin Marinas

Subject: Re: [PATCH v3] arm64: irq: set the correct node for VMAP stack

On Sun, Nov 19, 2023 at 12:02:05AM +0800, Huang Shijie wrote:
> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index eaa31e567d1e..90519d981471 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
> unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
> EXPORT_SYMBOL(__per_cpu_offset);
>
> -static int __init early_cpu_to_node(int cpu)
> +int early_cpu_to_node(int cpu)
> {
> return cpu_to_node_map[cpu];
> }

I don't think we need this change, let's make the arm64
init_irq_stacks() an __init function instead.

> diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
> index 1a3ad6d29833..16073111bffc 100644
> --- a/include/asm-generic/numa.h
> +++ b/include/asm-generic/numa.h
> @@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
> void __init numa_set_distance(int from, int to, int distance);
> void __init numa_free_distance(void);
> void __init early_map_cpu_to_node(unsigned int cpu, int nid);
> +int early_cpu_to_node(int cpu);

And add __init here.

With these changes:

Reviewed-by: Catalin Marinas <[email protected]>

Happy to take this patch through the arm64 tree if I get an ack from
Greg or Rafael on the drivers/* change.

2023-11-24 03:16:32

by Huang Shijie

Subject: [PATCH v4] arm64: irq: set the correct node for VMAP stack

In the current code, init_irq_stacks() calls cpu_to_node().
cpu_to_node() depends on the percpu "numa_node", which is initialized in:
arch_call_rest_init() --> rest_init() -- kernel_init()
--> kernel_init_freeable() --> smp_prepare_cpus()

But init_irq_stacks() is called from init_IRQ(), which runs before
arch_call_rest_init().

So in init_irq_stacks(), cpu_to_node() does not work: it always
returns 0. On a NUMA system, this makes a node 1 CPU access an IRQ stack
that is allocated on node 0.

This patch fixes it by:
1.) exporting early_cpu_to_node() and using it in init_irq_stacks().
2.) changing init_irq_stacks() to an __init function.

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Huang Shijie <[email protected]>
---
v3 --> v4:
1.) keep early_cpu_to_node() as __init function.
2.) change init_irq_stacks() to __init function.

---
arch/arm64/kernel/irq.c | 5 +++--
drivers/base/arch_numa.c | 2 +-
include/asm-generic/numa.h | 2 ++
3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 6ad5c6ef5329..9f253d8efe90 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 #include <asm/daifflags.h>
 #include <asm/exception.h>
+#include <asm/numa.h>
 #include <asm/softirq_stack.h>
 #include <asm/stacktrace.h>
 #include <asm/vmap_stack.h>
@@ -51,13 +52,13 @@ static void init_irq_scs(void)
 }

 #ifdef CONFIG_VMAP_STACK
-static void init_irq_stacks(void)
+static void __init init_irq_stacks(void)
 {
 	int cpu;
 	unsigned long *p;

 	for_each_possible_cpu(cpu) {
-		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, early_cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index eaa31e567d1e..5b59d133b6af 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -144,7 +144,7 @@ void __init early_map_cpu_to_node(unsigned int cpu, int nid)
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);

-static int __init early_cpu_to_node(int cpu)
+int __init early_cpu_to_node(int cpu)
 {
 	return cpu_to_node_map[cpu];
 }
diff --git a/include/asm-generic/numa.h b/include/asm-generic/numa.h
index 1a3ad6d29833..c32e0cf23c90 100644
--- a/include/asm-generic/numa.h
+++ b/include/asm-generic/numa.h
@@ -35,6 +35,7 @@ int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 void __init numa_set_distance(int from, int to, int distance);
 void __init numa_free_distance(void);
 void __init early_map_cpu_to_node(unsigned int cpu, int nid);
+int __init early_cpu_to_node(int cpu);
 void numa_store_cpu_info(unsigned int cpu);
 void numa_add_cpu(unsigned int cpu);
 void numa_remove_cpu(unsigned int cpu);
@@ -46,6 +47,7 @@ static inline void numa_add_cpu(unsigned int cpu) { }
 static inline void numa_remove_cpu(unsigned int cpu) { }
 static inline void arch_numa_init(void) { }
 static inline void early_map_cpu_to_node(unsigned int cpu, int nid) { }
+static inline int early_cpu_to_node(int cpu) { return 0; }

 #endif /* CONFIG_NUMA */

--
2.40.1

2023-11-24 11:49:53

by Catalin Marinas

Subject: Re: [PATCH v4] arm64: irq: set the correct node for VMAP stack

On Fri, Nov 24, 2023 at 11:15:13AM +0800, Huang Shijie wrote:
> In the current code, init_irq_stacks() calls cpu_to_node().
> cpu_to_node() depends on the percpu "numa_node", which is initialized in:
> arch_call_rest_init() --> rest_init() -- kernel_init()
> --> kernel_init_freeable() --> smp_prepare_cpus()
>
> But init_irq_stacks() is called from init_IRQ(), which runs before
> arch_call_rest_init().
>
> So in init_irq_stacks(), cpu_to_node() does not work: it always
> returns 0. On a NUMA system, this makes a node 1 CPU access an IRQ stack
> that is allocated on node 0.
>
> This patch fixes it by:
> 1.) exporting early_cpu_to_node() and using it in init_irq_stacks().
> 2.) changing init_irq_stacks() to an __init function.
>
> Reviewed-by: Catalin Marinas <[email protected]>
> Signed-off-by: Huang Shijie <[email protected]>
> ---
> v3 --> v4:
> 1.) keep early_cpu_to_node() as __init function.
> 2.) change init_irq_stacks() to __init function.
>
> ---
> arch/arm64/kernel/irq.c | 5 +++--
> drivers/base/arch_numa.c | 2 +-
> include/asm-generic/numa.h | 2 ++
> 3 files changed, 6 insertions(+), 3 deletions(-)

Greg, Rafael - any objections to taking this patch through the arm64
tree?

--
Catalin

2023-12-05 15:18:01

by Will Deacon

Subject: Re: [PATCH v4] arm64: irq: set the correct node for VMAP stack

On Fri, 24 Nov 2023 11:15:13 +0800, Huang Shijie wrote:
> In the current code, init_irq_stacks() calls cpu_to_node().
> cpu_to_node() depends on the percpu "numa_node", which is initialized in:
> arch_call_rest_init() --> rest_init() -- kernel_init()
> --> kernel_init_freeable() --> smp_prepare_cpus()
>
> But init_irq_stacks() is called from init_IRQ(), which runs before
> arch_call_rest_init().
>
> [...]

Applied to arm64 (for-next/mm), thanks!

[1/1] arm64: irq: set the correct node for VMAP stack
https://git.kernel.org/arm64/c/75b5e0bf90bf

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev