LinuxLists.cc - Re: [PATCH v2] riscv: add irq stack support

2022-03-08 07:33:59

Subject: Re: [PATCH v2] riscv: add irq stack support

On Mon, Mar 7, 2022 at 3:08 PM Jisheng Zhang <[email protected]> wrote:
>
> Currently, IRQs are still handled on the kernel stack of the current
> task on riscv platforms. If the task has a deep call stack at the time
> of interrupt, and handling the interrupt also requires a deep stack,
> it's possible to see stack overflow.
>
> Before this patch, the stack_max_size of a v5.17-rc1 kernel running on
> a lichee RV board gave:
> ~ # cat /sys/kernel/debug/tracing/stack_max_size
> 3736
>
> After this patch,
> ~ # cat /sys/kernel/debug/tracing/stack_max_size
> 3176
>
> We reduce the max kernel stack usage by 560 bytes!
>
> From another side, after this patch, it's possible to reduce the
> THREAD_SIZE to 8KB for RV64 platforms. This is especially useful for
> those systems with small memory size, e.g the Allwinner D1S platform
> which is RV64 but only has 64MB DDR.
>
> Signed-off-by: Jisheng Zhang <[email protected]>

Very nice!

> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index ed29e9c8f660..57c9b64e16a5 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -126,12 +126,39 @@ skip_context_tracking:
> */
> bge s4, zero, 1f
>
> - la ra, ret_from_exception
> + /* preserve the sp */
> + move s0, sp
>
> - /* Handle interrupts */
> move a0, sp /* pt_regs */
> +
> + /*
> + * Compare sp with the base of the task stack.
> + * If the top ~(THREAD_SIZE - 1) bits match, we are on a task stack,
> + * and should switch to the irq stack.
> + */
> + REG_L t0, TASK_STACK(tp)
> + xor t0, t0, s0
> + li t1, ~(THREAD_SIZE - 1)
> + and t0, t0, t1
> + bnez t0, 2f
> +
> + la t1, irq_stack
> + REG_L t2, TASK_TI_CPU(tp)
> + slli t2, t2, RISCV_LGPTR
> + add t1, t1, t2
> + REG_L t2, 0(t1)
> + li t1, IRQ_STACK_SIZE
> + /* switch to the irq stack */
> + add sp, t2, t1
> +
> +2:

What is the benefit of doing this in assembler? Is it measurably faster?

I see that arm64 does the same thing in C code, and it would be best to
have a common implementation for doing this, in terms of maintainability.

> +
> + for_each_possible_cpu(cpu) {
> +#ifdef CONFIG_VMAP_STACK
> + void *s = __vmalloc_node(IRQ_STACK_SIZE, THREAD_ALIGN,
> + THREADINFO_GFP, cpu_to_node(cpu),
> + __builtin_return_address(0));
> +#else
> + void *s = (void *)__get_free_pages(GFP_KERNEL, get_order(IRQ_STACK_SIZE));
> +#endif

On a related topic: is there a reason to still keep the non-VMAP_STACK
code path around? I see that it currently is optional for 64-bit with MMU,
but not available otherwise. The benefits should still outweigh the downside
(virtual address space usage mainly) on 32-bit, especially when this allows
a common implementation. Not sure about NOMMU, but I would guess
that it's not a big issue to use the same code there as well, since nommu
vmalloc just turns into a kmalloc as well.

Arnd

2022-05-15 15:13:55

by Jisheng Zhang

[permalink] [raw]

Subject: Re: [PATCH v2] riscv: add irq stack support

On Mon, Mar 07, 2022 at 08:19:35PM +0100, Arnd Bergmann wrote:
> On Mon, Mar 7, 2022 at 3:08 PM Jisheng Zhang <[email protected]> wrote:
> >
> > Currently, IRQs are still handled on the kernel stack of the current
> > task on riscv platforms. If the task has a deep call stack at the time
> > of interrupt, and handling the interrupt also requires a deep stack,
> > it's possible to see stack overflow.
> >
> > Before this patch, the stack_max_size of a v5.17-rc1 kernel running on
> > a lichee RV board gave:
> > ~ # cat /sys/kernel/debug/tracing/stack_max_size
> > 3736
> >
> > After this patch,
> > ~ # cat /sys/kernel/debug/tracing/stack_max_size
> > 3176
> >
> > We reduce the max kernel stack usage by 560 bytes!
> >
> > From another side, after this patch, it's possible to reduce the
> > THREAD_SIZE to 8KB for RV64 platforms. This is especially useful for
> > those systems with small memory size, e.g the Allwinner D1S platform
> > which is RV64 but only has 64MB DDR.
> >
> > Signed-off-by: Jisheng Zhang <[email protected]>
>
> Very nice!
>
> > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> > index ed29e9c8f660..57c9b64e16a5 100644
> > --- a/arch/riscv/kernel/entry.S
> > +++ b/arch/riscv/kernel/entry.S
> > @@ -126,12 +126,39 @@ skip_context_tracking:
> > */
> > bge s4, zero, 1f
> >
> > - la ra, ret_from_exception
> > + /* preserve the sp */
> > + move s0, sp
> >
> > - /* Handle interrupts */
> > move a0, sp /* pt_regs */
> > +
> > + /*
> > + * Compare sp with the base of the task stack.
> > + * If the top ~(THREAD_SIZE - 1) bits match, we are on a task stack,
> > + * and should switch to the irq stack.
> > + */
> > + REG_L t0, TASK_STACK(tp)
> > + xor t0, t0, s0
> > + li t1, ~(THREAD_SIZE - 1)
> > + and t0, t0, t1
> > + bnez t0, 2f
> > +
> > + la t1, irq_stack
> > + REG_L t2, TASK_TI_CPU(tp)
> > + slli t2, t2, RISCV_LGPTR
> > + add t1, t1, t2
> > + REG_L t2, 0(t1)
> > + li t1, IRQ_STACK_SIZE
> > + /* switch to the irq stack */
> > + add sp, t2, t1
> > +
> > +2:
>
> What is the benefit of doing this in assembler? Is it measurably faster?
>
> I see that arm64 does the same thing in C code, and it would be best to
> have a common implementation for doing this, in terms of maintainability.
>

Hi Arnd,

Sorry for delay. The assembler code is mainly to cal the stack ptr then
change the SP to use the stack, which equals to arm64 call_on_irq_stack()
which is implemented in assembler too.

> > +
> > + for_each_possible_cpu(cpu) {
> > +#ifdef CONFIG_VMAP_STACK
> > + void *s = __vmalloc_node(IRQ_STACK_SIZE, THREAD_ALIGN,
> > + THREADINFO_GFP, cpu_to_node(cpu),
> > + __builtin_return_address(0));
> > +#else
> > + void *s = (void *)__get_free_pages(GFP_KERNEL, get_order(IRQ_STACK_SIZE));
> > +#endif
>
> On a related topic: is there a reason to still keep the non-VMAP_STACK

irq stack is 16KB on RV64 now, vmalloc doesn't gurantee physical
continuous pages, I want to keep the stack physical continuous
characteristic for !VMAP_STACK case.

Thanks

> code path around? I see that it currently is optional for 64-bit with MMU,
> but not available otherwise. The benefits should still outweigh the downside
> (virtual address space usage mainly) on 32-bit, especially when this allows
> a common implementation. Not sure about NOMMU, but I would guess
> that it's not a big issue to use the same code there as well, since nommu
> vmalloc just turns into a kmalloc as well.
>

2022-05-23 08:49:56

by Arnd Bergmann

[permalink] [raw]

Subject: Re: [PATCH v2] riscv: add irq stack support

On Sun, May 15, 2022 at 7:14 AM Jisheng Zhang <[email protected]> wrote:
> On Mon, Mar 07, 2022 at 08:19:35PM +0100, Arnd Bergmann wrote:
> > On Mon, Mar 7, 2022 at 3:08 PM Jisheng Zhang <[email protected]> wrote:
> > > +2:
> >
> > What is the benefit of doing this in assembler? Is it measurably faster?
> >
> > I see that arm64 does the same thing in C code, and it would be best to
> > have a common implementation for doing this, in terms of maintainability.
> >
>
> Sorry for delay. The assembler code is mainly to cal the stack ptr then
> change the SP to use the stack, which equals to arm64 call_on_irq_stack()
> which is implemented in assembler too.

I understand that you need to be in asm code to switch the stack, it
just felt that the arm64 method is a bit easier to debug here.

I suppose being able to keep using generic_handle_arch_irq() is also
beneficial, so it doesn't make much difference either way.

> > > + for_each_possible_cpu(cpu) {
> > > +#ifdef CONFIG_VMAP_STACK
> > > + void *s = __vmalloc_node(IRQ_STACK_SIZE, THREAD_ALIGN,
> > > + THREADINFO_GFP, cpu_to_node(cpu),
> > > + __builtin_return_address(0));
> > > +#else
> > > + void *s = (void *)__get_free_pages(GFP_KERNEL, get_order(IRQ_STACK_SIZE));
> > > +#endif
> >
> > On a related topic: is there a reason to still keep the non-VMAP_STACK
>
> irq stack is 16KB on RV64 now, vmalloc doesn't gurantee physical
> continuous pages, I want to keep the stack physical continuous
> characteristic for !VMAP_STACK case.

I don't understand. What is the benefit of having a physically continuous
stack? If this is required for something, you could still get that with a VMAP
stack by using alloc_pages() to allocate the stack and them using vmap() to
put it into the vmalloc range with appropriate guard pages.

I think we really want to avoid the case of missing guard pages around
the stack, and eliminate the part where the stack is in the linear map.

Arnd

2022-05-27 12:08:38

by Arnd Bergmann

[permalink] [raw]

Subject: Re: [PATCH v2] riscv: add irq stack support

On Mon, May 23, 2022 at 10:16 AM Arnd Bergmann <[email protected]> wrote:
> > > > + for_each_possible_cpu(cpu) {
> > > > +#ifdef CONFIG_VMAP_STACK
> > > > + void *s = __vmalloc_node(IRQ_STACK_SIZE, THREAD_ALIGN,
> > > > + THREADINFO_GFP, cpu_to_node(cpu),
> > > > + __builtin_return_address(0));
> > > > +#else
> > > > + void *s = (void *)__get_free_pages(GFP_KERNEL, get_order(IRQ_STACK_SIZE));
> > > > +#endif
> > >
> > > On a related topic: is there a reason to still keep the non-VMAP_STACK
> >
> > irq stack is 16KB on RV64 now, vmalloc doesn't gurantee physical
> > continuous pages, I want to keep the stack physical continuous
> > characteristic for !VMAP_STACK case.
>
> I don't understand. What is the benefit of having a physically continuous
> stack? If this is required for something, you could still get that with a VMAP
> stack by using alloc_pages() to allocate the stack and them using vmap() to
> put it into the vmalloc range with appropriate guard pages.
>
> I think we really want to avoid the case of missing guard pages around
> the stack, and eliminate the part where the stack is in the linear map.

I was actually confused here and mixed up a few things: I thought this
was about whether to use vmap stacks unconditionally, and this is in
fact not even an architecture specific decision, it's a global option as you
are probably aware.

Since one can still turn off VMAP_STACK for normal thread stacks,
it doesn't make much of a difference whether one can do the same for
IRQ stacks. Please just ignore what I said above. I see you already
sent a modified v3, and I think either way is fine, feel free to revert back
to this method if it makes life easier.

Arnd