2020-06-24 09:06:04

by Greentime Hu

Subject: [PATCH v2 0/2] riscv: Add context tracker support

This patchset adds support for irq_work via self IPIs and for context
tracking. It was tested on qemu-system-riscv64 and on the SiFive HiFive
Unleashed board, based on v5.8-rc2.

---
Changes in v2
- Fix the compiler warning

Greentime Hu (2):
riscv: Support irq_work via self IPIs
riscv: Enable context tracking

arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/irq_work.h | 10 ++++++++++
arch/riscv/kernel/entry.S | 23 +++++++++++++++++++++++
arch/riscv/kernel/smp.c | 15 +++++++++++++++
4 files changed, 49 insertions(+)
create mode 100644 arch/riscv/include/asm/irq_work.h

--
2.27.0


2020-06-24 09:07:32

by Greentime Hu

Subject: [PATCH v2 2/2] riscv: Enable context tracking

This patch implements and enables context tracking for riscv (which is a
prerequisite for CONFIG_NO_HZ_FULL support).

It checks the previous privilege state in the common entry path that all
exceptions and interrupts go through, and calls context_tracking_user_exit()
if the trap came from user space. It also calls context_tracking_user_enter()
before restore_all if the trap will return to user space.
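
The hook placement described above can be modeled in plain C. This is only a
user-space sketch, not kernel code: trap_entry(), trap_return(), the in_user
flag and the two *_stub() functions are hypothetical names invented for
illustration, and the stubs merely stand in for context_tracking_user_exit()
and context_tracking_user_enter(). The SR_SPP value (bit 8 of sstatus) follows
the RISC-V privileged specification.

```c
#include <stdbool.h>

/* sstatus.SPP: set when the trap was taken from supervisor (kernel) mode. */
#define SR_SPP 0x100UL

/* Hypothetical stand-ins for the tracked state and the two kernel hooks. */
static bool in_user = true;
static int exit_calls, enter_calls;

static void context_tracking_user_exit_stub(void)
{
	in_user = false;
	exit_calls++;
}

static void context_tracking_user_enter_stub(void)
{
	in_user = true;
	enter_calls++;
}

/* Models the check added after _save_context: only traps from user mode
 * (SPP clear) leave the user context. */
static void trap_entry(unsigned long status)
{
	if (!(status & SR_SPP))
		context_tracking_user_exit_stub();
}

/* Models the call added in resume_userspace: only a return to user mode
 * (SPP clear) re-enters the user context. */
static void trap_return(unsigned long status)
{
	if (!(status & SR_SPP))
		context_tracking_user_enter_stub();
}
```

In this model a trap taken from user mode triggers exactly one exit/enter
pair, while a nested trap taken from kernel mode triggers neither hook.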

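This patch was tested with the dynticks-testing testcase on the
qemu-system-riscv64 virt machine and on the Unleashed board.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git

The trace below shows that the tick was mostly stopped while the user loop
ran: after tick_stop reports success=1 at 604.189050, nothing further is
recorded on CPU 1 until user_loop exits at 614.251386, about 10 seconds later.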

                               _-----=> irqs-off
                              / _----=> need-resched
                             | / _---=> hardirq/softirq
                             || / _--=> preempt-depth
                             ||| /     delay
            TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
               | |       |   ||||       |         |
          <idle>-0     [001] d..2   604.183512: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=273 next_prio=120
       user_loop-273   [001] d.h1   604.184788: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604176096300
       user_loop-273   [001] d.s2   604.184897: workqueue_queue_work: work struct=00000000383402c2 function=vmstat_update workqueue=00000000f36d35d4 req_cpu=1 cpu=1
       user_loop-273   [001] dns2   604.185039: tick_stop: success=0 dependency=SCHED
       user_loop-273   [001] dn.1   604.185103: tick_stop: success=0 dependency=SCHED
       user_loop-273   [001] d..2   604.185154: sched_switch: prev_comm=taskset prev_pid=273 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=46 next_prio=120
           <...>-46    [001] ....   604.185194: workqueue_execute_start: work struct 00000000383402c2: function vmstat_update
           <...>-46    [001] d..2   604.185266: sched_switch: prev_comm=kworker/1:1 prev_pid=46 prev_prio=120 prev_state=I ==> next_comm=taskset next_pid=273 next_prio=120
       user_loop-273   [001] d.h1   604.188812: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604180133400
       user_loop-273   [001] d..1   604.189050: tick_stop: success=1 dependency=NONE
       user_loop-273   [001] d..2   614.251386: sched_switch: prev_comm=user_loop prev_pid=273 prev_prio=120 prev_state=X ==> next_comm=swapper/1 next_pid=0 next_prio=120
          <idle>-0     [001] d..2   614.315391: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=276 next_prio=120

Signed-off-by: Greentime Hu <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/entry.S | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 128192e14ff2..17520e11815b 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -52,6 +52,7 @@ config RISCV
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
+ select HAVE_CONTEXT_TRACKING
select HAVE_COPY_THREAD_TLS
select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_EBPF_JIT if MMU
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index cae7e6d4c7ef..6ed579fc1073 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -97,6 +97,14 @@ _save_context:
la gp, __global_pointer$
.option pop

+#ifdef CONFIG_CONTEXT_TRACKING
+ /* If previous state is in user mode, call context_tracking_user_exit. */
+ andi a0, s1, SR_SPP
+ bnez a0, skip_context_tracking
+ call context_tracking_user_exit
+
+skip_context_tracking:
+#endif
la ra, ret_from_exception
/*
* MSB of cause differentiates between
@@ -137,6 +145,17 @@ _save_context:
tail do_trap_unknown

handle_syscall:
+#ifdef CONFIG_CONTEXT_TRACKING
+ /* Recover a0 - a7 for system calls */
+ REG_L x10, PT_A0(sp)
+ REG_L x11, PT_A1(sp)
+ REG_L x12, PT_A2(sp)
+ REG_L x13, PT_A3(sp)
+ REG_L x14, PT_A4(sp)
+ REG_L x15, PT_A5(sp)
+ REG_L x16, PT_A6(sp)
+ REG_L x17, PT_A7(sp)
+#endif
/* save the initial A0 value (needed in signal handlers) */
REG_S a0, PT_ORIG_A0(sp)
/*
@@ -205,6 +224,10 @@ resume_userspace:
andi s1, s0, _TIF_WORK_MASK
bnez s1, work_pending

+#ifdef CONFIG_CONTEXT_TRACKING
+ call context_tracking_user_enter
+#endif
+
/* Save unwound kernel stack pointer in thread_info */
addi s0, sp, PT_SIZE_ON_STACK
REG_S s0, TASK_TI_KERNEL_SP(tp)
--
2.27.0

2020-07-10 17:31:56

by Palmer Dabbelt

Subject: Re: [PATCH v2 2/2] riscv: Enable context tracking

On Wed, 24 Jun 2020 02:03:16 PDT (-0700), [email protected] wrote:
> This patch implements and enables context tracking for riscv (which is a
> prerequisite for CONFIG_NO_HZ_FULL support)
>
> It adds checking for previous state in the entry that all exceptions and
> interrupts go to and calls context_tracking_user_exit() if it comes from
> user space. It also calls context_tracking_user_enter() if it will return
> to user space before restore_all.
>
> This patch is tested with the dynticks-testing testcase in
> qemu-system-riscv64 virt machine and Unleashed board.
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
>
> We can see the log here. The tick got mostly stopped during the execution
> of the user loop.
>
> _-----=> irqs-off
> / _----=> need-resched
> | / _---=> hardirq/softirq
> || / _--=> preempt-depth
> ||| / delay
> TASK-PID CPU# |||| TIMESTAMP FUNCTION
> | | | |||| | |
> <idle>-0 [001] d..2 604.183512: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=273 next_prio=120
> user_loop-273 [001] d.h1 604.184788: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604176096300
> user_loop-273 [001] d.s2 604.184897: workqueue_queue_work: work struct=00000000383402c2 function=vmstat_update workqueue=00000000f36d35d4 req_cpu=1 cpu=1
> user_loop-273 [001] dns2 604.185039: tick_stop: success=0 dependency=SCHED
> user_loop-273 [001] dn.1 604.185103: tick_stop: success=0 dependency=SCHED
> user_loop-273 [001] d..2 604.185154: sched_switch: prev_comm=taskset prev_pid=273 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=46 next_prio=120
> <...>-46 [001] .... 604.185194: workqueue_execute_start: work struct 00000000383402c2: function vmstat_update
> <...>-46 [001] d..2 604.185266: sched_switch: prev_comm=kworker/1:1 prev_pid=46 prev_prio=120 prev_state=I ==> next_comm=taskset next_pid=273 next_prio=120
> user_loop-273 [001] d.h1 604.188812: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604180133400
> user_loop-273 [001] d..1 604.189050: tick_stop: success=1 dependency=NONE
> user_loop-273 [001] d..2 614.251386: sched_switch: prev_comm=user_loop prev_pid=273 prev_prio=120 prev_state=X ==> next_comm=swapper/1 next_pid=0 next_prio=120
> <idle>-0 [001] d..2 614.315391: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=276 next_prio=120
>
> Signed-off-by: Greentime Hu <[email protected]>
> ---
> arch/riscv/Kconfig | 1 +
> arch/riscv/kernel/entry.S | 23 +++++++++++++++++++++++
> 2 files changed, 24 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 128192e14ff2..17520e11815b 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -52,6 +52,7 @@ config RISCV
> select HAVE_ARCH_SECCOMP_FILTER
> select HAVE_ARCH_TRACEHOOK
> select HAVE_ASM_MODVERSIONS
> + select HAVE_CONTEXT_TRACKING
> select HAVE_COPY_THREAD_TLS
> select HAVE_DMA_CONTIGUOUS if MMU
> select HAVE_EBPF_JIT if MMU
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index cae7e6d4c7ef..6ed579fc1073 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -97,6 +97,14 @@ _save_context:
> la gp, __global_pointer$
> .option pop
>
> +#ifdef CONFIG_CONTEXT_TRACKING
> + /* If previous state is in user mode, call context_tracking_user_exit. */
> + andi a0, s1, SR_SPP

I've changed that to SR_PP, as I don't see any reason why this should depend on
MMU.

I think this is correct: we're using scratch==0 elsewhere to detect recursive
traps, but we've blown that away by this point so it's not an option. I don't
know of any reason why PP wouldn't be accurate.
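
For readers following the SR_SPP versus SR_PP point, the idea can be sketched
in C. This assumes SR_PP is an alias that resolves to the machine-mode
previous-privilege field on M-mode (no-MMU) builds and to SR_SPP otherwise.
The bit values follow the RISC-V privileged specification; trap_from_user()
and the M_MODE_BUILD switch are hypothetical names for this illustration, not
kernel identifiers.

```c
#include <stdbool.h>

#define SR_SPP 0x00000100UL /* sstatus.SPP, bit 8 */
#define SR_MPP 0x00001800UL /* mstatus.MPP, bits 12:11 */

/* Hypothetical build switch standing in for an M-mode (no-MMU) kernel. */
#ifdef M_MODE_BUILD
#define SR_PP SR_MPP
#else
#define SR_PP SR_SPP
#endif

/* The previous privilege level was user mode when every PP bit is clear,
 * whichever status register the kernel is built to read. */
static bool trap_from_user(unsigned long status)
{
	return (status & SR_PP) == 0;
}
```

Testing against SR_PP keeps the check independent of the privilege mode the
kernel runs in, which is the point of not tying it to the MMU configuration.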

> + bnez a0, skip_context_tracking
> + call context_tracking_user_exit
> +
> +skip_context_tracking:
> +#endif
> la ra, ret_from_exception
> /*
> * MSB of cause differentiates between
> @@ -137,6 +145,17 @@ _save_context:
> tail do_trap_unknown
>
> handle_syscall:
> +#ifdef CONFIG_CONTEXT_TRACKING
> + /* Recover a0 - a7 for system calls */
> + REG_L x10, PT_A0(sp)
> + REG_L x11, PT_A1(sp)
> + REG_L x12, PT_A2(sp)
> + REG_L x13, PT_A3(sp)
> + REG_L x14, PT_A4(sp)
> + REG_L x15, PT_A5(sp)
> + REG_L x16, PT_A6(sp)
> + REG_L x17, PT_A7(sp)
> +#endif
> /* save the initial A0 value (needed in signal handlers) */
> REG_S a0, PT_ORIG_A0(sp)
> /*
> @@ -205,6 +224,10 @@ resume_userspace:
> andi s1, s0, _TIF_WORK_MASK
> bnez s1, work_pending
>
> +#ifdef CONFIG_CONTEXT_TRACKING
> + call context_tracking_user_enter
> +#endif
> +
> /* Save unwound kernel stack pointer in thread_info */
> addi s0, sp, PT_SIZE_ON_STACK
> REG_S s0, TASK_TI_KERNEL_SP(tp)

2020-07-10 17:33:21

by Palmer Dabbelt

Subject: Re: [PATCH v2 0/2] riscv: Add context tracker support

On Wed, 24 Jun 2020 02:03:14 PDT (-0700), [email protected] wrote:
> This patchset adds support for irq_work via self IPI and context tracking.
> It is tested in qemu-system-riscv64 and SiFive HiFive Unleashed board based
> on v5.8-rc2.
>
> ---
> Changes in v2
> - Fix the compiling warning
>
> Greentime Hu (2):
> riscv: Support irq_work via self IPIs
> riscv: Enable context tracking
>
> arch/riscv/Kconfig | 1 +
> arch/riscv/include/asm/irq_work.h | 10 ++++++++++
> arch/riscv/kernel/entry.S | 23 +++++++++++++++++++++++
> arch/riscv/kernel/smp.c | 15 +++++++++++++++
> 4 files changed, 49 insertions(+)
> create mode 100644 arch/riscv/include/asm/irq_work.h

These are on for-next, with some merge conflicts fixed up. Thanks!

2020-07-11 01:51:29

by Greentime Hu

Subject: Re: [PATCH v2 2/2] riscv: Enable context tracking

On Sat, Jul 11, 2020 at 1:30 AM Palmer Dabbelt <[email protected]> wrote:
>
> On Wed, 24 Jun 2020 02:03:16 PDT (-0700), [email protected] wrote:
> > This patch implements and enables context tracking for riscv (which is a
> > prerequisite for CONFIG_NO_HZ_FULL support)
> >
> > It adds checking for previous state in the entry that all exceptions and
> > interrupts go to and calls context_tracking_user_exit() if it comes from
> > user space. It also calls context_tracking_user_enter() if it will return
> > to user space before restore_all.
> >
> > This patch is tested with the dynticks-testing testcase in
> > qemu-system-riscv64 virt machine and Unleashed board.
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
> >
> > We can see the log here. The tick got mostly stopped during the execution
> > of the user loop.
> >
> > _-----=> irqs-off
> > / _----=> need-resched
> > | / _---=> hardirq/softirq
> > || / _--=> preempt-depth
> > ||| / delay
> > TASK-PID CPU# |||| TIMESTAMP FUNCTION
> > | | | |||| | |
> > <idle>-0 [001] d..2 604.183512: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=273 next_prio=120
> > user_loop-273 [001] d.h1 604.184788: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604176096300
> > user_loop-273 [001] d.s2 604.184897: workqueue_queue_work: work struct=00000000383402c2 function=vmstat_update workqueue=00000000f36d35d4 req_cpu=1 cpu=1
> > user_loop-273 [001] dns2 604.185039: tick_stop: success=0 dependency=SCHED
> > user_loop-273 [001] dn.1 604.185103: tick_stop: success=0 dependency=SCHED
> > user_loop-273 [001] d..2 604.185154: sched_switch: prev_comm=taskset prev_pid=273 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=46 next_prio=120
> > <...>-46 [001] .... 604.185194: workqueue_execute_start: work struct 00000000383402c2: function vmstat_update
> > <...>-46 [001] d..2 604.185266: sched_switch: prev_comm=kworker/1:1 prev_pid=46 prev_prio=120 prev_state=I ==> next_comm=taskset next_pid=273 next_prio=120
> > user_loop-273 [001] d.h1 604.188812: hrtimer_expire_entry: hrtimer=000000002eda5fab function=tick_sched_timer now=604180133400
> > user_loop-273 [001] d..1 604.189050: tick_stop: success=1 dependency=NONE
> > user_loop-273 [001] d..2 614.251386: sched_switch: prev_comm=user_loop prev_pid=273 prev_prio=120 prev_state=X ==> next_comm=swapper/1 next_pid=0 next_prio=120
> > <idle>-0 [001] d..2 614.315391: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=276 next_prio=120
> >
> > Signed-off-by: Greentime Hu <[email protected]>
> > ---
> > arch/riscv/Kconfig | 1 +
> > arch/riscv/kernel/entry.S | 23 +++++++++++++++++++++++
> > 2 files changed, 24 insertions(+)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 128192e14ff2..17520e11815b 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -52,6 +52,7 @@ config RISCV
> > select HAVE_ARCH_SECCOMP_FILTER
> > select HAVE_ARCH_TRACEHOOK
> > select HAVE_ASM_MODVERSIONS
> > + select HAVE_CONTEXT_TRACKING
> > select HAVE_COPY_THREAD_TLS
> > select HAVE_DMA_CONTIGUOUS if MMU
> > select HAVE_EBPF_JIT if MMU
> > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> > index cae7e6d4c7ef..6ed579fc1073 100644
> > --- a/arch/riscv/kernel/entry.S
> > +++ b/arch/riscv/kernel/entry.S
> > @@ -97,6 +97,14 @@ _save_context:
> > la gp, __global_pointer$
> > .option pop
> >
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > + /* If previous state is in user mode, call context_tracking_user_exit. */
> > + andi a0, s1, SR_SPP
>
> I've changed that to SR_PP, as I don't see any reason why this should depend on
> MMU.
>
> I think this is correct: we're using scratch==0 elsewhere to detect recursive
> traps, but we've blown that away by this point so it's not an option. I don't
> know of any reason why PP wouldn't be accurate.

Hi Palmer,

Thank you. That makes sense to me.

>
> > + bnez a0, skip_context_tracking
> > + call context_tracking_user_exit
> > +
> > +skip_context_tracking:
> > +#endif
> > la ra, ret_from_exception
> > /*
> > * MSB of cause differentiates between
> > @@ -137,6 +145,17 @@ _save_context:
> > tail do_trap_unknown
> >
> > handle_syscall:
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > + /* Recover a0 - a7 for system calls */
> > + REG_L x10, PT_A0(sp)
> > + REG_L x11, PT_A1(sp)
> > + REG_L x12, PT_A2(sp)
> > + REG_L x13, PT_A3(sp)
> > + REG_L x14, PT_A4(sp)
> > + REG_L x15, PT_A5(sp)
> > + REG_L x16, PT_A6(sp)
> > + REG_L x17, PT_A7(sp)
> > +#endif
> > /* save the initial A0 value (needed in signal handlers) */
> > REG_S a0, PT_ORIG_A0(sp)
> > /*
> > @@ -205,6 +224,10 @@ resume_userspace:
> > andi s1, s0, _TIF_WORK_MASK
> > bnez s1, work_pending
> >
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > + call context_tracking_user_enter
> > +#endif
> > +
> > /* Save unwound kernel stack pointer in thread_info */
> > addi s0, sp, PT_SIZE_ON_STACK
> > REG_S s0, TASK_TI_KERNEL_SP(tp)