2015-11-02 07:48:59

by Vineet Gupta

[permalink] [raw]
Subject: Re: [PATCH v1 19/20] ARC: [plat-eznps] replace sync with proper cpu barrier

+CC Peter.

On Saturday 31 October 2015 06:53 PM, Noam Camus wrote:
> From: Tal Zilcer <[email protected]>
>
> In SMT system like we have the generic "sync" is not working with
> HW threads. The replacement is "schd.rw" instruction that is served
> as cpu barrier for HW threads.
> Signed-off-by: Noam Camus <[email protected]>
> ---
> arch/arc/kernel/ctx_sw.c | 7 +++++++
> 1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arc/kernel/ctx_sw.c b/arch/arc/kernel/ctx_sw.c
> index 92e2e82..2a2f50e 100644
> --- a/arch/arc/kernel/ctx_sw.c
> +++ b/arch/arc/kernel/ctx_sw.c
> @@ -61,7 +61,11 @@ __switch_to(struct task_struct *prev_task, struct task_struct *next_task)
> "st sp, [r24] \n\t"
> #endif
>
> +#ifdef CONFIG_EZNPS_MTM_EXT
> + ".word %5 \n\t"
> +#else
> "sync \n\t"
> +#endif
>
> /*
> * setup _current_task with incoming tsk.
> @@ -122,6 +126,9 @@ __switch_to(struct task_struct *prev_task, struct task_struct *next_task)
> #ifdef CONFIG_ARC_PLAT_EZNPS
> , "i"(CTOP_AUX_LOGIC_GLOBAL_ID)
> #endif
> +#ifdef CONFIG_EZNPS_MTM_EXT
> + , "i"(CTOP_INST_SCHD_RW)
> +#endif
> : "blink"
> );

Since u bring this up - I think we don't need the original SYNC and/or SMT thread
schedule at all.
The SYNC here is a historic relic at best and we can get rid of it per reasoning
below:

In UP context it is obviously useless, why would we want to stall the core for all
updates to stack memory of t0 to complete before loading kernel ode callee
registers from t1 stack's memory.

In SMP, we could have a potential race in which outdoing task could be
concurrently picked for running, thus the writes to stack here need to be visible
before the reads from stack on other core. But I think since this is the same rq,
there would be a taken spinlock and once a core gives it up, an smp barrier would
come naturally.

Peter do u concur ?


2015-11-02 09:27:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v1 19/20] ARC: [plat-eznps] replace sync with proper cpu barrier

On Mon, Nov 02, 2015 at 07:48:54AM +0000, Vineet Gupta wrote:
> Since u bring this up - I think we don't need the original SYNC and/or
> SMT thread schedule at all. The SYNC here is a historic relic at best
> and we can get rid of it per reasoning below:
>
> In UP context it is obviously useless, why would we want to stall the
> core for all updates to stack memory of t0 to complete before loading
> kernel ode callee registers from t1 stack's memory.
>
> In SMP, we could have a potential race in which outdoing task could be
> concurrently picked for running, thus the writes to stack here need to
> be visible before the reads from stack on other core. But I think
> since this is the same rq, there would be a taken spinlock and once a
> core gives it up, an smp barrier would come naturally.
>
> Peter do u concur ?

I'm still somewhat jet-lagged, but I think the below reference should
answer your question:

lkml.kernel.org/r/[email protected]

I (still) need to update that patch and send it out again.

But I think it answers your question; we do not rely on arch code to
provide barriers for the generic code.

Now, if for some reason the arch code has further constraints, then
maybe, but I don't think so.

2015-11-17 13:48:28

by Vineet Gupta

[permalink] [raw]
Subject: [PATCH] ARC: remove SYNC from __switch_to()

SYNC in __switch_to() is a historic relic and not needed at all.

- In UP context it is obviously useless, why would we want to stall
the core for all updates to stack memory of t0 to complete before
loading kernel mode callee registers from t1 stack's memory.

- In SMP, there could be potential race in which outgoing task could
be concurrently picked for running on a different core, thus writes
to stack here need to be visible before the reads from stack on
other core. Peter confirmed that generic schedular already has needed
barriers (by way of rq lock) so there is no need for additional arch
barrier.

This came up when Noam was tryign to replace this SYNC with EZChip
specific hardware thread scheduling instruction for their platform support.

Link: http://lkml.kernel.org/r/[email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: Noam Camus <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/kernel/ctx_sw.c | 2 --
arch/arc/kernel/ctx_sw_asm.S | 3 ---
2 files changed, 5 deletions(-)

diff --git a/arch/arc/kernel/ctx_sw.c b/arch/arc/kernel/ctx_sw.c
index c14a5bea0c76..5d446df2c413 100644
--- a/arch/arc/kernel/ctx_sw.c
+++ b/arch/arc/kernel/ctx_sw.c
@@ -58,8 +58,6 @@ __switch_to(struct task_struct *prev_task, struct task_struct *next_task)
"st sp, [r24] \n\t"
#endif

- "sync \n\t"
-
/*
* setup _current_task with incoming tsk.
* optionally, set r25 to that as well
diff --git a/arch/arc/kernel/ctx_sw_asm.S b/arch/arc/kernel/ctx_sw_asm.S
index e248594097e7..e6890b1f8650 100644
--- a/arch/arc/kernel/ctx_sw_asm.S
+++ b/arch/arc/kernel/ctx_sw_asm.S
@@ -44,9 +44,6 @@ __switch_to:
* don't need to do anything special to return it
*/

- /* hardware memory barrier */
- sync
-
/*
* switch to new task, contained in r1
* Temp reg r3 is required to get the ptr to store val
--
1.9.1