2013-09-27 10:58:58

by Vineet Gupta

Subject: [PATCH 0/4] ARC fixes for 3.12-rc3

ARC fixes for 3.12-rc3.

Thx,
-Vineet

Mischa Jonker (1):
ARC: Handle zero-overhead-loop in unaligned access handler

Uwe Kleine-König (1):
ARC: Use clockevents_config_and_register over
clockevents_register_device

Vineet Gupta (2):
ARC: Fix 32-bit wrap around in access_ok()
ARC: Workaround spinlock livelock in SMP SystemC simulation

 arch/arc/include/asm/spinlock.h | 9 ++++++++-
 arch/arc/include/asm/uaccess.h  | 4 ++--
 arch/arc/kernel/time.c          | 7 ++-----
 arch/arc/kernel/unaligned.c     | 6 ++++++
 4 files changed, 18 insertions(+), 8 deletions(-)

--
1.8.1.2


2013-09-27 10:58:16

by Vineet Gupta

Subject: [PATCH 1/4] ARC: Handle zero-overhead-loop in unaligned access handler

From: Mischa Jonker <[email protected]>

If a misaligned load or store is the last instruction in a zero-overhead-loop,
the loop would execute only once: the fixup code stepped past the emulated
instruction without checking the loop-end condition, so the branch back to the
loop start never happened.

This fixes that problem.
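
For context, a minimal sketch of the loop-end handling the fixup now has to
mirror (field names as in the ARC struct pt_regs; illustrative only, the
actual change is in the diff below):

	/* after emulating the misaligned access ... */
	static void zol_fixup_sketch(struct pt_regs *regs, int instr_len)
	{
		regs->ret += instr_len;	/* next PC, as before this patch */

		/*
		 * If that insn was the last one of an active zero-overhead
		 * loop, hardware would have jumped back to LP_START and
		 * decremented LP_COUNT; the emulation must do the same,
		 * otherwise the loop body runs only once.
		 */
		if (regs->ret == regs->lp_end && regs->lp_count) {
			regs->ret = regs->lp_start;
			regs->lp_count--;
		}
	}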

Signed-off-by: Mischa Jonker <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/kernel/unaligned.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/arc/kernel/unaligned.c b/arch/arc/kernel/unaligned.c
index 28d1700..7ff5b5c 100644
--- a/arch/arc/kernel/unaligned.c
+++ b/arch/arc/kernel/unaligned.c
@@ -245,6 +245,12 @@ int misaligned_fixup(unsigned long address, struct pt_regs *regs,
 		regs->status32 &= ~STATUS_DE_MASK;
 	} else {
 		regs->ret += state.instr_len;
+
+		/* handle zero-overhead-loop */
+		if ((regs->ret == regs->lp_end) && (regs->lp_count)) {
+			regs->ret = regs->lp_start;
+			regs->lp_count--;
+		}
 	}
 
 	return 0;
--
1.8.1.2

2013-09-27 10:58:24

by Vineet Gupta

Subject: [PATCH 2/4] ARC: Fix 32-bit wrap around in access_ok()

Anton reported

| LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
| similarly in one testcase test_iov_invalid -> lvec->iov_base.
| Testcase expects errno EFAULT and return code -1,
| but it gets return code 1 and ERRNO is 0 what means success.

Essentially the test case was passing a pointer of -1, which access_ok()
was not catching. It was doing [@addr + @sz <= TASK_SIZE], which wraps
around in 32-bit arithmetic and thus passes for @addr == -1.

Fixed that by rewriting the check as [@addr <= TASK_SIZE - @sz]
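
A stand-alone demo of the wrap-around (not kernel code; the TASK_SIZE value
and function names below are made up purely for illustration, and unsigned
int is 32-bit as on ARC):

	#include <stdio.h>

	#define TASK_SIZE	0x60000000u	/* illustrative value only */

	static int old_check(unsigned int addr, unsigned int sz)
	{
		/* addr + sz can wrap past 0xFFFFFFFF and come out tiny */
		return (addr + sz) <= TASK_SIZE;
	}

	static int new_check(unsigned int addr, unsigned int sz)
	{
		return (sz <= TASK_SIZE) && (addr <= TASK_SIZE - sz);
	}

	int main(void)
	{
		unsigned int bad = (unsigned int)-1;	/* the -1 pointer */

		printf("old: %d\n", old_check(bad, 4));	/* 1: -1+4 wraps to 3 */
		printf("new: %d\n", new_check(bad, 4));	/* 0: correctly rejected */
		return 0;
	}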

Reported-by: Anton Kolesov <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/uaccess.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 3242082..30c9baf 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -43,7 +43,7 @@
  * Because it essentially checks if buffer end is within limit and @len is
  * non-ngeative, which implies that buffer start will be within limit too.
  *
- * The reason for rewriting being, for majorit yof cases, @len is generally
+ * The reason for rewriting being, for majority of cases, @len is generally
  * compile time constant, causing first sub-expression to be compile time
  * subsumed.
  *
@@ -53,7 +53,7 @@
  *
  */
 #define __user_ok(addr, sz)	(((sz) <= TASK_SIZE) && \
-				 (((addr)+(sz)) <= get_fs()))
+				 ((addr) <= (get_fs() - (sz))))
 #define __access_ok(addr, sz)	(unlikely(__kernel_ok) || \
 				 likely(__user_ok((addr), (sz))))

--
1.8.1.2

2013-09-27 10:58:35

by Vineet Gupta

Subject: [PATCH 3/4] ARC: Workaround spinlock livelock in SMP SystemC simulation

Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and can
only use the atomic EX insn (exchange reg with mem) to build higher level
R-M-W primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for the atomic cmp-n-exchange
operation that updates the reader(s)/writer count.

The spinlock operation itself looks as follows:

	mov   reg, 1		; 1=locked, 0=unlocked
retry:
	EX    reg, [lock]	; load existing, store 1, atomically
	BREQ  reg, 1, retry	; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores,
scheduling "N" insns on each before switching. Additionally, for an insn
with a global side effect, such as EX writing to shared mem, a core switch
is enforced as well.

Given that, with 2 cores doing repeated EX on the same location, Linux often
got into a livelock, e.g. when both cores were fiddling with the tasklist
lock (gdbserver / hackbench) for read/write respectively, as the sequence
diagram below shows:

        core1                                   core2
       --------                                --------
 1. spin lock      [EX r=0, w=1]  - LOCKED
 2. rwlock(Read)                  - LOCKED
 3. spin unlock    [ST 0]         - UNLOCKED
                                            4. spin lock  [EX r=0, w=1] - LOCKED
                 -- resched core 1 --
 5. spin lock      [EX r=1]       - ALREADY-LOCKED
                 -- resched core 2 --
                                            6. rwlock(Write)  - READER-LOCKED
                                            7. spin unlock    [ST 0]
                                            8. rwlock failed, retry again
                                            9. spin lock      [EX r=0, w=1]
                 -- resched core 1 --
10. spinlock locked in #9, retry #5
11. spin lock      [EX gets 1]
                 -- resched core 2 --
...
...

The fix was to unlock the spinlock using the EX insn as well (step 7), so
that the unlock also triggers a SystemC scheduling pass, which lets core1
proceed and avoids the livelock.
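
For completeness, the lock side is already EX-based and stays unchanged; a
rough C sketch of it (mirroring the pseudocode above, assuming the existing
__ARCH_SPIN_LOCK_LOCKED__ definition, not a verbatim copy of the kernel
source):

	static inline void arch_spin_lock_sketch(arch_spinlock_t *lock)
	{
		unsigned int val = __ARCH_SPIN_LOCK_LOCKED__;

		/* keep exchanging LOCKED into the slot until the value read
		 * back is UNLOCKED, i.e. this CPU actually acquired it */
		__asm__ __volatile__(
		"1:	ex    %0, [%1]		\n"
		"	breq  %0, %2, 1b	\n"
		: "+&r" (val)
		: "r" (&(lock->slock)), "ir" (__ARCH_SPIN_LOCK_LOCKED__)
		: "memory");

		smp_mb();
	}

With both lock and unlock going through EX, every lock-side memory operation
is one that forces a SystemC core switch, so the scenario above can no longer
wedge the two cores against each other.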

Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/spinlock.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/spinlock.h b/arch/arc/include/asm/spinlock.h
index f158197..b6a8c2d 100644
--- a/arch/arc/include/asm/spinlock.h
+++ b/arch/arc/include/asm/spinlock.h
@@ -45,7 +45,14 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock)

 static inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
-	lock->slock = __ARCH_SPIN_LOCK_UNLOCKED__;
+	unsigned int tmp = __ARCH_SPIN_LOCK_UNLOCKED__;
+
+	__asm__ __volatile__(
+	"	ex  %0, [%1]		\n"
+	: "+r" (tmp)
+	: "r"(&(lock->slock))
+	: "memory");
+
 	smp_mb();
 }

--
1.8.1.2

2013-09-27 10:58:42

by Vineet Gupta

Subject: [PATCH 4/4] ARC: Use clockevents_config_and_register over clockevents_register_device

From: Uwe Kleine-König <[email protected]>

clockevents_config_and_register is more clever and correct than doing it
by hand, so use it.

[vgupta: fixed build failure due to missing ; in patch]
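
For reference, the helper's prototype as declared in
include/linux/clockchips.h; @min_delta/@max_delta are in timer ticks, and
the helper derives mult/shift and the ns-domain limits from @freq before
registering the device:

	void clockevents_config_and_register(struct clock_event_device *dev,
					     u32 freq, unsigned long min_delta,
					     unsigned long max_delta);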

Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/kernel/time.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arc/kernel/time.c b/arch/arc/kernel/time.c
index 0e51e69..3fde7de 100644
--- a/arch/arc/kernel/time.c
+++ b/arch/arc/kernel/time.c
@@ -227,12 +227,9 @@ void __attribute__((weak)) arc_local_timer_setup(unsigned int cpu)
 {
 	struct clock_event_device *clk = &per_cpu(arc_clockevent_device, cpu);
 
-	clockevents_calc_mult_shift(clk, arc_get_core_freq(), 5);
-
-	clk->max_delta_ns = clockevent_delta2ns(ARC_TIMER_MAX, clk);
 	clk->cpumask = cpumask_of(cpu);
-
-	clockevents_register_device(clk);
+	clockevents_config_and_register(clk, arc_get_core_freq(),
+					0, ARC_TIMER_MAX);
 
 	/*
 	 * setup the per-cpu timer IRQ handler - for all cpus
--
1.8.1.2