From: Arnd Bergmann <arnd@arndb.de>
To: ganguly.s@samsung.com
Cc: Peter Zijlstra <peterz@infradead.org>,
        "linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
        "tglx@linutronix.de" <tglx@linutronix.de>,
        "mingo@redhat.com" <mingo@redhat.com>, "hpa@zytor.com" <hpa@zytor.com>,
        "Waiman.Long@hp.com" <Waiman.Long@hp.com>,
        "raghavendra.kt@linux.vnet.ibm.com" 
	<raghavendra.kt@linux.vnet.ibm.com>,
        "oleg@redhat.com" <oleg@redhat.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        SHARAN ALLUR <sharan.allur@samsung.com>,
        "torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
        VIKRAM MUPPARTHI <vikram.m@samsung.com>,
        SUNEEL KUMAR SURIMANI <suneel@samsung.com>
Subject: Re: [RFC] arm: Add for atomic half word exchange
Date: Wed, 20 May 2015 08:51:32 +0200
Message-ID: <2528978.P5FT0BVksd@wuerfel>
User-Agent: KMail/4.11.5 (Linux/3.16.0-10-generic; KDE/4.11.5; x86_64; ; )
In-Reply-To: <1348896100.440561432098574765.JavaMail.weblogic@ep2mlwas07a>
References: <1348896100.440561432098574765.JavaMail.weblogic@ep2mlwas07a>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1822
Lines: 48

On Wednesday 20 May 2015 05:09:35 Sarbojit Ganguly wrote:

> > ------- Original Message -------
> > Sender : Peter Zijlstra<peterz@infradead.org>
> > Date : May 19, 2015 21:43 (GMT+09:00)
> > Title : Re: [RFC] arm: Add for atomic half word exchange
> > 
> > On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> > > On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > > > Since 16 bit half word exchange was not there and MCS based
> > > > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > > > half word, here is a small modification to __xchg() code.
> > 
> > Can you actually see a performance improvement with the qspinlock code
> > on ARM ?
> > 
> > The real improvements on x86 were on NUMA systems; although there were
> > real improvements on light loads as well.
> > 
> > 
> > Note that ARM (or any load-store arch) could get rid of all the cmpxchg
> > loops in that code. Although I suppose we replaced the most common ones
> > with these unconditional atomics already -- like that xchg16 -- so
> > implementing those with ll/sc, as you did, should be near optimal.
>
> Yes, the main advantage of Qspinlock code can be observed in NUMA but
> when I tested in an embedded system, a slight advantage was observed.

Is this a multi-cluster SMP system? Those can behave like NUMA
machines in some ways.

We could easily limit the use of 16-bit xchg() to machines that have
ldrexh/strexh (ARMv6K and ARMv7, but not plain ARMv6) by using

	select ARCH_USE_QUEUED_SPINLOCKS if !SMP_ON_UP

or

	select ARCH_USE_QUEUED_SPINLOCKS if !CPU_V6

when enabling the qspinlock implementation.
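
For comparison, here is roughly what you are left with when no half-word
exclusive access is available: a compare-and-swap loop on the containing
32-bit word. This is only a user-space sketch, not the kernel's __xchg()
and not the patch under discussion. The helper name xchg_upper16() and the
TAIL_* macros are made up for illustration, the GCC/Clang __atomic builtins
stand in for the kernel's cmpxchg(), and I assume the 16-bit field sits in
the upper half of the word, the way the qspinlock tail does.

```c
#include <stdint.h>

/* Assumption: the 16-bit field occupies bits 16..31 of the 32-bit word. */
#define TAIL_SHIFT	16
#define TAIL_MASK	(0xffffu << TAIL_SHIFT)

/*
 * Hypothetical helper: atomically replace the upper 16 bits of *word with
 * newval and return the previous upper 16 bits. The lower 16 bits are
 * preserved even if another thread changes them concurrently, because the
 * compare-and-swap fails and the loop retries with the refreshed value.
 */
static uint16_t xchg_upper16(uint32_t *word, uint16_t newval)
{
	uint32_t old = __atomic_load_n(word, __ATOMIC_RELAXED);
	uint32_t new;

	do {
		/* Keep the low half as observed, splice in the new high half. */
		new = (old & ~TAIL_MASK) | ((uint32_t)newval << TAIL_SHIFT);
		/* On failure the builtin writes the current value into 'old'. */
	} while (!__atomic_compare_exchange_n(word, &old, new, 0,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_RELAXED));

	return (uint16_t)(old >> TAIL_SHIFT);
}
```

With ldrexh/strexh the exchange is unconditional and never has to retry
because of a store to the other half of the word, which is why the
half-word xchg() is the better fit where the hardware has it.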

	Arnd