Date: Wed, 20 May 2015 05:09:39 +0000 (GMT)
From: Sarbojit Ganguly
Subject: Re: Re: [RFC] arm: Add for atomic half word exchange
To: Peter Zijlstra, Sarbojit Ganguly
Cc: Arnd Bergmann, "linux-arm-kernel@lists.infradead.org", "tglx@linutronix.de", "mingo@redhat.com", "hpa@zytor.com", "Waiman.Long@hp.com", "raghavendra.kt@linux.vnet.ibm.com", "oleg@redhat.com", "linux-kernel@vger.kernel.org", SHARAN ALLUR, "torvalds@linux-foundation.org", VIKRAM MUPPARTHI, SUNEEL KUMAR SURIMANI
Reply-to: ganguly.s@samsung.com
Message-id: <1348896100.440561432098574765.JavaMail.weblogic@ep2mlwas07a>

Yes, the main advantage of the qspinlock code shows up on NUMA systems, but when I tested it on an embedded system I still observed a slight advantage.

------- Original Message -------
Sender : Peter Zijlstra
Date : May 19, 2015 21:43 (GMT+09:00)
Title : Re: [RFC] arm: Add for atomic half word exchange

On Tue, May 19, 2015 at 11:20:13AM +0000, Sarbojit Ganguly wrote:
> On Tuesday 19 May 2015 09:39:33 Sarbojit Ganguly wrote:
> > Since 16 bit half word exchange was not there and MCS based
> > qspinlock by Waiman's xchg_tail() requires an atomic exchange on a
> > half word, here is a small modification to __xchg() code.

Can you actually see a performance improvement with the qspinlock code
on ARM?

The real improvements on x86 were on NUMA systems; although there were
real improvements on light loads as well.

Note that ARM (or any load-store arch) could get rid of all the cmpxchg
loops in that code. Although I suppose we replaced the most common ones
with these unconditional atomics already -- like that xchg16 -- so
implementing those with ll/sc, as you did, should be near optimal.

----------------------------------------------------------------------+
The Tao lies beyond Yin and Yang.                                     |
It is silent and still as a pool of water.                            |
It does not seek fame, therefore nobody knows its presence.           |
It does not seek fortune, for it is complete within itself.           |
It exists beyond space and time.                                      |
----------------------------------------------------------------------+
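
To illustrate the contrast drawn in the quoted message between an unconditional 16-bit exchange and a cmpxchg loop, here is a minimal userspace sketch in portable C11 atomics. It is not the kernel's xchg_tail() and not the ARM ll/sc (ldrexh/strexh) patch under discussion. The names demo_lock, demo_xchg_tail and demo_xchg_tail_cmpxchg are hypothetical, and the layout (tail in the upper 16 bits of a 32-bit word) is an assumption made only for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock layout for illustration only: two independent
 * 16-bit halves, so the tail can be swapped without touching the
 * locked/pending half. */
struct demo_lock {
    _Atomic uint16_t locked_pending;
    _Atomic uint16_t tail;
};

/* Unconditional half-word exchange: one atomic swap, no retry loop
 * visible in the source. Returns the previous tail. */
static uint16_t demo_xchg_tail(struct demo_lock *lock, uint16_t new_tail)
{
    return atomic_exchange_explicit(&lock->tail, new_tail,
                                    memory_order_relaxed);
}

/* Fallback when only a 32-bit atomic is available: a cmpxchg loop that
 * replaces the upper 16 bits and preserves the lower 16 bits.
 * Returns the previous tail. */
static uint16_t demo_xchg_tail_cmpxchg(_Atomic uint32_t *val,
                                       uint16_t new_tail)
{
    uint32_t old = atomic_load_explicit(val, memory_order_relaxed);
    uint32_t next;

    do {
        /* Keep the low half as observed, install the new tail. */
        next = (old & 0xffffu) | ((uint32_t)new_tail << 16);
        /* On failure, 'old' is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak_explicit(val, &old, next,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return (uint16_t)(old >> 16);
}
```

Both helpers return the previous tail value. The first needs only a single exchange on the half word, which is why a native 16-bit exchange matters for this code path; the second has to retry whenever any part of the 32-bit word changes underneath it.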