From: Chris Metcalf
Subject: Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
To: Peter Zijlstra
Date: Mon, 25 Apr 2016 17:10:58 -0400
Message-ID: <571E87E2.3010306@mellanox.com>
In-Reply-To: <20160422093924.482859927@infradead.org>
References: <20160422090413.393652501@infradead.org> <20160422093924.482859927@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

[Grr, resending as text/plain; I have no idea what inspired Thunderbird to send this
as multipart/mixed with HTML.]

On 4/22/2016 5:04 AM, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> XXX please look at the tilegx (CONFIG_64BIT) atomics, I think we get
> the barriers wrong (at the very least they're inconsistent).
>
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  arch/tile/include/asm/atomic.h    |    4 +
>  arch/tile/include/asm/atomic_32.h |   60 +++++++++++++------
>  arch/tile/include/asm/atomic_64.h |  117 +++++++++++++++++++++++++-------------
>  arch/tile/include/asm/bitops_32.h |   18 ++---
>  arch/tile/lib/atomic_32.c         |   42 ++++++-------
>  arch/tile/lib/atomic_asm_32.S     |   14 ++--
>  6 files changed, 161 insertions(+), 94 deletions(-)
>
> [...]
>  static inline int atomic_add_return(int i, atomic_t *v)
>  {
>  	int val;
>  	smp_mb();  /* barrier for proper semantics */
>  	val = __insn_fetchadd4((void *)&v->counter, i) + i;
>  	barrier();  /* the "+ i" above will wait on memory */
> +	/* XXX smp_mb() instead, as per cmpxchg() ? */
>  	return val;
>  }

The existing code is subtle but I'm pretty sure it's not a bug.

The tilegx architecture will take the "+ i" and generate an add
instruction. The compiler barrier will make sure the add instruction
happens before anything else that could touch memory, and the
microarchitecture will make sure that the result of the atomic fetchadd
has been returned to the core before any further instructions are
issued. (The memory architecture is lazy, but when you feed a load
through an arithmetic operation, we block issuing any further
instructions until the add's operands are available.)
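
For readers without tile hardware, here is a rough portable analogue of
the two flavors discussed above, written with C11 atomics. It is only a
sketch under stated assumptions: demo_atomic_t, demo_fetch_add() and
demo_add_return() are hypothetical names, not kernel or tile APIs, and
because C11 has no way to express the "dependent add stalls the
pipeline" property, the trailing ordering is shown as an explicit fence
instead of barrier().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's atomic_t; not the tile definition. */
typedef struct { atomic_int counter; } demo_atomic_t;

/* FETCH-OP flavor: returns the value the counter held before the add. */
static inline int demo_fetch_add(int i, demo_atomic_t *v)
{
	int old;

	/* Plays the role of the leading smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
	old = atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed);
	/* Explicit trailing fence; portable code cannot rely on the tile trick. */
	atomic_thread_fence(memory_order_seq_cst);
	return old;
}

/* OP-RETURN flavor: the same atomic op, plus "+ i" to rebuild the new value. */
static inline int demo_add_return(int i, demo_atomic_t *v)
{
	int val;

	atomic_thread_fence(memory_order_seq_cst);
	val = atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed) + i;
	atomic_thread_fence(memory_order_seq_cst);
	return val;
}
```

The only difference between the two helpers is the "+ i", which is the
arithmetic op the tile version leans on for ordering.
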

The compiler barrier plus the dependent add would not be an adequate
memory barrier in general, since other loads or stores might still be in
flight, even if the "val" operand had made it from memory to the core at
this point. However, we have issued no other stores or loads since the
previous memory barrier, so we know that there can be no other loads or
stores in flight, and thus the compiler barrier plus arithmetic op is
equivalent to a memory barrier here.

In hindsight, perhaps a more substantial comment would have been helpful
here. Unless you see something missing in my analysis, I'll plan to go
ahead and add a suitable comment here :-)

Otherwise, though just based on code inspection so far:

Acked-by: Chris Metcalf [for tile]

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com