Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754763AbcJEMTe (ORCPT ); Wed, 5 Oct 2016 08:19:34 -0400 Received: from mail-sn1nam01on0138.outbound.protection.outlook.com ([104.47.32.138]:33497 "EHLO NAM01-SN1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752213AbcJEMTb (ORCPT ); Wed, 5 Oct 2016 08:19:31 -0400 Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=waiman.long@hpe.com; Message-ID: <57F4EFCA.6050503@hpe.com> Date: Wed, 5 Oct 2016 08:19:22 -0400 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Davidlohr Bueso CC: Peter Zijlstra , Ingo Molnar , , , , , , , , Jason Low , Dave Chinner , Jonathan Corbet , Scott J Norton , Douglas Hatch Subject: Re: [RFC PATCH-tip v4 01/10] locking/osq: Make lock/unlock proper acquire/release barrier References: <1471554672-38662-1-git-send-email-Waiman.Long@hpe.com> <1471554672-38662-2-git-send-email-Waiman.Long@hpe.com> <20161004190601.GD24086@linux-80c1.suse> In-Reply-To: <20161004190601.GD24086@linux-80c1.suse> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [72.71.243.102] X-ClientProxiedBy: YTOPR01CA0010.CANPRD01.PROD.OUTLOOK.COM (10.166.147.20) To AT5PR84MB0308.NAMPRD84.PROD.OUTLOOK.COM (10.162.138.30) X-MS-Office365-Filtering-Correlation-Id: 9c12b8ad-5100-44f9-0322-08d3ed19dad2 X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;2:CMIn9p3/FQCs6hBqCS9I9qQcSbVuQkLoWJ5oLs6zy37oJPboVbftCHuM7bPx2e8QlFy87dzsofwhaor2znZj749BI034lVvNDfwIxYHs9TqL41Tkv6ArGHVgq0uBNlnol3AYfBKS99+OFrK3jjyj3mlJQkJ4qDSp98nMA3091ARlGhxi0xRmY930d5s5ghnFYoj9pLoM9NeE3shaimrjqw==;3:JmZd9AW6ZbkMmpcZc6ZYYSUKKTf5CJRShxYdsS4EVVA/8bY1G6AHtFF1DU+Og/24D6l8eJpvqeLkK5As456A6kqGdoQ9yErT0k+7ZJmT9in1VdQxF+3xu3OgnO4+iVHBo+uMSlzE1EvA5psxzNAQPQ== X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:AT5PR84MB0308; X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;25:rCA2sfZ7bJr5D/c+ZoFT4/O5JAsVntku0gkSBgcx8ys1XpEPUoFJAIk4gZbRp0zuRV224hi0vDoCSObFr+uq+P8aSOKu3zj9/Dlt0XfiwemhQV35mC/jHyFxch/Vtwiw65I0fgy9UbOaka+stXcyhnstXLkiv9NNpiSMCV+8RTPVAOdB7aZWD3tJlZ1KTPqc7ygeMmX8imcqySOasRygz1NQPPX5g3oooJBLhmL0iuY7cKVcQk/QeCHUZd1ZPCAUoAPKCC5OsDaxWo8ZI79e+6gW3D5fiec0cvBM2SbB/RNPNtAe3fMppd1WZNCmhAF8Ro9ImmmEcB1kwkC1wJL0vQVle81CbJTED7RCEMDrzKL3veCK4mYbqQFedGk+G5jIwaU3x/vYECcklbVJnC14ZsmcCaJkygWaUqpTWp+umeD+SbmGCLimat9F6/M/OMNuFWARQBtvEZ6K82ckAxiOAGrsVh2CLkd1Z6cal+IHHpUXIJMra9msbz3pXanxshRrplXtH+7qSOPKi82C1AmLSjgs5tecGm3a9km3vGxXHFkNy4T2V0P2/2fqXWmaHwuPDJQer+hTx1UqDbi1LNRRYBo72lX/EWpM9zMynuqynJVdIiOdcxmRD+6f1Rc1Zp03engz0t+PV+KbJZfuszD2v0UnFU5Enk6IFMSw1MtwAi8aGsKwHxyd/bMqvoQBXkLlTmuuIA+F+xBlgOsT2OplV4CiVsWc4FffKhxs1e9GxlTGGdKoZP26K829mXWw2EET X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;31:GIpCy0D0Kw5/txqE1QFUvtrwBGwptXC6sUj6KY+9kJoiwSCTBRPd/harTaKA6A+CijbSN1L6TLLWbey6xCaxVPa9ibzPqKukYDKfK3kQZvHtyccDejxrKrpWIR95zOnT+KORHYsX962QbT8+lrvAY4ia8B6GsGFWBtMsyEHo/vtm8pyqPNDx13FHtngn8I2/UExF+G8veAQWeL4i7gYVQQKoSSnAtg0n/4tdvOgooOs/BNQgegzp3SHsrtpT9mG/;20:tOXw5e+VLWfIZapW3RYJOkn/gh3zHV/24DSqlHJCYUNRm0S66EDCo2JVVaFLbcC53wRQLP5qOhYSaOBcrsBhIVa7N3FDg8V/w0h3S4V0CM2U0e5buhfEq03L4g5UoJaWGoeL78IH/tX0vb3N4LBbnnkUDLzm1ZXjpeDEYgANUlTN/C5WnDlQQe524d/SUo9ideHZsGPGu3qH/UPKSJXSbOtnAfEG1cXU/zBaX3zsuFeqdcO/3lYav6t1jfxJlcFy4ax41G/TKz42w8VMBVifYU6jlz58zE3Ca6/vNWrJEiMRnU5ufYUQLIYNS4HjKDV6W5Itdj8Vzppuc+P4uEA0pHvYPVesin8NRU8mg+Y1o5blSwPI9lfCsOQM0hatbGLTJaDRAxled+eC99Q0CGT/dfawaonMn6rFC4BxuM8VhURxc/eHYLMU3vg4SpwEM1NshLxvpfLZbppCsLlQDhtXQUr2HCvA4w0ZfndfolOFv408V12ejDyGbVoxXfoW8yrt X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:(227479698468861); X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(6040176)(601004)(2401047)(8121501046)(5005006)(10201501046)(3002001)(6055026);SRVR:AT5PR84MB0308;BCL:0;PCL:0;RULEID:;SRVR:AT5PR84MB0308; X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;4:El6LC9BWA/jbdInjf1PcinVIykvQHPBLPFKOJxuxEuPet7sDqQ6+TOnoW4yMl7gtoOYjO+OwRyLkqcLJjwqE4q19jJkjXckVWY3iQyeGs99QsEBaI7QilaA4sfu/1tS/iJLMkenSvgNCOfLDlwFVRJfVecqfNdR9M8IFLMuMRhOoxP/z32t7rHMQaU/BnHLW089kXHGU4Kmw1GZKx7AI6std3uc/8CpxNJe/S6TqRl89zPyrkN3ptkOgItYTZo1M06mhhQBDlJTSQilfU1S1NYjdTHqC2KXHGv4RtZfRFT0OvhnlzfnMQTXYebmu+41f7T2kH6Mv0sbx/ghZ65j/uvLxY8m3Yzo7E5LlBCALLQlrVsgSIPolREHalt0pgF5eSgMt6TnPL4o4wnyaygmlN9YYXUKlER8IghWAZq3AamFxxCpAp8fK2Tk3wwboGbXDnWF8P5vwQHa26v0rbLk29A== X-Forefront-PRVS: 008663486A X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6009001)(6049001)(7916002)(189002)(377454003)(24454002)(51234002)(57704003)(199003)(7416002)(81166006)(81156014)(8676002)(33656002)(36756003)(575784001)(66066001)(59896002)(117156001)(47776003)(77096005)(65956001)(65806001)(101416001)(305945005)(2950100002)(6666003)(86362001)(6916009)(92566002)(4326007)(64126003)(5660300001)(105586002)(2906002)(106356001)(7736002)(68736007)(42186005)(8666005)(7846002)(230700001)(50986999)(97736004)(3846002)(19580395003)(80316001)(189998001)(586003)(6116002)(83506001)(76176999)(54356999)(87266999)(65816999)(19580405001)(50466002)(4001350100001)(110136003)(23756003)(7059030);DIR:OUT;SFP:1102;SCL:1;SRVR:AT5PR84MB0308;H:[192.168.142.142];FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en; X-Microsoft-Exchange-Diagnostics: =?iso-8859-1?Q?1;AT5PR84MB0308;23:LoTBJZa/8THZGkTB52khqQk/sQTEnjo/kg65eRk?= =?iso-8859-1?Q?yGc+yqWIaCEhGS2o5VlFFc5NOODIQHDEFrszTwzG3V8pc3I/JW5n8oHdxD?= =?iso-8859-1?Q?VjRvKaufbSF75z8WqZ2g0he/44AiyEDe2/M4yrS+xs+se7j88jtEO7eSFt?= =?iso-8859-1?Q?FYS5BOiK/vwQ2gCL73PKyqiHUajyLOYdwi4a0RIzrrpMwGXMb5X/CRQ1hj?= =?iso-8859-1?Q?c7cpZ80DVSmkEoSDvnei6SoI43UAAa+8+/znHx0hc7AxutZpyWbruscOJ6?= =?iso-8859-1?Q?DO0PWf08OYsGzmytvQOVbSJFAnPmwyyr6xL49mD2FGEuMsCVb9Q4ny4SEA?= =?iso-8859-1?Q?okROpH5WH8eSX71b69hzUFE9SlQ4PgWFduaXEUTL07pvliP1+LGwmTV5h3?= =?iso-8859-1?Q?WJRQpeod2LR4x4JujLnQdI0Dt4gjQrlVSa19s+F2GxjC05TqZ8su3EevlG?= =?iso-8859-1?Q?bXzbmu6dqV9QHYSBOHvEHKWg1DLqLyILuV/e9quE9xHXFZEgiXWGcnBpHP?= =?iso-8859-1?Q?NOefGRxuWbqWcE7vsFzIn/u3ov3HzWd7LqojS4umFOyoP+noKpCgy2uIgO?= =?iso-8859-1?Q?EhR0RKEiXxZKhd1WhRdXh9kBYWm0gQu1tVQcJAoJYGkvDp83xQrmMJelXV?= =?iso-8859-1?Q?zKKwE7axdLYHiqlDSA+l/ZdD0LZHGESll5jO9AkjCMV/OCtggtW++8yaFK?= =?iso-8859-1?Q?00M3eoc8YgazssmkkjnLtOPBf+NfaFAnPhZAo47CIfJ/a6ND+Sdd0iDl9m?= =?iso-8859-1?Q?OexZGdumgnJm+b2Uw2cKitg907ExRGuBoRvxZKWdhOETmLCI5l6IP3zjI7?= =?iso-8859-1?Q?BUTiw5CQilirgcYWzDmfEHZPDu5NP0GZ/YODXxOyVoqTNHpsdgRcII9arq?= =?iso-8859-1?Q?rduOX/ntS4N5tfZXc0R55je4V5KWTHjQMiB8kpqYGNDG6sxoDHM7rLCgqK?= =?iso-8859-1?Q?cE/2sXtWeirrom3y73xjKxv7DqoAiSfiSBUcIcPXwcOv1E7G/W52GRtG4k?= =?iso-8859-1?Q?hNLAwrm2iPz/VaDLcTmy3yifVZ7fSc0IOjUVGY0bNNZwFTuQiXNwoW1+Je?= =?iso-8859-1?Q?RotywX1EcNPFQqyanwNKdyJ0HBWutMa8irZVt/nHCFwz62cXREFiz7hFOL?= =?iso-8859-1?Q?tEpD/x+e/4NJNkYOE7WVLYdTtW5U7WZAYGVSfCj8n1KK8seWW201JR6hjZ?= =?iso-8859-1?Q?Ij7b97jAPNvtLeyf8bNo+TLExg6GQnib7JpEAVcmAFHUYa+fX8ADauZKRL?= =?iso-8859-1?Q?jly+heQ8HdLv9LLiuSLDzJxyNIoDXdUwik1ib64W0u5HGBEESIaMSe9ToF?= =?iso-8859-1?Q?RwHIgmp6UuC9y6gR+PuB89cMnhcCe52DnUg5pQwk/CH3LIAaTdYOTykKoJ?= =?iso-8859-1?Q?OF40Oi/PH1vIgIlbmYOdcUrcPDoYLq7DtUzlB11FC/UDKTVyWz8FSWjlM6?= =?iso-8859-1?Q?x9hm/eSZyBDDN+SGbUaArWvkuMjdOOmtCLT9dfMRMU3aqWWVdj4VNYfo4/?= =?iso-8859-1?Q?on8Uq6pr+Tc6X7TYoUFBz+oApNeS4snNtKozhrb0WVfZIWSulep7GVYKc7?= =?iso-8859-1?Q?luPOtYMEfpei2ewrEnuw9Ty5DE=3D?= X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;6:pbHMvMhZYjN4KML8lV0rZm6onwwfGItt/rX4F0+SgOvZc0fM+cAkB6sk0f+UH4LMu3GASfay7c0vQrTb1Nl0RuForsyAN/jncL3nR/ppbd9KabeTf7RRsA8ORssIhjijleZWTB4jTYBch0HAzvPzhXaBamPZ9cuzJTc+2X1zSQNkdGkp1Yh10DeMwKVCldxoxIbsr96YTnp2cfsOjRBMBl2Fwl2Pzl+CwLV1K2m2fmBQrOBZG9NaJwOsMX1RPrB3K4mL5RqXcpbxtOfBsRtD+Z4Jq83SiaiMQEqzmf87h1tW/UHQQ0KXb01meYuBHVjgo6Ki0f8MIk4gsLRis1J4JPq219OQEXPjj84LqczSdqI=;5:uAsBvBMptpfO+pdQR9BsdDAB9Z6npSDDqIJ10Dvn1BQ+hXrbQ19d54fLel/wxF5rfmNbvC7ppCPXIniLWrVHaguQTpY79e5ySPGZcFnG4mCQlFq9DlPzfHP/i1cHK225JtzDYAPk3vyIeiA1X5N9WJR+zzjuDLJeZmHJOrHlKys=;24:5B2puyut3wlADMOQc/+bH475NebE07MUHTpyG1eNm7XXKtebBsKuB83zoMWdy04r+s8azKPSVqQ9SKrvhZqHyk6/Od85vMkfJRAuHMOEaY4= SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-Microsoft-Exchange-Diagnostics: 1;AT5PR84MB0308;7:lWgLsB+5Ji0ult8paDl4Hh69yMQG0OPYgaqD4tU8j7Zcum4dRzcggdbP3DI6dnQCNHFVPGhc0zUxdl/HYFEJoic52qd35T6P8qaTxgAYTKCDPAZH6pdbmxunFu//TDTkcNf4uBNM+ESAFnagkFzlzp2DbEPW4HJc/APp2YLnF0ZhnhkQkNe31t4qj1pDIvf8J/DIq0O+bWH+Nwp1RVDJ0lavi73gbcgkIbdkaoUhlhg/1s9BBUbb81iCywU0i436QSaAO1fjzblGqoOq/7ZL/TMuKLEyEcSVqm2UeICxg6RamDkAGn9fApiW9gVyDFyOcwVPbzz77qkYF3T0atpgEA== X-OriginatorOrg: hpe.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Oct 2016 12:19:27.5521 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: AT5PR84MB0308 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3730 Lines: 108 On 10/04/2016 03:06 PM, Davidlohr Bueso wrote: > On Thu, 18 Aug 2016, Waiman Long wrote: > >> The osq_lock() and osq_unlock() function may not provide the necessary >> acquire and release barrier in some cases. This patch makes sure >> that the proper barriers are provided when osq_lock() is successful >> or when osq_unlock() is called. > > But why do we need these guarantees given that osq is only used > internally > for lock owner spinning situations? Leaking out of the critical region > will > obviously be bad if using it as a full lock, but, as is, this can only > hurt > performance of two of the most popular locks in the kernel -- although > yes, > using smp_acquire__after_ctrl_dep is nicer for polling. First of all, it is not obvious from the name osq_lock() that it is not an acquire barrier in some cases. We either need to clearly document it or has a variant name that indicate that, e.g. osq_lock_relaxed, for example. Secondly, if we look at the use cases of osq_lock(), the additional latency (for non-x86 archs) only matters if the master lock is immediately available for acquisition after osq_lock() return. Otherwise, it will be hidden in the spinning loop for that master lock. So yes, there may be a slight performance hit in some cases, but certainly not always. > If you need tighter osq for rwsems, could it be refactored such that > mutexes > do not take a hit? > Yes, we can certainly do that like splitting into 2 variants, one with acquire barrier guarantee and one without. >> >> Suggested-by: Peter Zijlstra (Intel) >> Signed-off-by: Waiman Long >> --- >> kernel/locking/osq_lock.c | 24 ++++++++++++++++++------ >> 1 files changed, 18 insertions(+), 6 deletions(-) >> >> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c >> index 05a3785..3da0b97 100644 >> --- a/kernel/locking/osq_lock.c >> +++ b/kernel/locking/osq_lock.c >> @@ -124,6 +124,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) >> >> cpu_relax_lowlatency(); >> } >> + /* >> + * Add an acquire memory barrier for pairing with the release >> barrier >> + * in unlock. >> + */ >> + smp_acquire__after_ctrl_dep(); >> return true; >> >> unqueue: >> @@ -198,13 +203,20 @@ void osq_unlock(struct optimistic_spin_queue >> *lock) >> * Second most likely case. >> */ >> node = this_cpu_ptr(&osq_node); >> - next = xchg(&node->next, NULL); >> - if (next) { >> - WRITE_ONCE(next->locked, 1); >> + next = xchg_relaxed(&node->next, NULL); >> + if (next) >> + goto unlock; >> + >> + next = osq_wait_next(lock, node, NULL); >> + if (unlikely(!next)) { >> + /* >> + * In the unlikely event that the OSQ is empty, we need to >> + * provide a proper release barrier. >> + */ >> + smp_mb(); >> return; >> } >> >> - next = osq_wait_next(lock, node, NULL); >> - if (next) >> - WRITE_ONCE(next->locked, 1); >> +unlock: >> + smp_store_release(&next->locked, 1); >> } > > As well as for the smp_acquire__after_ctrl_dep comment you have above, > this also > obviously pairs with the osq_lock's smp_load_acquire while backing out > (unqueueing, > step A). Given the above, for this case we might also just rely on > READ_ONCE(node->locked), > if we get the conditional wrong and miss the node becoming locked, all > we do is another > iteration, and while there is a cmpxchg() there, it is mitigated with > the ccas thingy. Similar to osq_lock(), the current osq_unlock() does not have a release barrier guarantee. I think splitting into 2 variants - osq_unlock, osq_unlock_relaxed will help. Cheers, Longman