Message-ID: <56FD8A94.9050807@hpe.com>
Date: Thu, 31 Mar 2016 16:37:40 -0400
From: Waiman Long
To: Peter Zijlstra
CC: Ingo Molnar, Linus Torvalds, Ding Tianhong, Jason Low, Davidlohr Bueso, "Paul E. McKenney", Thomas Gleixner, Will Deacon, Tim Chen
Subject: Re: [PATCH v3 2/3] locking/mutex: Enable optimistic spinning of woken task in wait queue
References: <1458668804-10138-1-git-send-email-Waiman.Long@hpe.com> <1458668804-10138-3-git-send-email-Waiman.Long@hpe.com> <20160329153935.GL3408@twins.programming.kicks-ass.net>
In-Reply-To: <20160329153935.GL3408@twins.programming.kicks-ass.net>

On 03/29/2016 11:39 AM, Peter Zijlstra wrote:
> On Tue, Mar 22, 2016 at 01:46:43PM -0400, Waiman Long wrote:
>> Ding Tianhong reported a live-lock situation where a constant stream
>> of incoming optimistic spinners blocked a task in the wait list from
>> getting the mutex.
>>
>> This patch attempts to fix this live-lock condition by enabling the
>> woken task in the wait queue to enter into an optimistic spinning
>> loop itself in parallel with the regular spinners in the OSQ. This
>> should prevent the live-lock condition from happening.
> I would very much like a few words on how fairness is preserved.
>
> Because while the waiter remains on the wait_list while it spins, and
> therefore unlock()s will only wake it, and we'll only contend with the
> one waiter, the fact that we have two spinners is not fair or starvation
> proof at all.
>
> By adding the waiter to the OSQ we get only a single spinner and force
> 'fairness' by queuing.
>
> I say 'fairness' because the OSQ (need_resched) cancellation can still
> take the waiter out again and let even more new spinners in.
>

In my v1 patch, I added a flag in the mutex structure to signal that the
waiter is spinning and that the OSQ spinner should yield, which addresses
this fairness issue. I took it out in my later patches as you said you
wanted to make the patch simpler.

Yes, I do agree that it is not guaranteed that the waiter spinner will
have a decent chance to get the lock, but I think it is still better
than queuing at the end of the OSQ, as the time slice may expire before
the waiter bubbles up to the beginning of the queue. This can be
especially problematic if the waiter has a lower priority, which means
a shorter time slice.

What do you think about the idea of adding a flag as in my v1 patch? For
64-bit systems, there is a 4-byte hole below osq, so it won't increase
the structure size. There will be a 4-byte increase in size for 32-bit
systems, though.

Alternatively, I can certainly add a few more comments to explain the
situation and the choice that we made.

>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index 5dd6171..5c0acee 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -538,6 +538,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>> struct task_struct *task = current;
>> struct mutex_waiter waiter;
>> unsigned long flags;
>> + bool acquired = false; /* True if the lock is acquired */
> Superfluous space there.

OK, will remove that.

Cheers,
Longman
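
For reference, the structure-size argument above (a 4-byte hole below osq
on 64-bit, a 4-byte increase on 32-bit) can be checked with a rough
user-space sketch. Everything here is an assumption made for
illustration: the mock_* types are stand-ins sized like a non-debug
struct mutex with optimistic spinning enabled (int for atomic_t, a 4-byte
spinlock), and the field name waiter_spinning is hypothetical. It is not
the kernel definition.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel types; only their sizes matter. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

struct mock_osq {
	int tail;			/* stands in for atomic_t tail */
};

/* Assumed field order of a non-debug mutex with spinning enabled. */
struct mock_mutex {
	int count;			/* stands in for atomic_t */
	int wait_lock;			/* spinlock_t, assumed 4 bytes */
	struct mock_list_head wait_list;
	void *owner;			/* struct task_struct * */
	struct mock_osq osq;
};

/* Same layout plus an int flag placed directly below osq. */
struct mock_mutex_flag {
	int count;
	int wait_lock;
	struct mock_list_head wait_list;
	void *owner;
	struct mock_osq osq;
	int waiter_spinning;		/* hypothetical flag name */
};

/* Extra bytes that the flag costs on the platform being compiled for. */
size_t flag_size_cost(void)
{
	return sizeof(struct mock_mutex_flag) - sizeof(struct mock_mutex);
}

/* Print both sizes so the trailing padding is easy to see. */
void report_layout(void)
{
	printf("without flag: %zu bytes\n", sizeof(struct mock_mutex));
	printf("with flag:    %zu bytes\n", sizeof(struct mock_mutex_flag));
	printf("cost of flag: %zu bytes\n", flag_size_cost());
}
```

Calling report_layout() from a small main() should show no size change
when pointers are 8 bytes wide, because the flag lands in the padding
that follows osq, and a 4-byte increase when pointers are 4 bytes wide.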