Date: Fri, 25 Aug 2017 13:25:42 -0700
From: Vikram Mulukutla
To: Will Deacon
Cc: qiaozhou, Thomas Gleixner, John Stultz, sboyd@codeaurora.org, LKML, Wang Wilbur, Marc Zyngier, Peter Zijlstra, linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync

On 2017-08-25 12:48, Vikram Mulukutla wrote:
>
> If I understand the code correctly, the upper 32 bits of an ARM64 virtual
> address will overflow when 1 is added to it, and so we'll keep WFE'ing on
> every subsequent cpu_relax invoked from the same PC, until we cross the
> hard-coded threshold, right?
>

Oops, misread that. Second time we enter cpu_relax from the same PC, we
do a WFE. Then we stop doing the WFE until we hit the threshold using
the per-cpu counter. So with a higher threshold, we wait for more
cpu_relax() calls before starting the WFE again.

So a lower threshold implies we should hit WFE branch sooner.
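
To make sure I'm describing the same thing, here is a rough userspace model of the behaviour as I read it. This is only a sketch of my understanding, not the code from the patch: the struct, the function names and the exact counting are all made up for illustration, and the "WFE" is just a boolean return value rather than the real instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu data; not the layout in the patch. */
struct relax_state {
	uintptr_t last_pc;	/* call site seen by the previous call */
	uint64_t count;		/* same-PC calls since the last "WFE" */
};

/*
 * Model of one cpu_relax() call from call site 'pc'.
 * Returns true when this call would take the WFE branch.
 * Assumed behaviour: no WFE on the first call from a PC, WFE on the
 * second, then WFE again only once 'threshold' calls have accumulated.
 */
static bool relax_should_wfe(struct relax_state *s, uintptr_t pc,
			     uint64_t threshold)
{
	if (pc != s->last_pc) {
		/* New call site: remember it and restart the count. */
		s->last_pc = pc;
		s->count = 0;
		return false;
	}

	if (s->count == 0 || s->count >= threshold) {
		/* Second call from this PC, or threshold reached. */
		s->count = 1;
		return true;
	}

	s->count++;
	return false;
}

/* Count how many of 'calls' back-to-back calls from one PC would WFE. */
static uint64_t count_wfes(uint64_t calls, uint64_t threshold)
{
	struct relax_state s = { 0, 0 };
	uint64_t wfes = 0;

	for (uint64_t i = 0; i < calls; i++)
		if (relax_should_wfe(&s, (uintptr_t)0x1000, threshold))
			wfes++;

	return wfes;
}
```

In this model, 1000 back-to-back calls from one call site give 100 WFEs with a threshold of 10, 10 WFEs with a threshold of 100, and a single WFE with a threshold of 1000, which is the trend I'm describing here.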
It seems that since my test keeps the while loop going for a full 5
seconds, a lower threshold will obviously result in more WFEs and lower
the lock-acquired-count.

I guess we want a high threshold but not so high that the little CPU has
to wait a while before the big CPU counts up to the threshold, is that
correct?

Thanks,
Vikram

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project