From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, "H. Peter Anvin"
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Scott J Norton, Douglas Hatch, Davidlohr Bueso, Waiman Long
Subject: [PATCH tip/locking/core v10 0/7] locking/qspinlock: Enhance qspinlock & pvqspinlock performance
Date: Mon, 9 Nov 2015 19:09:20 -0500
Message-Id: <1447114167-47185-1-git-send-email-Waiman.Long@hpe.com>

v9->v10:
 - Broke patch 2 into two separate patches (suggested by PeterZ).
 - Changed the slowpath statistical counter code to go back to using
   debugfs while keeping the per-cpu counter setup.
 - Made some minor tweaks and added comments to the lock stealing and
   adaptive spinning patches.

v8->v9:
 - Added a new patch 2 which prefetches the cacheline of the next MCS
   node in order to reduce the MCS unlock latency when it is time to
   do the unlock.
 - Changed the slowpath statistical counters implementation in patch 4
   from atomic_t to per-cpu variables to reduce performance overhead,
   and used sysfs instead of debugfs to return the consolidated counts
   and data.

v7->v8:
 - Annotated the use of each _acquire/_release variant in qspinlock.c.
 - Used the available pending bit in the lock stealing patch to disable
   lock stealing when the queue head vCPU is actively spinning on the
   lock, to avoid lock starvation.
 - Restructured the lock stealing patch to reduce code duplication.
 - Verified that the waitcnt processing will be compiled away if
   QUEUED_LOCK_STAT isn't enabled.

v6->v7:
 - Removed arch/x86/include/asm/qspinlock.h from patch 1.
 - Removed the unconditional PV kick patch as it has been merged
   into tip.
 - Changed the pvstat_inc() API to add a new condition parameter.
 - Added comments and rearranged code in patch 4 to clarify where
   lock stealing happens.
 - In patch 5, removed the check of the pv_wait count when deciding
   when to wait early.
 - Updated copyrights and email addresses.

v5->v6:
 - Added a new patch 1 to relax the cmpxchg and xchg operations in
   the native code path to reduce performance overhead on non-x86
   architectures.
 - Updated the unconditional PV kick patch as suggested by PeterZ.
 - Added a new patch to allow one lock stealing attempt at the slowpath
   entry point to reduce the performance penalty due to lock waiter
   preemption.
 - Removed the pending bit and kick-ahead patches as they didn't show
   any noticeable performance improvement on top of the lock stealing
   patch.
 - Simplified the adaptive spinning patch, as the lock stealing patch
   allows more aggressive pv_wait() without much performance penalty in
   non-overcommitted VMs.

v4->v5:
 - Rebased the patches to the latest tip tree.
 - Corrected the comments and commit log for patch 1.
 - Removed the v4 patch 5 as PV kick deferment is no longer needed with
   the new tip tree.
 - Simplified the adaptive spinning patch (patch 6) and improved its
   performance a bit further.
 - Re-ran the benchmark tests with the new patches.

v3->v4:
 - Patch 1: added a comment about a possible race condition in PV
   unlock.
 - Patch 2: simplified the pv_pending_lock() function as suggested by
   Davidlohr.
 - Moved the PV unlock optimization patch forward to patch 4 and re-ran
   the performance tests.

v2->v3:
 - Moved the deferred kicking enablement patch forward and moved the
   kick-ahead patch back to make the effect of kick-ahead more visible.
 - Reworked patch 6 to make it more readable.
 - Reverted to using state as a tri-state variable instead of adding an
   additional bi-state variable.
 - Added performance data for different values of PV_KICK_AHEAD_MAX.
 - Added a new patch to optimize PV unlock code path performance.

v1->v2:
 - Took out the queued unfair lock patches.
 - Added a patch to simplify the PV unlock code.
 - Moved the pending bit and statistics collection patches to the
   front.
 - Kept vCPU kicking in pv_kick_node(), but deferred it to unlock time
   when appropriate.
 - Changed the wait-early patch to use adaptive spinning to better
   balance its different effects on normal and over-committed guests.
 - Added patch-to-patch performance changes to the patch commit logs.

This patchset tries to improve the performance of both regular and
over-committed VM guests. The adaptive spinning patch was inspired by
the "Do Virtual Machines Really Scale?" blog post from Sanidhya
Kashyap.

Patch 1 relaxes the memory ordering restrictions on atomic operations
by using the less restrictive _acquire and _release variants of
cmpxchg() and xchg(). This will reduce performance overhead when the
code is ported to other non-x86 architectures.

Patch 2 prefetches the cacheline of the next MCS node to reduce the
latency of the MCS unlock operation.

Patch 3 removes a redundant read of the next pointer.

Patch 4 optimizes the PV unlock code path performance on the x86-64
architecture.

Patch 5 allows the collection of various slowpath statistics counters
that are useful for seeing what is happening in the system. Per-cpu
counters are used to minimize the performance overhead.

Patch 6 allows one lock stealing attempt at slowpath entry. This yields
a pretty big performance improvement for over-committed VM guests.

Patch 7 enables adaptive spinning in the queue nodes. This patch leads
to a further performance improvement in over-committed guests, though
not as big as that of the previous patch.

Rough illustrative sketches of the techniques in these patches are
appended after the diffstat below.

Waiman Long (7):
  locking/qspinlock: Use _acquire/_release versions of cmpxchg & xchg
  locking/qspinlock: prefetch next node cacheline
  locking/qspinlock: Avoid redundant read of next pointer
  locking/pvqspinlock, x86: Optimize PV unlock code path
  locking/pvqspinlock: Collect slowpath lock statistics
  locking/pvqspinlock: Allow limited lock stealing
  locking/pvqspinlock: Queue node adaptive spinning

 arch/x86/Kconfig                          |    8 +
 arch/x86/include/asm/qspinlock_paravirt.h |   59 ++++++
 include/asm-generic/qspinlock.h           |    9 +-
 kernel/locking/qspinlock.c                |   90 +++++++--
 kernel/locking/qspinlock_paravirt.h       |  252 +++++++++++++++++++++----
 kernel/locking/qspinlock_stat.h           |  293 +++++++++++++++++++++++++++++
 6 files changed, 648 insertions(+), 63 deletions(-)
 create mode 100644 kernel/locking/qspinlock_stat.h
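The sketches below are illustrative only; they are not the code in the
patches themselves, and any names not mentioned above are invented for
the examples.

For patch 1, the point is that a lock acquisition only needs ACQUIRE
ordering on success and a lock release only needs RELEASE ordering, so
the full barrier implied by plain cmpxchg()/xchg() is wasted work on
architectures such as arm64 or powerpc (x86 is unaffected, as its
atomics are full barriers anyway). A minimal userspace analogue using
C11 atomics (toy_lock and its functions are made up for the example):

	#include <stdatomic.h>
	#include <stdbool.h>

	struct toy_lock { atomic_int val; };	/* 0 = unlocked, 1 = locked */

	static bool toy_trylock(struct toy_lock *l)
	{
		int expected = 0;

		/*
		 * ACQUIRE on success orders the critical section after
		 * the lock acquisition; no full (seq_cst) barrier is
		 * needed.
		 */
		return atomic_compare_exchange_strong_explicit(&l->val,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed);
	}

	static void toy_unlock(struct toy_lock *l)
	{
		/* RELEASE orders the critical section before the store. */
		atomic_store_explicit(&l->val, 0, memory_order_release);
	}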
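Patches 2 and 3 both touch the MCS handover path. Schematically (a
fragment, not the actual diff; node is the waiter's own MCS node):

	struct mcs_spinlock *next;

	/*
	 * Patch 2: while still spinning for the lock, issue a write
	 * prefetch for the next node's cacheline. The unlock will
	 * shortly write next->locked, so prefetchw() warms the line
	 * in the right state.
	 */
	next = READ_ONCE(node->next);
	if (next)
		prefetchw(next);

	/* ... spin until the lock is handed over to us ... */

	/*
	 * Patch 3: if the read above already returned a non-NULL
	 * next, reuse it instead of issuing a redundant second read
	 * of node->next here.
	 */
	if (!next)
		while (!(next = READ_ONCE(node->next)))
			cpu_relax();
	arch_mcs_spin_unlock_contended(&next->locked);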
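Patch 4 replaces the compiler-generated PV unlock function with a
hand-optimized x86-64 assembly fast path. The C shape being optimized
is roughly the following (simplified; in the real code the lock byte is
reached through a proper endian-aware view of the lock word, and the
slowpath helper shown is only a plausible name):

	__visible void __pv_queued_spin_unlock(struct qspinlock *lock)
	{
		/* byte view of the lock word; little-endian assumed here */
		u8 *locked_byte = (u8 *)lock;
		u8 locked;

		/*
		 * Fast path: the lock byte still holds _Q_LOCKED_VAL,
		 * i.e. no waiter has hashed the lock, so one cmpxchg
		 * releases it.
		 */
		locked = cmpxchg(locked_byte, _Q_LOCKED_VAL, 0);
		if (likely(locked == _Q_LOCKED_VAL))
			return;

		/* Slow path: a halted waiter may need to be kicked. */
		__pv_queued_spin_unlock_slowpath(lock, locked);
	}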
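Patch 5's counters look roughly like this (the stat names below are
invented; pvstat_inc() with its condition parameter is the API
mentioned in the v6->v7 changes above):

	/* one set of event counters per cpu to keep the hot path cheap */
	enum pv_qlock_stat {
		pvstat_wait_head,	/* invented example events */
		pvstat_lock_steal,
		pvstat_num
	};

	static DEFINE_PER_CPU(unsigned long, pv_stats[pvstat_num]);

	/* hot path: one per-cpu increment, no shared cacheline bouncing */
	static inline void pvstat_inc(enum pv_qlock_stat stat, bool cond)
	{
		if (cond)
			this_cpu_inc(pv_stats[stat]);
	}

	/* read side (debugfs): consolidate the per-cpu counts on demand */
	static unsigned long pvstat_read(enum pv_qlock_stat stat)
	{
		unsigned long sum = 0;
		int cpu;

		for_each_possible_cpu(cpu)
			sum += per_cpu(pv_stats[stat], cpu);
		return sum;
	}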
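The lock stealing of patch 6 amounts to one extra trylock attempt on
slowpath entry. What bounds the unfairness is that the queue head sets
the pending bit while it is actively spinning on the lock, which
disables stealing and so prevents starvation. Schematically
(pv_lock_steal() is an invented name):

	static inline bool pv_lock_steal(struct qspinlock *lock)
	{
		int val = atomic_read(&lock->val);

		/*
		 * Fail if the lock is taken, or if the queue head has
		 * set the pending bit to disable stealing.
		 */
		if (val & _Q_LOCKED_PENDING_MASK)
			return false;

		return atomic_cmpxchg_acquire(&lock->val, val,
					      val | _Q_LOCKED_VAL) == val;
	}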
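Finally, patch 7's adaptive spinning: a queued vCPU keeps spinning only
while its predecessor in the queue is itself running, and parks itself
early via pv_wait() once the predecessor has halted, since the lock
then cannot reach it any time soon. A sketch of the wait loop
(simplified; pn/pp are the pv_node views of this node and the previous
node, and the real code is more careful about memory ordering and state
transitions):

	for (;;) {
		int loop;

		for (loop = SPIN_THRESHOLD; loop; loop--) {
			if (READ_ONCE(node->locked))
				return;
			/* wait early: predecessor asleep => long wait */
			if (READ_ONCE(pp->state) == vcpu_halted)
				break;
			cpu_relax();
		}

		/* park this vCPU until the predecessor kicks it */
		WRITE_ONCE(pn->state, vcpu_halted);
		if (!READ_ONCE(node->locked))
			pv_wait(&pn->state, vcpu_halted);
		WRITE_ONCE(pn->state, vcpu_running);
	}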