Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753298AbcLFNPC (ORCPT ); Tue, 6 Dec 2016 08:15:02 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:38898 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752014AbcLFNN2 (ORCPT ); Tue, 6 Dec 2016 08:13:28 -0500 From: Pan Xinhui To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org Cc: benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au, peterz@infradead.org, mingo@redhat.com, paulmck@linux.vnet.ibm.com, longman@redhat.com, xinhui.pan@linux.vnet.ibm.com, virtualization@lists.linux-foundation.org, boqun.feng@gmail.com Subject: [PATCH v9 6/6] powerpc/pv-qspinlock: Optimise native unlock path Date: Tue, 6 Dec 2016 13:09:27 -0500 X-Mailer: git-send-email 2.4.11 In-Reply-To: <1481047767-60255-1-git-send-email-xinhui.pan@linux.vnet.ibm.com> References: <1481047767-60255-1-git-send-email-xinhui.pan@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16120613-0044-0000-0000-0000020CFFF3 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16120613-0045-0000-0000-0000061F522A Message-Id: <1481047767-60255-7-git-send-email-xinhui.pan@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-12-06_07:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1609300000 definitions=main-1612060208 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2637 Lines: 78 Avoid a function call under native version of qspinlock. On powerNV, bafore applying this patch, every unlock is expensive. This small optimizes enhance the performance. We use static_key with jump_lable which removes unnecessary loads of lppaca and its stuff. Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/qspinlock_paravirt.h | 18 +++++++++++++++++- arch/powerpc/kernel/paravirt.c | 4 ++++ 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h index d87cda0..8d39446 100644 --- a/arch/powerpc/include/asm/qspinlock_paravirt.h +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h @@ -6,12 +6,14 @@ #define _ASM_QSPINLOCK_PARAVIRT_H #include +#include extern void pv_lock_init(void); extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); extern void __pv_init_lock_hash(void); extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); extern void __pv_queued_spin_unlock(struct qspinlock *lock); +extern struct static_key_true sharedprocessor_key; static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val) { @@ -20,7 +22,21 @@ static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val) static inline void pv_queued_spin_unlock(struct qspinlock *lock) { - pv_lock_op.unlock(lock); + /* + * on powerNV and pSeries with jump_label, code will be + * PowerNV: pSeries: + * nop; b 2f; + * native unlock 2: + * pv unlock; + * In this way, we can do unlock quick in native case. + * + * IF jump_label is not enabled, we fall back into + * if condition, IOW, ld && cmp && bne. + */ + if (static_branch_likely(&sharedprocessor_key)) + native_queued_spin_unlock(lock); + else + pv_lock_op.unlock(lock); } static inline void pv_wait(u8 *ptr, u8 val) diff --git a/arch/powerpc/kernel/paravirt.c b/arch/powerpc/kernel/paravirt.c index e697b17..a0a000e 100644 --- a/arch/powerpc/kernel/paravirt.c +++ b/arch/powerpc/kernel/paravirt.c @@ -140,6 +140,9 @@ struct pv_lock_ops pv_lock_op = { }; EXPORT_SYMBOL(pv_lock_op); +struct static_key_true sharedprocessor_key = STATIC_KEY_TRUE_INIT; +EXPORT_SYMBOL(sharedprocessor_key); + void __init pv_lock_init(void) { if (SHARED_PROCESSOR) { @@ -149,5 +152,6 @@ void __init pv_lock_init(void) pv_lock_op.unlock = __pv_queued_spin_unlock; pv_lock_op.wait = __pv_wait; pv_lock_op.kick = __pv_kick; + static_branch_disable(&sharedprocessor_key); } } -- 2.4.11