From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Will Deacon, Thomas Gleixner,
    "Peter Zijlstra (Intel)", Linus Torvalds,
    andrea.parri@amarulasolutions.com, longman@redhat.com, Ingo Molnar,
    Sebastian Andrzej Siewior, Sasha Levin
Subject: [PATCH 4.9 29/61] locking/qspinlock, x86: Provide liveness guarantee
Date: Thu, 20 Dec 2018 10:18:29 +0100
Message-Id: <20181220085844.901078994@linuxfoundation.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20181220085843.743900603@linuxfoundation.org>
References: <20181220085843.743900603@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
X-Patchwork-Hint: ignore
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -> (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -> (0,0,0)

 4)					lock
					  trylock -> (0,0,1)

 5)			  tas-pending -> (0,1,1)
			  load-val <- (0,1,0) from 3

 6)			  clear-pending-set-locked -> (0,0,1)

			FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an
   XCHGB for example, such that we cannot observe other state prior to
   setting pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending
followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.
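[Editor's illustration, not part of the patch: a minimal C11 sketch of the
kind of cmpxchg retry loop that a generic atomic_fetch_or_acquire() amounts
to on x86, where no single instruction can OR a value into memory and return
the old word. Under contention the retry count is unbounded, which is the
determinism problem described above. Function and variable names are
illustrative only; a second sketch after the patch unpacks the LOCK BTS
side.]

#include <stdatomic.h>

static unsigned int fetch_or_acquire_cmpxchg(atomic_uint *val, unsigned int mask)
{
	unsigned int old = atomic_load_explicit(val, memory_order_relaxed);

	/*
	 * Keep retrying until no other CPU modified *val between our load
	 * and the compare-and-swap; a failed CAS refreshes 'old' with the
	 * freshly observed value.
	 */
	while (!atomic_compare_exchange_weak_explicit(val, &old, old | mask,
						      memory_order_acquire,
						      memory_order_relaxed))
		;

	return old;	/* previous value, as fetch_or() must return */
}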
Suggested-by: Will Deacon
Reported-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Will Deacon
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar
[bigeasy: GEN_BINARY_RMWcc macro redo]
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Sasha Levin
---
 arch/x86/include/asm/qspinlock.h | 21 +++++++++++++++++++++
 kernel/locking/qspinlock.c       | 17 ++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 8b1ba1607091..9e78e963afb8 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -4,9 +4,30 @@
 #include
 #include
 #include
+#include
 
 #define _Q_PENDING_LOOPS	(1 << 9)
 
+#define queued_fetch_set_pending_acquire queued_fetch_set_pending_acquire
+
+static __always_inline bool __queued_RMW_btsl(struct qspinlock *lock)
+{
+	GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", lock->val.counter,
+			 "I", _Q_PENDING_OFFSET, "%0", c);
+}
+
+static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
+{
+	u32 val = 0;
+
+	if (__queued_RMW_btsl(lock))
+		val |= _Q_PENDING_VAL;
+
+	val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;
+
+	return val;
+}
+
 #define queued_spin_unlock queued_spin_unlock
 /**
  * queued_spin_unlock - release a queued spinlock
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index f493a4fce624..0ed478e10071 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -224,6 +224,20 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 }
 #endif /* _Q_PENDING_BITS == 8 */
 
+/**
+ * queued_fetch_set_pending_acquire - fetch the whole lock value and set pending
+ * @lock : Pointer to queued spinlock structure
+ * Return: The previous lock value
+ *
+ * *,*,* -> *,1,*
+ */
+#ifndef queued_fetch_set_pending_acquire
+static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
+{
+	return atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
+}
+#endif
+
 /**
  * set_locked - Set the lock bit and own the lock
  * @lock: Pointer to queued spinlock structure
@@ -439,7 +453,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * 0,0,0 -> 0,0,1 ; trylock
 	 * 0,0,1 -> 0,1,1 ; pending
 	 */
-	val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
+	val = queued_fetch_set_pending_acquire(lock);
+
 	/*
 	 * If we observe any contention; undo and queue.
 	 */
-- 
2.19.1
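[Editor's illustration, not kernel code: roughly what the
GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", ...) call in the x86 hunk above boils
down to at this call site -- a full-word "lock btsl" on lock->val that sets
the pending bit and reports its previous value through the carry flag.
Because the locked RMW covers the whole 32-bit word (unlike a byte-wide
XCHGB), the plain atomic_read() that follows it in
queued_fetch_set_pending_acquire() cannot observe tail/lock state older than
the RMW itself. The helper name and the use of the "=@ccc" flag-output
constraint (GCC 6+/Clang) are assumptions of this sketch.]

static inline _Bool lock_bts_pending(unsigned int *word, unsigned int bit)
{
	_Bool oldbit;

	/* Full-word locked bit-test-and-set; carry flag = old bit value. */
	asm volatile("lock btsl %2, %0"
		     : "+m" (*word), "=@ccc" (oldbit)
		     : "Ir" (bit)
		     : "memory");

	return oldbit;
}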