Received: by 2002:a05:6a10:eb17:0:0:0:0 with SMTP id hx23csp1625099pxb; Sat, 4 Sep 2021 16:14:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyVz3zLS5PR0igPg6ppMu/FVRzauy09mf4QHEuUMPJKwlJ8qiQU/q43QQiEnUvk7IPYfQUP X-Received: by 2002:a50:f0da:: with SMTP id a26mr6306808edm.58.1630797298368; Sat, 04 Sep 2021 16:14:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630797298; cv=none; d=google.com; s=arc-20160816; b=AN2GlIo03n7ISZQZnGhNRBDveikk8GDkd5snUDU+Q2vp0fsA2M6xveYpB0pk7DMDBM 6e7H59BoCJvFtgSmSV+4qO4wek0+u2y8nSBFWBnO3pmAnofBn51TwUtNid051HknE0rC eHvlC83gXnJGd1RiAVLy9qUNjAxx/4r+oZhh9KAxLE7AfQ4/G4VNfIpuAnVDjfTKDi2c RzqSx/nUJml3wnQt5VawG0BQBkFB528aheHoh6cr5LE9mZjR2fmqZadWKz1q4Ydt3EER pkU7028jku4jS4qPvDGYQVgkn6HRddG70saDHkV1Ydsdt0EanwLumCx2qb/cpTWT/DDS LfUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=gyxGd8WQx7i9bmt2v+veAiTfgeyT8etlwUgIc6oqcio=; b=ieAd/lu7oNG2ksL6CXm5d83lWKAURHirnIslCIfKGmjw+aYj3SiSwFniGwx5gN/mfC FhtszwDdUikZTCmujFKzuilZl+SDm3vwmL3iLSSvOkbrZDflEEpcHbuk13h+lazAUJIF pjqzsYIhDKTOO9cKbpWZUgnboKkPu+dN7usi6Q5on9Tq2Sj8JblfR2Mv9NDlWMXTso9X egZppRVdGNX+lN3h8eYyYjvo5BgAvWXVqTVamfGTDFtmL8cXXWEc1SupxoUAJo1RtFZ1 Wl4ypx8LvW7R7HS7yGrhXZTLqnGmn+TuqtXGaNNtNgRvg6CG31xsXcXnKNRcVE8xFs8k CWPg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id mp24si3638860ejc.731.2021.09.04.16.14.23; Sat, 04 Sep 2021 16:14:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237467AbhIDXNZ (ORCPT + 99 others); Sat, 4 Sep 2021 19:13:25 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:52012 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236512AbhIDXNW (ORCPT ); Sat, 4 Sep 2021 19:13:22 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: tonyk) with ESMTPSA id 9AC8E1F41C89 From: =?UTF-8?q?Andr=C3=A9=20Almeida?= To: Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Darren Hart , linux-kernel@vger.kernel.org, Steven Rostedt , Sebastian Andrzej Siewior Cc: kernel@collabora.com, krisman@collabora.com, linux-api@vger.kernel.org, libc-alpha@sourceware.org, mtk.manpages@gmail.com, Davidlohr Bueso , Arnd Bergmann , =?UTF-8?q?Andr=C3=A9=20Almeida?= Subject: [PATCH v2 1/5] futex: Prepare for futex_wait_multiple() Date: Sat, 4 Sep 2021 20:11:55 -0300 Message-Id: <20210904231159.13292-2-andrealmeid@collabora.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20210904231159.13292-1-andrealmeid@collabora.com> References: <20210904231159.13292-1-andrealmeid@collabora.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Make public functions and defines that will be used for futex_wait_multiple() function in next commit. Signed-off-by: André Almeida --- kernel/futex.c | 134 +--------------------------------------------- kernel/futex.h | 140 +++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 142 insertions(+), 132 deletions(-) create mode 100644 kernel/futex.h diff --git a/kernel/futex.c b/kernel/futex.c index e7b4c6121da4..58f2dc29c408 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -34,14 +34,11 @@ #include #include #include -#include #include #include #include -#include - -#include +#include "futex.h" #include "locking/rtmutex_common.h" /* @@ -150,22 +147,6 @@ static int __read_mostly futex_cmpxchg_enabled; #endif -/* - * Futex flags used to encode options to functions and preserve them across - * restarts. - */ -#ifdef CONFIG_MMU -# define FLAGS_SHARED 0x01 -#else -/* - * NOMMU does not have per process address space. Let the compiler optimize - * code away. - */ -# define FLAGS_SHARED 0x00 -#endif -#define FLAGS_CLOCKRT 0x02 -#define FLAGS_HAS_TIMEOUT 0x04 - /* * Priority Inheritance state: */ @@ -187,103 +168,6 @@ struct futex_pi_state { union futex_key key; } __randomize_layout; -/** - * struct futex_q - The hashed futex queue entry, one per waiting task - * @list: priority-sorted list of tasks waiting on this futex - * @task: the task waiting on the futex - * @lock_ptr: the hash bucket lock - * @key: the key the futex is hashed on - * @pi_state: optional priority inheritance state - * @rt_waiter: rt_waiter storage for use with requeue_pi - * @requeue_pi_key: the requeue_pi target futex key - * @bitset: bitset for the optional bitmasked wakeup - * @requeue_state: State field for futex_requeue_pi() - * @requeue_wait: RCU wait for futex_requeue_pi() (RT only) - * - * We use this hashed waitqueue, instead of a normal wait_queue_entry_t, so - * we can wake only the relevant ones (hashed queues may be shared). - * - * A futex_q has a woken state, just like tasks have TASK_RUNNING. - * It is considered woken when plist_node_empty(&q->list) || q->lock_ptr == 0. - * The order of wakeup is always to make the first condition true, then - * the second. - * - * PI futexes are typically woken before they are removed from the hash list via - * the rt_mutex code. See unqueue_me_pi(). - */ -struct futex_q { - struct plist_node list; - - struct task_struct *task; - spinlock_t *lock_ptr; - union futex_key key; - struct futex_pi_state *pi_state; - struct rt_mutex_waiter *rt_waiter; - union futex_key *requeue_pi_key; - u32 bitset; - atomic_t requeue_state; -#ifdef CONFIG_PREEMPT_RT - struct rcuwait requeue_wait; -#endif -} __randomize_layout; - -/* - * On PREEMPT_RT, the hash bucket lock is a 'sleeping' spinlock with an - * underlying rtmutex. The task which is about to be requeued could have - * just woken up (timeout, signal). After the wake up the task has to - * acquire hash bucket lock, which is held by the requeue code. As a task - * can only be blocked on _ONE_ rtmutex at a time, the proxy lock blocking - * and the hash bucket lock blocking would collide and corrupt state. - * - * On !PREEMPT_RT this is not a problem and everything could be serialized - * on hash bucket lock, but aside of having the benefit of common code, - * this allows to avoid doing the requeue when the task is already on the - * way out and taking the hash bucket lock of the original uaddr1 when the - * requeue has been completed. - * - * The following state transitions are valid: - * - * On the waiter side: - * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE - * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT - * - * On the requeue side: - * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS - * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED - * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed) - * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED - * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed) - * - * The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this - * signals that the waiter is already on the way out. It also means that - * the waiter is still on the 'wait' futex, i.e. uaddr1. - * - * The waiter side signals early wakeup to the requeue side either through - * setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending - * on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately - * proceed to take the hash bucket lock of uaddr1. If it set state to WAIT, - * which means the wakeup is interleaving with a requeue in progress it has - * to wait for the requeue side to change the state. Either to DONE/LOCKED - * or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex - * and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by - * the requeue side when the requeue attempt failed via deadlock detection - * and therefore the waiter q is still on the uaddr1 futex. - */ -enum { - Q_REQUEUE_PI_NONE = 0, - Q_REQUEUE_PI_IGNORE, - Q_REQUEUE_PI_IN_PROGRESS, - Q_REQUEUE_PI_WAIT, - Q_REQUEUE_PI_DONE, - Q_REQUEUE_PI_LOCKED, -}; - -static const struct futex_q futex_q_init = { - /* list gets initialized in queue_me()*/ - .key = FUTEX_KEY_INIT, - .bitset = FUTEX_BITSET_MATCH_ANY, - .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE), -}; /* * Hash buckets are shared by all the futex_keys that hash to the same @@ -453,7 +337,7 @@ enum futex_access { * Return: Initialized hrtimer_sleeper structure or NULL if no timeout * value given */ -static inline struct hrtimer_sleeper * +inline struct hrtimer_sleeper * futex_setup_timer(ktime_t *time, struct hrtimer_sleeper *timeout, int flags, u64 range_ns) { @@ -3973,20 +3857,6 @@ static __always_inline bool futex_cmd_has_timeout(u32 cmd) return false; } -static __always_inline int -futex_init_timeout(u32 cmd, u32 op, struct timespec64 *ts, ktime_t *t) -{ - if (!timespec64_valid(ts)) - return -EINVAL; - - *t = timespec64_to_ktime(*ts); - if (cmd == FUTEX_WAIT) - *t = ktime_add_safe(ktime_get(), *t); - else if (cmd != FUTEX_LOCK_PI && !(op & FUTEX_CLOCK_REALTIME)) - *t = timens_ktime_to_host(CLOCK_MONOTONIC, *t); - return 0; -} - SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val, const struct __kernel_timespec __user *, utime, u32 __user *, uaddr2, u32, val3) diff --git a/kernel/futex.h b/kernel/futex.h new file mode 100644 index 000000000000..c914e0080cf1 --- /dev/null +++ b/kernel/futex.h @@ -0,0 +1,140 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _FUTEX_H +#define _FUTEX_H + +#include + +#include +#include + +/* + * Futex flags used to encode options to functions and preserve them across + * restarts. + */ +#ifdef CONFIG_MMU +# define FLAGS_SHARED 0x01 +#else +/* + * NOMMU does not have per process address space. Let the compiler optimize + * code away. + */ +# define FLAGS_SHARED 0x00 +#endif +#define FLAGS_CLOCKRT 0x02 +#define FLAGS_HAS_TIMEOUT 0x04 + +/** + * struct futex_q - The hashed futex queue entry, one per waiting task + * @list: priority-sorted list of tasks waiting on this futex + * @task: the task waiting on the futex + * @lock_ptr: the hash bucket lock + * @key: the key the futex is hashed on + * @pi_state: optional priority inheritance state + * @rt_waiter: rt_waiter storage for use with requeue_pi + * @requeue_pi_key: the requeue_pi target futex key + * @bitset: bitset for the optional bitmasked wakeup + * + * We use this hashed waitqueue, instead of a normal wait_queue_entry_t, so + * we can wake only the relevant ones (hashed queues may be shared). + * + * A futex_q has a woken state, just like tasks have TASK_RUNNING. + * It is considered woken when plist_node_empty(&q->list) || q->lock_ptr == 0. + * The order of wakeup is always to make the first condition true, then + * the second. + * + * PI futexes are typically woken before they are removed from the hash list via + * the rt_mutex code. See unqueue_me_pi(). + */ +struct futex_q { + struct plist_node list; + + struct task_struct *task; + spinlock_t *lock_ptr; + union futex_key key; + struct futex_pi_state *pi_state; + struct rt_mutex_waiter *rt_waiter; + union futex_key *requeue_pi_key; + u32 bitset; + atomic_t requeue_state; +#ifdef CONFIG_PREEMPT_RT + struct rcuwait requeue_wait; +#endif +} __randomize_layout; + +/* + * On PREEMPT_RT, the hash bucket lock is a 'sleeping' spinlock with an + * underlying rtmutex. The task which is about to be requeued could have + * just woken up (timeout, signal). After the wake up the task has to + * acquire hash bucket lock, which is held by the requeue code. As a task + * can only be blocked on _ONE_ rtmutex at a time, the proxy lock blocking + * and the hash bucket lock blocking would collide and corrupt state. + * + * On !PREEMPT_RT this is not a problem and everything could be serialized + * on hash bucket lock, but aside of having the benefit of common code, + * this allows to avoid doing the requeue when the task is already on the + * way out and taking the hash bucket lock of the original uaddr1 when the + * requeue has been completed. + * + * The following state transitions are valid: + * + * On the waiter side: + * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE + * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT + * + * On the requeue side: + * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS + * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED + * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed) + * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED + * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed) + * + * The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this + * signals that the waiter is already on the way out. It also means that + * the waiter is still on the 'wait' futex, i.e. uaddr1. + * + * The waiter side signals early wakeup to the requeue side either through + * setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending + * on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately + * proceed to take the hash bucket lock of uaddr1. If it set state to WAIT, + * which means the wakeup is interleaving with a requeue in progress it has + * to wait for the requeue side to change the state. Either to DONE/LOCKED + * or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex + * and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by + * the requeue side when the requeue attempt failed via deadlock detection + * and therefore the waiter q is still on the uaddr1 futex. + */ +enum { + Q_REQUEUE_PI_NONE = 0, + Q_REQUEUE_PI_IGNORE, + Q_REQUEUE_PI_IN_PROGRESS, + Q_REQUEUE_PI_WAIT, + Q_REQUEUE_PI_DONE, + Q_REQUEUE_PI_LOCKED, +}; + +static const struct futex_q futex_q_init = { + /* list gets initialized in queue_me()*/ + .key = FUTEX_KEY_INIT, + .bitset = FUTEX_BITSET_MATCH_ANY, + .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE), +}; + +inline struct hrtimer_sleeper * +futex_setup_timer(ktime_t *time, struct hrtimer_sleeper *timeout, + int flags, u64 range_ns); + +static __always_inline int +futex_init_timeout(u32 cmd, u32 op, struct timespec64 *ts, ktime_t *t) +{ + if (!timespec64_valid(ts)) + return -EINVAL; + + *t = timespec64_to_ktime(*ts); + if (cmd == FUTEX_WAIT) + *t = ktime_add_safe(ktime_get(), *t); + else if (cmd != FUTEX_LOCK_PI && !(op & FUTEX_CLOCK_REALTIME)) + *t = timens_ktime_to_host(CLOCK_MONOTONIC, *t); + return 0; +} + +#endif -- 2.33.0