Message-ID: <20210815211303.882793524@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Steven Rostedt,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    Sebastian Andrzej Siewior, Davidlohr Bueso, Mike Galbraith
Subject: [patch V5 34/72] locking/rwlock: Provide RT variant
References: <20210815203225.710392609@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8-bit
Date: Sun, 15 Aug 2021 23:28:28 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

From: Thomas Gleixner

Similar to rw_semaphores, on RT the rwlock substitution is not writer
fair, because it is not feasible to have a writer inherit its priority to
multiple readers. Readers blocked on a writer follow the normal rules of
priority inheritance. Like RT spinlocks, RT rwlocks are state preserving
across the slow lock operations (contended case).
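
For illustration only (not part of the patch): a minimal usage sketch with
made-up names (example_lock, example_data, example_read, example_write).
The calls are the regular rwlock API and compile unchanged with and
without RT; on RT they map to the sleeping rt_read_lock()/rt_write_lock()
variants below, and the _irqsave flavor neither disables interrupts nor
hands back meaningful flags (the macro sets flags to 0):

#include <linux/spinlock.h>

static DEFINE_RWLOCK(example_lock);	/* hypothetical lock */
static int example_data;		/* hypothetical shared data */

int example_read(void)
{
	int val;

	read_lock(&example_lock);	/* rt_read_lock() on RT; may sleep */
	val = example_data;
	read_unlock(&example_lock);
	return val;
}

void example_write(int val)
{
	unsigned long flags;

	/* On RT this does not disable interrupts; flags becomes 0 */
	write_lock_irqsave(&example_lock, flags);
	example_data = val;
	write_unlock_irqrestore(&example_lock, flags);
}
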
Signed-off-by: Thomas Gleixner
---
V5: Add missing might_sleep() and fix lockdep init (Sebastian)
---
 include/linux/rwlock_rt.h       |  140 ++++++++++++++++++++++++++++++++++++++++
 include/linux/rwlock_types.h    |   49 ++++++++++----
 include/linux/spinlock_rt.h     |    2 
 kernel/Kconfig.locks            |    2 
 kernel/locking/spinlock.c       |    7 ++
 kernel/locking/spinlock_debug.c |    5 +
 kernel/locking/spinlock_rt.c    |  131 +++++++++++++++++++++++++++++++++++++
 7 files changed, 323 insertions(+), 13 deletions(-)
 create mode 100644 include/linux/rwlock_rt.h
---
--- /dev/null
+++ b/include/linux/rwlock_rt.h
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#ifndef __LINUX_RWLOCK_RT_H
+#define __LINUX_RWLOCK_RT_H
+
+#ifndef __LINUX_SPINLOCK_RT_H
+#error Do not include directly. Use spinlock.h
+#endif
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+extern void __rt_rwlock_init(rwlock_t *rwlock, const char *name,
+			     struct lock_class_key *key);
+#else
+static inline void __rt_rwlock_init(rwlock_t *rwlock, char *name,
+				    struct lock_class_key *key)
+{
+}
+#endif
+
+#define rwlock_init(rwl)				\
+do {							\
+	static struct lock_class_key __key;		\
+							\
+	init_rwbase_rt(&(rwl)->rwbase);			\
+	__rt_rwlock_init(rwl, #rwl, &__key);		\
+} while (0)
+
+extern void rt_read_lock(rwlock_t *rwlock);
+extern int rt_read_trylock(rwlock_t *rwlock);
+extern void rt_read_unlock(rwlock_t *rwlock);
+extern void rt_write_lock(rwlock_t *rwlock);
+extern int rt_write_trylock(rwlock_t *rwlock);
+extern void rt_write_unlock(rwlock_t *rwlock);
+
+static __always_inline void read_lock(rwlock_t *rwlock)
+{
+	rt_read_lock(rwlock);
+}
+
+static __always_inline void read_lock_bh(rwlock_t *rwlock)
+{
+	local_bh_disable();
+	rt_read_lock(rwlock);
+}
+
+static __always_inline void read_lock_irq(rwlock_t *rwlock)
+{
+	rt_read_lock(rwlock);
+}
+
+#define read_lock_irqsave(lock, flags)			\
+	do {						\
+		typecheck(unsigned long, flags);	\
+		rt_read_lock(lock);			\
+		flags = 0;				\
+	} while (0)
+
+#define read_trylock(lock)	__cond_lock(lock, rt_read_trylock(lock))
+
+static __always_inline void read_unlock(rwlock_t *rwlock)
+{
+	rt_read_unlock(rwlock);
+}
+
+static __always_inline void read_unlock_bh(rwlock_t *rwlock)
+{
+	rt_read_unlock(rwlock);
+	local_bh_enable();
+}
+
+static __always_inline void read_unlock_irq(rwlock_t *rwlock)
+{
+	rt_read_unlock(rwlock);
+}
+
+static __always_inline void read_unlock_irqrestore(rwlock_t *rwlock,
+						   unsigned long flags)
+{
+	rt_read_unlock(rwlock);
+}
+
+static __always_inline void write_lock(rwlock_t *rwlock)
+{
+	rt_write_lock(rwlock);
+}
+
+static __always_inline void write_lock_bh(rwlock_t *rwlock)
+{
+	local_bh_disable();
+	rt_write_lock(rwlock);
+}
+
+static __always_inline void write_lock_irq(rwlock_t *rwlock)
+{
+	rt_write_lock(rwlock);
+}
+
+#define write_lock_irqsave(lock, flags)			\
+	do {						\
+		typecheck(unsigned long, flags);	\
+		rt_write_lock(lock);			\
+		flags = 0;				\
+	} while (0)
+
+#define write_trylock(lock)	__cond_lock(lock, rt_write_trylock(lock))
+
+#define write_trylock_irqsave(lock, flags)		\
+({							\
+	int __locked;					\
+							\
+	typecheck(unsigned long, flags);		\
+	flags = 0;					\
+	__locked = write_trylock(lock);			\
+	__locked;					\
+})
+
+static __always_inline void write_unlock(rwlock_t *rwlock)
+{
+	rt_write_unlock(rwlock);
+}
+
+static __always_inline void write_unlock_bh(rwlock_t *rwlock)
+{
+	rt_write_unlock(rwlock);
+	local_bh_enable();
+}
+
+static __always_inline void write_unlock_irq(rwlock_t *rwlock)
+{
+	rt_write_unlock(rwlock);
+}
+
+static __always_inline void write_unlock_irqrestore(rwlock_t *rwlock,
+						    unsigned long flags)
+{
+	rt_write_unlock(rwlock);
+}
+
+#define rwlock_is_contended(lock)	(((void)(lock), 0))
+
+#endif
--- a/include/linux/rwlock_types.h
+++ b/include/linux/rwlock_types.h
@@ -5,9 +5,19 @@
 # error "Do not include directly, include spinlock_types.h"
 #endif
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define RW_DEP_MAP_INIT(lockname)					\
+	.dep_map = {							\
+		.name = #lockname,					\
+		.wait_type_inner = LD_WAIT_CONFIG,			\
+	}
+#else
+# define RW_DEP_MAP_INIT(lockname)
+#endif
+
+#ifndef CONFIG_PREEMPT_RT
 /*
- * include/linux/rwlock_types.h - generic rwlock type definitions
- * and initializers
+ * generic rwlock type definitions and initializers
  *
  * portions Copyright 2005, Red Hat, Inc., Ingo Molnar
  * Released under the General Public License (GPL).
@@ -25,16 +35,6 @@ typedef struct {
 
 #define RWLOCK_MAGIC		0xdeaf1eed
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define RW_DEP_MAP_INIT(lockname)					\
-	.dep_map = {							\
-		.name = #lockname,					\
-		.wait_type_inner = LD_WAIT_CONFIG,			\
-	}
-#else
-# define RW_DEP_MAP_INIT(lockname)
-#endif
-
 #ifdef CONFIG_DEBUG_SPINLOCK
 #define __RW_LOCK_UNLOCKED(lockname)					\
 	(rwlock_t)	{	.raw_lock = __ARCH_RW_LOCK_UNLOCKED,	\
@@ -50,4 +50,29 @@ typedef struct {
 
 #define DEFINE_RWLOCK(x)	rwlock_t x = __RW_LOCK_UNLOCKED(x)
 
+#else /* !CONFIG_PREEMPT_RT */
+
+#include <linux/rwbase_rt.h>
+
+typedef struct {
+	struct rwbase_rt	rwbase;
+	atomic_t		readers;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map	dep_map;
+#endif
+} rwlock_t;
+
+#define __RWLOCK_RT_INITIALIZER(name)					\
+{									\
+	.rwbase = __RWBASE_INITIALIZER(name),				\
+	RW_DEP_MAP_INIT(name)						\
+}
+
+#define __RW_LOCK_UNLOCKED(name) __RWLOCK_RT_INITIALIZER(name)
+
+#define DEFINE_RWLOCK(name)						\
+	rwlock_t name = __RW_LOCK_UNLOCKED(name)
+
+#endif /* CONFIG_PREEMPT_RT */
+
 #endif /* __LINUX_RWLOCK_TYPES_H */
--- a/include/linux/spinlock_rt.h
+++ b/include/linux/spinlock_rt.h
@@ -146,4 +146,6 @@ static inline int spin_is_locked(spinloc
 
 #define assert_spin_locked(lock) BUG_ON(!spin_is_locked(lock))
 
+#include <linux/rwlock_rt.h>
+
 #endif
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -251,7 +251,7 @@ config ARCH_USE_QUEUED_RWLOCKS
 
 config QUEUED_RWLOCKS
 	def_bool y if ARCH_USE_QUEUED_RWLOCKS
-	depends on SMP
+	depends on SMP && !PREEMPT_RT
 
 config ARCH_HAS_MMIOWB
 	bool
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -124,8 +124,11 @@ void __lockfunc __raw_##op##_lock_bh(loc
  *         __[spin|read|write]_lock_bh()
  */
 BUILD_LOCK_OPS(spin, raw_spinlock);
+
+#ifndef CONFIG_PREEMPT_RT
 BUILD_LOCK_OPS(read, rwlock);
 BUILD_LOCK_OPS(write, rwlock);
+#endif
 
 #endif
 
@@ -209,6 +212,8 @@ void __lockfunc _raw_spin_unlock_bh(raw_
 EXPORT_SYMBOL(_raw_spin_unlock_bh);
 #endif
 
+#ifndef CONFIG_PREEMPT_RT
+
 #ifndef CONFIG_INLINE_READ_TRYLOCK
 int __lockfunc _raw_read_trylock(rwlock_t *lock)
 {
@@ -353,6 +358,8 @@ void __lockfunc _raw_write_unlock_bh(rwl
 EXPORT_SYMBOL(_raw_write_unlock_bh);
 #endif
 
+#endif /* !CONFIG_PREEMPT_RT */
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass)
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -31,6 +31,7 @@ void __raw_spin_lock_init(raw_spinlock_t
 
 EXPORT_SYMBOL(__raw_spin_lock_init);
 
+#ifndef CONFIG_PREEMPT_RT
 void __rwlock_init(rwlock_t *lock, const char *name,
 		   struct lock_class_key *key)
 {
@@ -48,6 +49,7 @@ void __rwlock_init(rwlock_t *lock, const
 }
 
 EXPORT_SYMBOL(__rwlock_init);
+#endif
 
 static void spin_dump(raw_spinlock_t *lock, const char *msg)
 {
@@ -139,6 +141,7 @@ void do_raw_spin_unlock(raw_spinlock_t *
 	arch_spin_unlock(&lock->raw_lock);
 }
 
+#ifndef CONFIG_PREEMPT_RT
 static void rwlock_bug(rwlock_t *lock, const char *msg)
 {
 	if (!debug_locks_off())
@@ -228,3 +231,5 @@ void do_raw_write_unlock(rwlock_t *lock)
 	debug_write_unlock(lock);
 	arch_write_unlock(&lock->raw_lock);
 }
+
+#endif /* !CONFIG_PREEMPT_RT */
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -127,3 +127,134 @@ void __rt_spin_lock_init(spinlock_t *loc
 }
 EXPORT_SYMBOL(__rt_spin_lock_init);
 #endif
+
+/*
+ * RT-specific reader/writer locks
+ */
+#define rwbase_set_and_save_current_state(state)	\
+	current_save_and_set_rtlock_wait_state()
+
+#define rwbase_restore_current_state()			\
+	current_restore_rtlock_saved_state()
+
+static __always_inline int
+rwbase_rtmutex_lock_state(struct rt_mutex_base *rtm, unsigned int state)
+{
+	if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
+		rtlock_slowlock(rtm);
+	return 0;
+}
+
+static __always_inline int
+rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state)
+{
+	rtlock_slowlock_locked(rtm);
+	return 0;
+}
+
+static __always_inline void rwbase_rtmutex_unlock(struct rt_mutex_base *rtm)
+{
+	if (likely(rt_mutex_cmpxchg_acquire(rtm, current, NULL)))
+		return;
+
+	rt_mutex_slowunlock(rtm);
+}
+
+static __always_inline int rwbase_rtmutex_trylock(struct rt_mutex_base *rtm)
+{
+	if (likely(rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
+		return 1;
+
+	return rt_mutex_slowtrylock(rtm);
+}
+
+#define rwbase_signal_pending_state(state, current)	(0)
+
+#define rwbase_schedule()				\
+	schedule_rtlock()
+
+#include "rwbase_rt.c"
+/*
+ * The common functions which get wrapped into the rwlock API.
+ */
+int __sched rt_read_trylock(rwlock_t *rwlock)
+{
+	int ret;
+
+	ret = rwbase_read_trylock(&rwlock->rwbase);
+	if (ret) {
+		rwlock_acquire_read(&rwlock->dep_map, 0, 1, _RET_IP_);
+		rcu_read_lock();
+		migrate_disable();
+	}
+	return ret;
+}
+EXPORT_SYMBOL(rt_read_trylock);
+
+int __sched rt_write_trylock(rwlock_t *rwlock)
+{
+	int ret;
+
+	ret = rwbase_write_trylock(&rwlock->rwbase);
+	if (ret) {
+		rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
+		rcu_read_lock();
+		migrate_disable();
+	}
+	return ret;
+}
+EXPORT_SYMBOL(rt_write_trylock);
+
+void __sched rt_read_lock(rwlock_t *rwlock)
+{
+	___might_sleep(__FILE__, __LINE__, 0);
+	rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_);
+	rwbase_read_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+	rcu_read_lock();
+	migrate_disable();
+}
+EXPORT_SYMBOL(rt_read_lock);
+
+void __sched rt_write_lock(rwlock_t *rwlock)
+{
+	___might_sleep(__FILE__, __LINE__, 0);
+	rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
+	rwbase_write_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+	rcu_read_lock();
+	migrate_disable();
+}
+EXPORT_SYMBOL(rt_write_lock);
+
+void __sched rt_read_unlock(rwlock_t *rwlock)
+{
+	rwlock_release(&rwlock->dep_map, _RET_IP_);
+	migrate_enable();
+	rcu_read_unlock();
+	rwbase_read_unlock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+}
+EXPORT_SYMBOL(rt_read_unlock);
+
+void __sched rt_write_unlock(rwlock_t *rwlock)
+{
+	rwlock_release(&rwlock->dep_map, _RET_IP_);
+	rcu_read_unlock();
+	migrate_enable();
+	rwbase_write_unlock(&rwlock->rwbase);
+}
+EXPORT_SYMBOL(rt_write_unlock);
+
+int __sched rt_rwlock_is_contended(rwlock_t *rwlock)
+{
+	return rw_base_is_contended(&rwlock->rwbase);
+}
+EXPORT_SYMBOL(rt_rwlock_is_contended);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __rt_rwlock_init(rwlock_t *rwlock, const char *name,
+		      struct lock_class_key *key)
+{
+	debug_check_no_locks_freed((void *)rwlock, sizeof(*rwlock));
+	lockdep_init_map_wait(&rwlock->dep_map, name, key, 0, LD_WAIT_CONFIG);
+}
+EXPORT_SYMBOL(__rt_rwlock_init);
+#endif
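
A note on the #include "rwbase_rt.c" in the spinlock_rt.c hunk above: the
shared reader/writer core in rwbase_rt.c is written purely against the
rwbase_* ops, and each including compilation unit supplies its own ops
before the #include (here the rtlock variants; the rw_semaphore
substitution mentioned in the changelog supplies sleeping ones). For
illustration only, a hypothetical userspace miniature of this
template-by-inclusion pattern, with made-up names (shared_body.c,
ops_lock, ops_unlock, shared_inc):

/* shared_body.c - never compiled standalone, only via #include;
 * written purely in terms of ops provided by the including file. */
static void shared_inc(int *counter)
{
	ops_lock();		/* supplied by the includer */
	(*counter)++;
	ops_unlock();
}

/* user.c - define the ops, then pull in the shared implementation */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
#define ops_lock()	pthread_mutex_lock(&m)
#define ops_unlock()	pthread_mutex_unlock(&m)

#include "shared_body.c"	/* like #include "rwbase_rt.c" above */

int main(void)
{
	int counter = 0;

	shared_inc(&counter);
	printf("%d\n", counter);
	return 0;
}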