Date: Thu, 28 Dec 2023 22:20:03 +0000
From: Matthew Wilcox
To: "Eric W. Biederman"
Biederman" Cc: Maria Yu , kernel@quicinc.com, quic_pkondeti@quicinc.com, keescook@chromium.or, viro@zeniv.linux.org.uk, brauner@kernel.org, oleg@redhat.com, dhowells@redhat.com, jarkko@kernel.org, paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, keyrings@vger.kernel.org, linux-security-module@vger.kernel.org, linux-arm-msm@vger.kernel.org Subject: Re: [PATCH] kernel: Introduce a write lock/unlock wrapper for tasklist_lock Message-ID: References: <20231213101745.4526-1-quic_aiquny@quicinc.com> <87o7eu7ybq.fsf@email.froward.int.ebiederm.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87o7eu7ybq.fsf@email.froward.int.ebiederm.org> On Wed, Dec 13, 2023 at 12:27:05PM -0600, Eric W. Biederman wrote: > Matthew Wilcox writes: > > I think the right way to fix this is to pass a boolean flag to > > queued_write_lock_slowpath() to let it know whether it can re-enable > > interrupts while checking whether _QW_WAITING is set. > > Yes. It seems to make sense to distinguish between write_lock_irq and > write_lock_irqsave and fix this for all of write_lock_irq. I wasn't planning on doing anything here, but Hillf kind of pushed me into it. I think it needs to be something like this. Compile tested only. If it ends up getting used, Signed-off-by: Matthew Wilcox (Oracle) diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h index 75b8f4601b28..1152e080c719 100644 --- a/include/asm-generic/qrwlock.h +++ b/include/asm-generic/qrwlock.h @@ -33,8 +33,8 @@ /* * External function declarations */ -extern void queued_read_lock_slowpath(struct qrwlock *lock); -extern void queued_write_lock_slowpath(struct qrwlock *lock); +void queued_read_lock_slowpath(struct qrwlock *lock); +void queued_write_lock_slowpath(struct qrwlock *lock, bool irq); /** * queued_read_trylock - try to acquire read lock of a queued rwlock @@ -98,7 +98,21 @@ static inline void queued_write_lock(struct qrwlock *lock) if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))) return; - queued_write_lock_slowpath(lock); + queued_write_lock_slowpath(lock, false); +} + +/** + * queued_write_lock_irq - acquire write lock of a queued rwlock + * @lock : Pointer to queued rwlock structure + */ +static inline void queued_write_lock_irq(struct qrwlock *lock) +{ + int cnts = 0; + /* Optimize for the unfair lock case where the fair flag is 0. 
+	if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED)))
+		return;
+
+	queued_write_lock_slowpath(lock, true);
 }
 
 /**
@@ -138,6 +152,7 @@ static inline int queued_rwlock_is_contended(struct qrwlock *lock)
  */
 #define arch_read_lock(l)		queued_read_lock(l)
 #define arch_write_lock(l)		queued_write_lock(l)
+#define arch_write_lock_irq(l)		queued_write_lock_irq(l)
 #define arch_read_trylock(l)		queued_read_trylock(l)
 #define arch_write_trylock(l)		queued_write_trylock(l)
 #define arch_read_unlock(l)		queued_read_unlock(l)
diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index c0ef596f340b..897010b6ba0a 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -33,6 +33,7 @@ do { \
 extern int do_raw_read_trylock(rwlock_t *lock);
 extern void do_raw_read_unlock(rwlock_t *lock) __releases(lock);
 extern void do_raw_write_lock(rwlock_t *lock) __acquires(lock);
+extern void do_raw_write_lock_irq(rwlock_t *lock) __acquires(lock);
 extern int do_raw_write_trylock(rwlock_t *lock);
 extern void do_raw_write_unlock(rwlock_t *lock) __releases(lock);
 #else
@@ -40,6 +41,7 @@
 # define do_raw_read_trylock(rwlock)	arch_read_trylock(&(rwlock)->raw_lock)
 # define do_raw_read_unlock(rwlock)	do {arch_read_unlock(&(rwlock)->raw_lock); __release(lock); } while (0)
 # define do_raw_write_lock(rwlock)	do {__acquire(lock); arch_write_lock(&(rwlock)->raw_lock); } while (0)
+# define do_raw_write_lock_irq(rwlock)	do {__acquire(lock); arch_write_lock_irq(&(rwlock)->raw_lock); } while (0)
 # define do_raw_write_trylock(rwlock)	arch_write_trylock(&(rwlock)->raw_lock)
 # define do_raw_write_unlock(rwlock)	do {arch_write_unlock(&(rwlock)->raw_lock); __release(lock); } while (0)
 #endif
diff --git a/include/linux/rwlock_api_smp.h b/include/linux/rwlock_api_smp.h
index dceb0a59b692..6257976dfb72 100644
--- a/include/linux/rwlock_api_smp.h
+++ b/include/linux/rwlock_api_smp.h
@@ -193,7 +193,7 @@ static inline void __raw_write_lock_irq(rwlock_t *lock)
 	local_irq_disable();
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
+	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock_irq);
 }
 
 static inline void __raw_write_lock_bh(rwlock_t *lock)
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index d2ef312a8611..6c644a71b01d 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -61,9 +61,10 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
 
 /**
  * queued_write_lock_slowpath - acquire write lock of a queued rwlock
- * @lock : Pointer to queued rwlock structure
+ * @lock: Pointer to queued rwlock structure
+ * @irq: True if we can enable interrupts while spinning
  */
-void __lockfunc queued_write_lock_slowpath(struct qrwlock *lock)
+void __lockfunc queued_write_lock_slowpath(struct qrwlock *lock, bool irq)
 {
 	int cnts;
 
@@ -82,7 +83,11 @@ void __lockfunc queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
+		if (irq)
+			local_irq_enable();
 		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+		if (irq)
+			local_irq_disable();
 	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 87b03d2e41db..bf94551d7435 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -212,6 +212,13 @@ void do_raw_write_lock(rwlock_t *lock)
 	debug_write_lock_after(lock);
 }
 
+void do_raw_write_lock_irq(rwlock_t *lock)
+{
+	debug_write_lock_before(lock);
+	arch_write_lock_irq(&lock->raw_lock);
+	debug_write_lock_after(lock);
+}
+
 int do_raw_write_trylock(rwlock_t *lock)
 {
 	int ret = arch_write_trylock(&lock->raw_lock);
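
For the tasklist_lock case that started this thread, nothing changes for
callers: the point is that a writer contending in write_lock_irq() now
waits for readers to drain with interrupts enabled and only re-disables
them for the final acquire. A minimal caller-side sketch (illustrative
only, not part of the patch; the helper name is made up):

/* Hypothetical example of the path this patch affects. */
#include <linux/sched/task.h>	/* tasklist_lock */
#include <linux/spinlock.h>	/* write_lock_irq() */

static void example_tasklist_writer(void)
{
	/*
	 * On contention this ends up in queued_write_lock_slowpath(lock, true),
	 * which spins waiting for _QW_WAITING with interrupts enabled.
	 */
	write_lock_irq(&tasklist_lock);
	/* ... modify the task list with interrupts off ... */
	write_unlock_irq(&tasklist_lock);
}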