Date: Wed, 4 Nov 2020 02:32:12 +0100
From: "Ahmed S. Darwish"
To: Linus Torvalds
Cc: John Hubbard, Jason Gunthorpe, Peter Xu, Linux Kernel Mailing List,
 Andrea Arcangeli, Andrew Morton, "Aneesh Kumar K.V", Christoph Hellwig,
 Hugh Dickins, Jan Kara, Jann Horn, Kirill Shutemov, Kirill Tkhai,
 Leon Romanovsky, Linux-MM, Michal Hocko, Oleg Nesterov, Peter Zijlstra,
 Ingo Molnar, Will Deacon, Thomas Gleixner, Sebastian Siewior
Subject: Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork
Message-ID: <20201104013212.GA82153@lx-t490>
References: <0-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
 <2-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
 <20201030225250.GB6357@xz-x1>
 <20201030235121.GQ2620339@nvidia.com>
 <20201103001712.GB52235@lx-t490>
 <20201103002532.GL2620339@nvidia.com>
 <20201103004133.GD52235@lx-t490>
 <20201103065225.GA63301@lx-t490>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 03, 2020 at 09:40:22AM -0800, Linus Torvalds wrote:
> On Mon, Nov 2, 2020 at 10:52 PM Ahmed S. Darwish wrote:
> >
> > The problem is, I've already documented seqlock.h to death.... There are
> > more comments than code in there, and there is "seqlock.rst" under
> > Documentation/ to further describe the big picture.
>
> Well, honestly, I think the correct thing to do is to get rid of the
> *_seqcount_t_*() functions entirely.
>
> They add nothing but confusion, and they are entirely misnamed.
> That's
> not the pattern we use for "internal use only" functions, and they are
> *very* confusing.
>

I see. Would the enclosed patch #1 be OK? It basically uses the "__do_"
prefix instead, with some rationale.

> They have other issues too: like raw_write_seqcount_end() not being
> usable on its own when preemptibility isn't an issue like here. You
> basically _have_ to use raw_write_seqcount_t_end(), because otherwise
> it tries to re-enable preemption that was never there.
>

Hmmm, raw_write_seqcount_{begin,end}() *never* disable/enable preemption
for plain seqcount_t. This is why I kept recommending those for this
patch series instead of the internal raw_write_seqcount_*t*_{begin,end}().

But..... given that multiple people made the exact same remark by now, I
guess that's due to:

  #define raw_write_seqcount_begin(s)			\
  do {							\
	if (__seqcount_lock_preemptible(s))		\
		preempt_disable();			\
							\
	...						\
  } while (0)

  #define raw_write_seqcount_end(s)			\
  do {							\
	...						\
							\
	if (__seqcount_lock_preemptible(s))		\
		preempt_enable();			\
  } while (0)

The tricky part is that __seqcount_lock_preemptible() is always false
for plain "seqcount_t". With that data type, the _Generic() selection
makes it resolve to __seqprop_preemptible(), which just returns false.

Originally, __seqcount_lock_preemptible() was called:

  __seqcount_associated_lock_exists_and_is_preemptible()

but it got transformed to its current short form in the process of some
pre-mainline refactorings. Looking at it now after all the dust has
settled, maybe the verbose form was much more clear.

Please see the enclosed patch #2... Would that be OK too?

(I will submit these two patches in their own thread after some common
ground is reached.)
Patches
-------

====>
====> patch #1:
====>

Subject: [RFC][PATCH 1/2] seqlock: Use __do_ prefix instead of non-standard _seqcount_t_ marker

The use of "*_seqcount_t_*" as a marker to denote internal seqlock.h
functions taking only plain seqcount_t instead of the whole
seqcount_LOCKNAME_t family is confusing users, as it's also not the
standard kernel pattern for denoting header file internal functions.

Use the __do_ prefix instead.

Note, a plain "__" prefix is not used since seqlock.h already uses it
for some of its exported functions; e.g. __read_seqcount_begin() and
__read_seqcount_retry().

Reported-by: Jason Gunthorpe
Reported-by: John Hubbard
Reported-by: Linus Torvalds
Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Signed-off-by: Ahmed S. Darwish
---
 include/linux/seqlock.h | 62 ++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index cbfc78b92b65..5de043841d33 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -425,9 +425,9 @@ SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mu
  * Return: true if a read section retry is required, else false
  */
 #define __read_seqcount_retry(s, start)					\
-	__read_seqcount_t_retry(__seqcount_ptr(s), start)
+	__do___read_seqcount_retry(__seqcount_ptr(s), start)

-static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do___read_seqcount_retry(const seqcount_t *s, unsigned start)
 {
 	kcsan_atomic_next(0);
 	return unlikely(READ_ONCE(s->sequence) != start);
@@ -445,12 +445,12 @@ static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
  * Return: true if a read section retry is required, else false
  */
 #define read_seqcount_retry(s, start)					\
-	read_seqcount_t_retry(__seqcount_ptr(s), start)
+	__do_read_seqcount_retry(__seqcount_ptr(s), start)

-static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
 {
 	smp_rmb();
-	return __read_seqcount_t_retry(s, start);
+	return __do___read_seqcount_retry(s, start);
 }

 /**
@@ -462,10 +462,10 @@ do {								\
 	if (__seqcount_lock_preemptible(s))			\
 		preempt_disable();				\
 								\
-	raw_write_seqcount_t_begin(__seqcount_ptr(s));		\
+	__do_raw_write_seqcount_begin(__seqcount_ptr(s));	\
 } while (0)

-static inline void raw_write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
 {
 	kcsan_nestable_atomic_begin();
 	s->sequence++;
@@ -478,13 +478,13 @@ static inline void raw_write_seqcount_t_begin(seqcount_t *s)
  */
 #define raw_write_seqcount_end(s)				\
 do {								\
-	raw_write_seqcount_t_end(__seqcount_ptr(s));		\
+	__do_raw_write_seqcount_end(__seqcount_ptr(s));		\
 								\
 	if (__seqcount_lock_preemptible(s))			\
 		preempt_enable();				\
 } while (0)

-static inline void raw_write_seqcount_t_end(seqcount_t *s)
+static inline void __do_raw_write_seqcount_end(seqcount_t *s)
 {
 	smp_wmb();
 	s->sequence++;
@@ -506,12 +506,12 @@ do {								\
 	if (__seqcount_lock_preemptible(s))			\
 		preempt_disable();				\
 								\
-	write_seqcount_t_begin_nested(__seqcount_ptr(s), subclass); \
+	__do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass); \
 } while (0)

-static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
+static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
 {
-	raw_write_seqcount_t_begin(s);
+	__do_raw_write_seqcount_begin(s);
 	seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
 }

@@ -533,12 +533,12 @@ do {								\
 	if (__seqcount_lock_preemptible(s))			\
 		preempt_disable();				\
 								\
-	write_seqcount_t_begin(__seqcount_ptr(s));		\
+	__do_write_seqcount_begin(__seqcount_ptr(s));		\
 } while (0)

-static inline void write_seqcount_t_begin(seqcount_t *s)
+static inline void __do_write_seqcount_begin(seqcount_t *s)
 {
-	write_seqcount_t_begin_nested(s, 0);
+	__do_write_seqcount_begin_nested(s, 0);
 }

 /**
@@ -549,16 +549,16 @@ static inline void write_seqcount_t_begin(seqcount_t *s)
  */
 #define write_seqcount_end(s)					\
 do {								\
-	write_seqcount_t_end(__seqcount_ptr(s));		\
+	__do_write_seqcount_end(__seqcount_ptr(s));		\
 								\
 	if (__seqcount_lock_preemptible(s))			\
 		preempt_enable();				\
 } while (0)

-static inline void write_seqcount_t_end(seqcount_t *s)
+static inline void __do_write_seqcount_end(seqcount_t *s)
 {
 	seqcount_release(&s->dep_map, _RET_IP_);
-	raw_write_seqcount_t_end(s);
+	__do_raw_write_seqcount_end(s);
 }

 /**
@@ -603,9 +603,9 @@ static inline void write_seqcount_t_end(seqcount_t *s)
  * }
  */
 #define raw_write_seqcount_barrier(s)				\
-	raw_write_seqcount_t_barrier(__seqcount_ptr(s))
+	__do_raw_write_seqcount_barrier(__seqcount_ptr(s))

-static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
+static inline void __do_raw_write_seqcount_barrier(seqcount_t *s)
 {
 	kcsan_nestable_atomic_begin();
 	s->sequence++;
@@ -623,9 +623,9 @@ static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
  * will complete successfully and see data older than this.
  */
 #define write_seqcount_invalidate(s)				\
-	write_seqcount_t_invalidate(__seqcount_ptr(s))
+	__do_write_seqcount_invalidate(__seqcount_ptr(s))

-static inline void write_seqcount_t_invalidate(seqcount_t *s)
+static inline void __do_write_seqcount_invalidate(seqcount_t *s)
 {
 	smp_wmb();
 	kcsan_nestable_atomic_begin();
@@ -865,7 +865,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 }

 /*
- * For all seqlock_t write side functions, use write_seqcount_*t*_begin()
+ * For all seqlock_t write side functions, use __do_write_seqcount_begin()
  * instead of the generic write_seqcount_begin(). This way, no redundant
  * lockdep_assert_held() checks are added.
  */
@@ -886,7 +886,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 static inline void write_seqlock(seqlock_t *sl)
 {
 	spin_lock(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -898,7 +898,7 @@ static inline void write_seqlock(seqlock_t *sl)
  */
 static inline void write_sequnlock(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock(&sl->lock);
 }

@@ -912,7 +912,7 @@ static inline void write_sequnlock(seqlock_t *sl)
 static inline void write_seqlock_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -925,7 +925,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
  */
 static inline void write_sequnlock_bh(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_bh(&sl->lock);
 }

@@ -939,7 +939,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
 static inline void write_seqlock_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 }

 /**
@@ -951,7 +951,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
  */
 static inline void write_sequnlock_irq(seqlock_t *sl)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_irq(&sl->lock);
 }

@@ -960,7 +960,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 	unsigned long flags;

 	spin_lock_irqsave(&sl->lock, flags);
-	write_seqcount_t_begin(&sl->seqcount.seqcount);
+	__do_write_seqcount_begin(&sl->seqcount.seqcount);
 	return flags;
 }

@@ -989,7 +989,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 static inline void
 write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
 {
-	write_seqcount_t_end(&sl->seqcount.seqcount);
+	__do_write_seqcount_end(&sl->seqcount.seqcount);
 	spin_unlock_irqrestore(&sl->lock, flags);
 }

====>
====> patch #2:
====>

Subject: [PATCH 2/2] seqlock: seqcount_LOCKNAME_t: Use more verbose macro names

As evidenced by multiple discussions over LKML so far, it's not clear
that __seqcount_lock_preemptible() is always false for plain
seqcount_t. For that data type, the _Generic() selection resolves to
__seqprop_preemptible(), which just returns false.

Use __seqcount_associated_lock_exists_and_is_preemptible() instead,
which hints that "preemptibility" is for the associated write
serialization lock (if any), not for the seqcount itself.

Similarly, rename __seqcount_assert_lock_held() to
__seqcount_assert_associated_lock_held().

Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201030235121.GQ2620339@nvidia.com
Link: https://lkml.kernel.org/r/20201103170327.GJ20600@xz-x1
Signed-off-by: Ahmed S. Darwish
---
 include/linux/seqlock.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 5de043841d33..eb1e5a822e44 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -307,10 +307,10 @@ SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mu
 	__seqprop_case((s),	mutex,		prop),			\
 	__seqprop_case((s),	ww_mutex,	prop))

-#define __seqcount_ptr(s)		__seqprop(s, ptr)
-#define __seqcount_sequence(s)		__seqprop(s, sequence)
-#define __seqcount_lock_preemptible(s)	__seqprop(s, preemptible)
-#define __seqcount_assert_lock_held(s)	__seqprop(s, assert)
+#define __seqcount_ptr(s)						__seqprop(s, ptr)
+#define __seqcount_sequence(s)						__seqprop(s, sequence)
+#define __seqcount_associated_lock_exists_and_is_preemptible(s)	__seqprop(s, preemptible)
+#define __seqcount_assert_associated_lock_held(s)			__seqprop(s, assert)

 /**
  * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
@@ -459,7 +459,7 @@ static inline int __do_read_seqcount_retry(const seqcount_t *s, unsigned start)
  */
 #define raw_write_seqcount_begin(s)					\
 do {									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_raw_write_seqcount_begin(__seqcount_ptr(s));		\
@@ -480,7 +480,7 @@ static inline void __do_raw_write_seqcount_begin(seqcount_t *s)
 do {									\
 	__do_raw_write_seqcount_end(__seqcount_ptr(s));			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_enable();					\
 } while (0)

@@ -501,9 +501,9 @@ static inline void __do_raw_write_seqcount_end(seqcount_t *s)
  */
 #define write_seqcount_begin_nested(s, subclass)			\
 do {									\
-	__seqcount_assert_lock_held(s);					\
+	__seqcount_assert_associated_lock_held(s);			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_write_seqcount_begin_nested(__seqcount_ptr(s), subclass);	\
@@ -528,9 +528,9 @@ static inline void __do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
  */
 #define write_seqcount_begin(s)						\
 do {									\
-	__seqcount_assert_lock_held(s);					\
+	__seqcount_assert_associated_lock_held(s);			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_disable();					\
 									\
 	__do_write_seqcount_begin(__seqcount_ptr(s));			\
@@ -551,7 +551,7 @@ static inline void __do_write_seqcount_begin(seqcount_t *s)
 do {									\
 	__do_write_seqcount_end(__seqcount_ptr(s));			\
 									\
-	if (__seqcount_lock_preemptible(s))				\
+	if (__seqcount_associated_lock_exists_and_is_preemptible(s))	\
 		preempt_enable();					\
 } while (0)

> Linus

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH