Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp340919ybh; Mon, 20 Jul 2020 18:36:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy4McHd/5mtDsrc+36sVA9rStCHt9uaQm0uPQq+x/jTAP68StRfEUOROEw35/NJtpXDL+EV X-Received: by 2002:aa7:da90:: with SMTP id q16mr23102668eds.44.1595295396012; Mon, 20 Jul 2020 18:36:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595295396; cv=none; d=google.com; s=arc-20160816; b=S/RoLB00Qy0cO7SU5GeeM9CCAxVrcTIVJ6RkHnw8ixF8htc+eSiQMuSGFIygoV11Gr 1dSZvMHgk8eUSawf+V6HRdK7qQSDi86+3xNjlHXE5SaPzNH+PnlfAZxzj2N4Bndpm8z4 YGjUMxgBkvR7SEa7UguRx1ghLytypfy21UekLS4msc5u5z6eNDFXxfx4+1orE2FO7XjV LCKmaRbt9iOYupyWXtM5kYp4RTXfG5Mwzi6LGU4fuT56K4EIb0IpMZa6eT7ZLDGe3Ikd y1FC/swjhMphUaWMi2NUWthaAvHXoeuZQKEc/BXDUVpdVHhbHrfb0blxHs7O7e21DYrs S4Ag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=JXH2pop5kFbVoanfp3kSA6NmCkm4dsbPA1ISv3YipCk=; b=p/dTDb6DgJMOQ3xsORXzCdpQ5Pvf7HhP4eDThnwweYRop0hiDwxg2QLiUMIRjPq8Be 52LuBZkVrZfANNfz39Gp0WoiILGMRbS+pL9nXfD0SCg7d2UOb9a0FZNCSg7GwjOIbEsh o26b5Ti9m1mJ3PdlXrXF/nV+Z+3//cZMKw2L/AxEb54CV3fcz5dw8Q2hKB4Z/NRboqoH FvDWtR2HwtZWIfgPFycZ6eCx5XnDJmJCTtd2PYWA9Hou8Jn7FL89g/bRffrDWbSr4Swr j2K27sBocJGrXVKmRHjjwFtfxgIM9+N1gEVNddywtgGnpx5dUBOAueO0AytKwEWdccvS aNqw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e17si11316860ejk.740.2020.07.20.18.36.11; Mon, 20 Jul 2020 18:36:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726173AbgGUBfz (ORCPT + 99 others); Mon, 20 Jul 2020 21:35:55 -0400 Received: from mail.kernel.org ([198.145.29.99]:40794 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725857AbgGUBfz (ORCPT ); Mon, 20 Jul 2020 21:35:55 -0400 Received: from oasis.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 0C57C206F2; Tue, 21 Jul 2020 01:35:52 +0000 (UTC) Date: Mon, 20 Jul 2020 21:35:51 -0400 From: Steven Rostedt To: "Ahmed S. Darwish" Cc: Peter Zijlstra , Ingo Molnar , Will Deacon , Thomas Gleixner , "Paul E. McKenney" , "Sebastian A. Siewior" , LKML , Jonathan Corbet , linux-doc@vger.kernel.org Subject: Re: [PATCH v4 01/24] Documentation: locking: Describe seqlock design and usage Message-ID: <20200720213551.5ba9bc6d@oasis.local.home> In-Reply-To: <20200720155530.1173732-2-a.darwish@linutronix.de> References: <20200519214547.352050-1-a.darwish@linutronix.de> <20200720155530.1173732-1-a.darwish@linutronix.de> <20200720155530.1173732-2-a.darwish@linutronix.de> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 20 Jul 2020 17:55:07 +0200 "Ahmed S. Darwish" wrote: > +Read path, three categories: > + > +1. Normal Sequence readers which never block a writer but they must > + retry if a writer is in progress by detecting change in the sequence > + number. Writers do not wait for a sequence reader:: > + > + do { > + seq = read_seqbegin(&foo_seqlock); > + > + /* ... [[read-side critical section]] ... */ > + > + } while (read_seqretry(&foo_seqlock, seq)); > + > +2. Locking readers which will wait if a writer or another locking reader > + is in progress. A locking reader in progress will also block a writer > + from entering its critical section. This read lock is > + exclusive. Unlike rwlock_t, only one locking reader can acquire it:: Nit, but I would mention that this acts similar to a normal spin_lock, and even disables preeption (in the non-RT case). > + > + read_seqlock_excl(&foo_seqlock); > + > + /* ... [[read-side critical section]] ... */ > + > + read_sequnlock_excl(&foo_seqlock); > + > +3. Conditional lockless reader (as in 1), or locking reader (as in 2), > + according to a passed marker. This is used to avoid lockless readers > + starvation (too much retry loops) in case of a sharp spike in write > + activity. First, a lockless read is tried (even marker passed). If > + that trial fails (odd sequence counter is returned, which is used as > + the next iteration marker), the lockless read is transformed to a > + full locking read and no retry loop is necessary:: > + > + /* marker; even initialization */ > + int seq = 0; > + do { > + read_seqbegin_or_lock(&foo_seqlock, &seq); > + > + /* ... [[read-side critical section]] ... */ > + > + } while (need_seqretry(&foo_seqlock, seq)); > + done_seqretry(&foo_seqlock, seq); > + > + > +API documentation > +================= > + > +.. kernel-doc:: include/linux/seqlock.h > diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h > index 8b97204f35a7..299d68f10325 100644 > --- a/include/linux/seqlock.h > +++ b/include/linux/seqlock.h > @@ -1,36 +1,15 @@ > /* SPDX-License-Identifier: GPL-2.0 */ > #ifndef __LINUX_SEQLOCK_H > #define __LINUX_SEQLOCK_H > + > /* > - * Reader/writer consistent mechanism without starving writers. This type of > - * lock for data where the reader wants a consistent set of information > - * and is willing to retry if the information changes. There are two types > - * of readers: > - * 1. Sequence readers which never block a writer but they may have to retry > - * if a writer is in progress by detecting change in sequence number. > - * Writers do not wait for a sequence reader. > - * 2. Locking readers which will wait if a writer or another locking reader > - * is in progress. A locking reader in progress will also block a writer > - * from going forward. Unlike the regular rwlock, the read lock here is > - * exclusive so that only one locking reader can get it. > + * seqcount_t / seqlock_t - a reader-writer consistency mechanism with > + * lockless readers (read-only retry loops), and no writer starvation. > * > - * This is not as cache friendly as brlock. Also, this may not work well > - * for data that contains pointers, because any writer could > - * invalidate a pointer that a reader was following. > + * See Documentation/locking/seqlock.rst > * > - * Expected non-blocking reader usage: > - * do { > - * seq = read_seqbegin(&foo); > - * ... > - * } while (read_seqretry(&foo, seq)); > - * > - * > - * On non-SMP the spin locks disappear but the writer still needs > - * to increment the sequence variables because an interrupt routine could > - * change the state of the data. > - * > - * Based on x86_64 vsyscall gettimeofday > - * by Keith Owens and Andrea Arcangeli > + * Copyrights: > + * - Based on x86_64 vsyscall gettimeofday: Keith Owens, Andrea Arcangeli > */ > > #include > @@ -41,8 +20,8 @@ > #include > > /* > - * The seqlock interface does not prescribe a precise sequence of read > - * begin/retry/end. For readers, typically there is a call to > + * The seqlock seqcount_t interface does not prescribe a precise sequence of > + * read begin/retry/end. For readers, typically there is a call to > * read_seqcount_begin() and read_seqcount_retry(), however, there are more > * esoteric cases which do not follow this pattern. > * > @@ -50,16 +29,30 @@ > * via seqcount_t under KCSAN: upon beginning a seq-reader critical section, > * pessimistically mark the next KCSAN_SEQLOCK_REGION_MAX memory accesses as > * atomics; if there is a matching read_seqcount_retry() call, no following > - * memory operations are considered atomic. Usage of seqlocks via seqlock_t > - * interface is not affected. > + * memory operations are considered atomic. Usage of the seqlock_t interface > + * is not affected. > */ > #define KCSAN_SEQLOCK_REGION_MAX 1000 > > /* > - * Version using sequence counter only. > - * This can be used when code has its own mutex protecting the > - * updating starting before the write_seqcountbeqin() and ending > - * after the write_seqcount_end(). > + * Sequence counters (seqcount_t) > + * > + * This is the raw counting mechanism, without any writer protection. > + * > + * Write side critical sections must be serialized and non-preemptible. > + * > + * If readers can be invoked from hardirq or softirq contexts, > + * interrupts or bottom halves must also be respectively disabled before > + * entering the write section. > + * > + * This mechanism can't be used if the protected data contains pointers, > + * as the writer can invalidate a pointer that a reader is following. > + * > + * If it's desired to automatically handle the sequence counter writer > + * serialization and non-preemptibility requirements, use a sequential > + * lock (seqlock_t) instead. > + * > + * See Documentation/locking/seqlock.rst > */ > typedef struct seqcount { > unsigned sequence; > @@ -398,10 +391,6 @@ static inline void raw_write_seqcount_latch(seqcount_t *s) > smp_wmb(); /* increment "sequence" before following stores */ > } > > -/* > - * Sequence counter only version assumes that callers are using their > - * own mutexing. > - */ > static inline void write_seqcount_begin_nested(seqcount_t *s, int subclass) > { > raw_write_seqcount_begin(s); > @@ -434,15 +423,21 @@ static inline void write_seqcount_invalidate(seqcount_t *s) > kcsan_nestable_atomic_end(); > } > > +/* > + * Sequential locks (seqlock_t) > + * > + * Sequence counters with an embedded spinlock for writer serialization > + * and non-preemptibility. > + * > + * For more info, see: > + * - Comments on top of seqcount_t > + * - Documentation/locking/seqlock.rst > + */ > typedef struct { > struct seqcount seqcount; > spinlock_t lock; > } seqlock_t; > > -/* > - * These macros triggered gcc-3.x compile-time problems. We think these are > - * OK now. Be cautious. > - */ > #define __SEQLOCK_UNLOCKED(lockname) \ > { \ > .seqcount = SEQCNT_ZERO(lockname), \