2020-12-06 16:25:20

by Ahmed S. Darwish

Subject: [PATCH -tip v1 0/3] seqlock: assorted cleanups

Hi,

When the seqcount_LOCKNAME_t group of data types was introduced, two
classes of seqlock.h sequence counter macros were added:

- An external public API which can either take a plain seqcount_t or
any of the seqcount_LOCKNAME_t variants.

- An internal API which takes only a plain seqcount_t.

To distinguish between the two groups, the "*_seqcount_t_*" pattern was
used for the latter. This confused a number of mm/ call-site developers,
and Linus also commented that this was not the standard practice for
marking kernel internal APIs. [1]

Distinguish the latter group of macros by prefixing them with "do_".
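
For illustration, the resulting layering for one macro pair, condensed
from patch 2/3 below (seqprop_ptr() resolves a plain seqcount_t or any
seqcount_LOCKNAME_t to its embedded plain seqcount_t):

  /* Public API: takes a plain seqcount_t or any seqcount_LOCKNAME_t */
  #define read_seqcount_retry(s, start) \
          do_read_seqcount_retry(seqprop_ptr(s), start)

  /* Internal API: takes only a plain seqcount_t, hence the "do_" prefix */
  static inline int do_read_seqcount_retry(const seqcount_t *s, unsigned start)
  {
          smp_rmb();
          return do___read_seqcount_retry(s, start);
  }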

A number of call-site developers also complained that the automatic
preemption disable/enable for the write side macros was not obvious or
documented. [2] Linus also suggested adding a few comments explaining that
behavior. [3] Fix it by completing the seqcount write side kernel-doc
annotations.
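
For example, with an associated preemptible lock (a minimal sketch; with
a plain seqcount_t, the caller must instead disable preemption itself):

  static DEFINE_MUTEX(data_lock);
  static seqcount_mutex_t data_seq = SEQCNT_MUTEX_ZERO(data_seq, &data_lock);

  static void data_update(void)
  {
          mutex_lock(&data_lock);
          /* The associated mutex is preemptible: preemption gets disabled */
          write_seqcount_begin(&data_seq);
          /* ... update the seqcount-protected data ... */
          write_seqcount_end(&data_seq);  /* ... and re-enabled here */
          mutex_unlock(&data_lock);
  }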

Finally, fix a minor naming inconsistency w.r.t. seqlock.h
vs. Documentation/locking/seqlock.rst.

This series does not change the resulting "allyesconfig" kernel binaries:

text data bss ... filename
247616963 289662125 81498728 ... ../build-x86-64/vmlinux.old
247616963 289662125 81498728 ... ../build-x86-64/vmlinux
145054028 78270273 18435468 ... ../build-arm/vmlinux.old
145054028 78270273 18435468 ... ../build-arm/vmlinux

Note: this series is based on -tip locking/core, instead of the latest
-rc, due to -tip commit ab440b2c604b ("seqlock: Rename __seqprop() users").

References:

[1] https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
[2] https://lkml.kernel.org/r/[email protected]
[3] https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com

8<--------------

Ahmed S. Darwish (3):
Documentation: seqlock: s/LOCKTYPE/LOCKNAME/g
seqlock: Prefix internal seqcount_t-only macros with a "do_"
seqlock: kernel-doc: Specify when preemption is automatically altered

Documentation/locking/seqlock.rst | 21 ++++----
include/linux/seqlock.h | 83 ++++++++++++++++---------------
2 files changed, 54 insertions(+), 50 deletions(-)

base-commit: 97d62caa32d6d79dadae3f8d19af5c92ea9a589a
--
2.29.2


2020-12-06 16:25:45

by Ahmed S. Darwish

Subject: [PATCH -tip v1 2/3] seqlock: Prefix internal seqcount_t-only macros with a "do_"

When the seqcount_LOCKNAME_t group of data types was introduced, two
classes of seqlock.h sequence counter macros were added:

- An external public API which can either take a plain seqcount_t or
any of the seqcount_LOCKNAME_t variants.

- An internal API which takes only a plain seqcount_t.

To distinguish between the two groups, the "*_seqcount_t_*" pattern was
used for the latter. This confused a number of mm/ call-site developers,
and Linus also commented that it was not a standard practice for marking
seqlock.h internal APIs.

Distinguish the latter group of macros by prefixing them with "do_".

Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <[email protected]>
---
include/linux/seqlock.h | 66 ++++++++++++++++++++---------------------
1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index d89134c74fba..235cbc65fd71 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -425,9 +425,9 @@ SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mu
* Return: true if a read section retry is required, else false
*/
#define __read_seqcount_retry(s, start) \
- __read_seqcount_t_retry(seqprop_ptr(s), start)
+ do___read_seqcount_retry(seqprop_ptr(s), start)

-static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int do___read_seqcount_retry(const seqcount_t *s, unsigned start)
{
kcsan_atomic_next(0);
return unlikely(READ_ONCE(s->sequence) != start);
@@ -445,12 +445,12 @@ static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
* Return: true if a read section retry is required, else false
*/
#define read_seqcount_retry(s, start) \
- read_seqcount_t_retry(seqprop_ptr(s), start)
+ do_read_seqcount_retry(seqprop_ptr(s), start)

-static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)
+static inline int do_read_seqcount_retry(const seqcount_t *s, unsigned start)
{
smp_rmb();
- return __read_seqcount_t_retry(s, start);
+ return do___read_seqcount_retry(s, start);
}

/**
@@ -462,10 +462,10 @@ do { \
if (seqprop_preemptible(s)) \
preempt_disable(); \
\
- raw_write_seqcount_t_begin(seqprop_ptr(s)); \
+ do_raw_write_seqcount_begin(seqprop_ptr(s)); \
} while (0)

-static inline void raw_write_seqcount_t_begin(seqcount_t *s)
+static inline void do_raw_write_seqcount_begin(seqcount_t *s)
{
kcsan_nestable_atomic_begin();
s->sequence++;
@@ -478,13 +478,13 @@ static inline void raw_write_seqcount_t_begin(seqcount_t *s)
*/
#define raw_write_seqcount_end(s) \
do { \
- raw_write_seqcount_t_end(seqprop_ptr(s)); \
+ do_raw_write_seqcount_end(seqprop_ptr(s)); \
\
if (seqprop_preemptible(s)) \
preempt_enable(); \
} while (0)

-static inline void raw_write_seqcount_t_end(seqcount_t *s)
+static inline void do_raw_write_seqcount_end(seqcount_t *s)
{
smp_wmb();
s->sequence++;
@@ -506,12 +506,12 @@ do { \
if (seqprop_preemptible(s)) \
preempt_disable(); \
\
- write_seqcount_t_begin_nested(seqprop_ptr(s), subclass); \
+ do_write_seqcount_begin_nested(seqprop_ptr(s), subclass); \
} while (0)

-static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
+static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
{
- raw_write_seqcount_t_begin(s);
+ do_raw_write_seqcount_begin(s);
seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
}

@@ -533,12 +533,12 @@ do { \
if (seqprop_preemptible(s)) \
preempt_disable(); \
\
- write_seqcount_t_begin(seqprop_ptr(s)); \
+ do_write_seqcount_begin(seqprop_ptr(s)); \
} while (0)

-static inline void write_seqcount_t_begin(seqcount_t *s)
+static inline void do_write_seqcount_begin(seqcount_t *s)
{
- write_seqcount_t_begin_nested(s, 0);
+ do_write_seqcount_begin_nested(s, 0);
}

/**
@@ -549,16 +549,16 @@ static inline void write_seqcount_t_begin(seqcount_t *s)
*/
#define write_seqcount_end(s) \
do { \
- write_seqcount_t_end(seqprop_ptr(s)); \
+ do_write_seqcount_end(seqprop_ptr(s)); \
\
if (seqprop_preemptible(s)) \
preempt_enable(); \
} while (0)

-static inline void write_seqcount_t_end(seqcount_t *s)
+static inline void do_write_seqcount_end(seqcount_t *s)
{
seqcount_release(&s->dep_map, _RET_IP_);
- raw_write_seqcount_t_end(s);
+ do_raw_write_seqcount_end(s);
}

/**
@@ -603,9 +603,9 @@ static inline void write_seqcount_t_end(seqcount_t *s)
* }
*/
#define raw_write_seqcount_barrier(s) \
- raw_write_seqcount_t_barrier(seqprop_ptr(s))
+ do_raw_write_seqcount_barrier(seqprop_ptr(s))

-static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
+static inline void do_raw_write_seqcount_barrier(seqcount_t *s)
{
kcsan_nestable_atomic_begin();
s->sequence++;
@@ -623,9 +623,9 @@ static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
* will complete successfully and see data older than this.
*/
#define write_seqcount_invalidate(s) \
- write_seqcount_t_invalidate(seqprop_ptr(s))
+ do_write_seqcount_invalidate(seqprop_ptr(s))

-static inline void write_seqcount_t_invalidate(seqcount_t *s)
+static inline void do_write_seqcount_invalidate(seqcount_t *s)
{
smp_wmb();
kcsan_nestable_atomic_begin();
@@ -865,9 +865,9 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
}

/*
- * For all seqlock_t write side functions, use write_seqcount_*t*_begin()
- * instead of the generic write_seqcount_begin(). This way, no redundant
- * lockdep_assert_held() checks are added.
+ * For all seqlock_t write side functions, use the internal
+ * do_write_seqcount_begin() instead of generic write_seqcount_begin().
+ * This way, no redundant lockdep_assert_held() checks are added.
*/

/**
@@ -886,7 +886,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
static inline void write_seqlock(seqlock_t *sl)
{
spin_lock(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ do_write_seqcount_begin(&sl->seqcount.seqcount);
}

/**
@@ -898,7 +898,7 @@ static inline void write_seqlock(seqlock_t *sl)
*/
static inline void write_sequnlock(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock(&sl->lock);
}

@@ -912,7 +912,7 @@ static inline void write_sequnlock(seqlock_t *sl)
static inline void write_seqlock_bh(seqlock_t *sl)
{
spin_lock_bh(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ do_write_seqcount_begin(&sl->seqcount.seqcount);
}

/**
@@ -925,7 +925,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
*/
static inline void write_sequnlock_bh(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_bh(&sl->lock);
}

@@ -939,7 +939,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
static inline void write_seqlock_irq(seqlock_t *sl)
{
spin_lock_irq(&sl->lock);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ do_write_seqcount_begin(&sl->seqcount.seqcount);
}

/**
@@ -951,7 +951,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
*/
static inline void write_sequnlock_irq(seqlock_t *sl)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_irq(&sl->lock);
}

@@ -960,7 +960,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
unsigned long flags;

spin_lock_irqsave(&sl->lock, flags);
- write_seqcount_t_begin(&sl->seqcount.seqcount);
+ do_write_seqcount_begin(&sl->seqcount.seqcount);
return flags;
}

@@ -989,7 +989,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
static inline void
write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
{
- write_seqcount_t_end(&sl->seqcount.seqcount);
+ do_write_seqcount_end(&sl->seqcount.seqcount);
spin_unlock_irqrestore(&sl->lock, flags);
}

--
2.29.2

2020-12-06 16:26:41

by Ahmed S. Darwish

Subject: [PATCH -tip v1 3/3] seqlock: kernel-doc: Specify when preemption is automatically altered

The kernel-doc annotations for the sequence counter write side functions
are incomplete: they do not specify when preemption is automatically
disabled and re-enabled.

This has confused a number of call-site developers. Fix it.
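
In short, per the preemptibility flags of the SEQCOUNT_LOCKNAME()
instantiations in seqlock.h:

  seqcount_t          s1; /* No associated lock: callers must serialize
                           * writers and disable preemption themselves. */
  seqcount_spinlock_t s2; /* Spinlocks already disable preemption:
                           * write_seqcount_begin() alters nothing. */
  seqcount_mutex_t    s3; /* Mutexes are preemptible:
                           * write_seqcount_begin() disables preemption,
                           * write_seqcount_end() re-enables it. */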

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: Jonathan Corbet <[email protected]>
---
include/linux/seqlock.h | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 235cbc65fd71..2f7bb92b4c9e 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -456,6 +456,8 @@ static inline int do_read_seqcount_retry(const seqcount_t *s, unsigned start)
/**
* raw_write_seqcount_begin() - start a seqcount_t write section w/o lockdep
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
+ *
+ * Context: check write_seqcount_begin()
*/
#define raw_write_seqcount_begin(s) \
do { \
@@ -475,6 +477,8 @@ static inline void do_raw_write_seqcount_begin(seqcount_t *s)
/**
* raw_write_seqcount_end() - end a seqcount_t write section w/o lockdep
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
+ *
+ * Context: check write_seqcount_end()
*/
#define raw_write_seqcount_end(s) \
do { \
@@ -498,6 +502,7 @@ static inline void do_raw_write_seqcount_end(seqcount_t *s)
* @subclass: lockdep nesting level
*
* See Documentation/locking/lockdep-design.rst
+ * Context: check write_seqcount_begin()
*/
#define write_seqcount_begin_nested(s, subclass) \
do { \
@@ -519,11 +524,10 @@ static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
* write_seqcount_begin() - start a seqcount_t write side critical section
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
*
- * write_seqcount_begin opens a write side critical section of the given
- * seqcount_t.
- *
- * Context: seqcount_t write side critical sections must be serialized and
- * non-preemptible. If readers can be invoked from hardirq or softirq
+ * Context: sequence counter write side sections must be serialized and
+ * non-preemptible. Preemption will be automatically disabled if and
+ * only if the seqcount write serialization lock is associated, and
+ * preemptible. If readers can be invoked from hardirq or softirq
* context, interrupts or bottom halves must be respectively disabled.
*/
#define write_seqcount_begin(s) \
@@ -545,7 +549,8 @@ static inline void do_write_seqcount_begin(seqcount_t *s)
* write_seqcount_end() - end a seqcount_t write side critical section
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
*
- * The write section must've been opened with write_seqcount_begin().
+ * Context: Preemption will be automatically re-enabled if and only if
+ * the seqcount write serialization lock is associated, and preemptible.
*/
#define write_seqcount_end(s) \
do { \
--
2.29.2

2020-12-07 14:10:27

by Peter Zijlstra

Subject: Re: [PATCH -tip v1 0/3] seqlock: assorted cleanups

On Sun, Dec 06, 2020 at 05:21:40PM +0100, Ahmed S. Darwish wrote:
> Ahmed S. Darwish (3):
> Documentation: seqlock: s/LOCKTYPE/LOCKNAME/g
> seqlock: Prefix internal seqcount_t-only macros with a "do_"
> seqlock: kernel-doc: Specify when preemption is automatically altered

Thanks!

2020-12-07 20:42:38

by Jason Gunthorpe

Subject: Re: [PATCH -tip v1 2/3] seqlock: Prefix internal seqcount_t-only macros with a "do_"

On Sun, Dec 06, 2020 at 05:21:42PM +0100, Ahmed S. Darwish wrote:
> When the seqcount_LOCKNAME_t group of data types was introduced, two
> classes of seqlock.h sequence counter macros were added:
>
> - An external public API which can either take a plain seqcount_t or
> any of the seqcount_LOCKNAME_t variants.
>
> - An internal API which takes only a plain seqcount_t.
>
> To distinguish between the two groups, the "*_seqcount_t_*" pattern was
> used for the latter. This confused a number of mm/ call-site developers,
> and Linus also commented that it was not a standard practice for marking
> seqlock.h internal APIs.
>
> Distinguish the latter group of macros by prefixing them with "do_".
>
> Link: https://lkml.kernel.org/r/CAHk-=wgB8nyOQufpn0o6a5BpJCJPnXvH+kRxApujhsgG+7qAwQ@mail.gmail.com
> Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com
> Signed-off-by: Ahmed S. Darwish <[email protected]>
> ---
> include/linux/seqlock.h | 66 ++++++++++++++++++++---------------------
> 1 file changed, 33 insertions(+), 33 deletions(-)

It is clearer, thank you

Jason

2020-12-07 20:47:01

by Jason Gunthorpe

Subject: Re: [PATCH -tip v1 3/3] seqlock: kernel-doc: Specify when preemption is automatically altered

On Sun, Dec 06, 2020 at 05:21:43PM +0100, Ahmed S. Darwish wrote:
> @@ -519,11 +524,10 @@ static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
> * write_seqcount_begin() - start a seqcount_t write side critical section
> * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
> *
> - * write_seqcount_begin opens a write side critical section of the given
> - * seqcount_t.
> - *
> - * Context: seqcount_t write side critical sections must be serialized and
> - * non-preemptible. If readers can be invoked from hardirq or softirq
> + * Context: sequence counter write side sections must be serialized and
> + * non-preemptible. Preemption will be automatically disabled if and
> + * only if the seqcount write serialization lock is associated, and
> + * preemptible. If readers can be invoked from hardirq or softirq
> * context, interrupts or bottom halves must be respectively disabled.
> */

The thing that was confusing was whether it was appropriate to use a
seqcount in a case where write side preemption was not disabled - which
is safe only if the read side doesn't spin.

We seem to have only two places that do this, but since this comment
reads like it is absolutely forbidden, it is still confusing..

To make it clear, a read side API to work with the seqlock for
non-preemption cases would be nice; then the language could be 'must be
non-preemptible if using read_seqcount_retry(), but if using NEWTHING
then it is not required'.
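
Such a non-spinning read side might look like this (struct and function
names are invented):

  static int data_read(struct foo *foo)
  {
          unsigned int seq = raw_read_seqcount(&foo->seqcount);
          int ret;

          /* One lockless attempt, but only if no writer is in progress */
          if (!(seq & 1)) {
                  ret = data_read_lockless(foo);
                  if (!read_seqcount_retry(&foo->seqcount, seq))
                          return ret;
          }

          /* Fall back to the writer's lock instead of spinning */
          mutex_lock(&foo->lock);
          ret = data_read_locked(foo);
          mutex_unlock(&foo->lock);
          return ret;
  }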

Jason

2020-12-08 14:36:40

by Ahmed S. Darwish

Subject: Re: [PATCH -tip v1 3/3] seqlock: kernel-doc: Specify when preemption is automatically altered

Hi Jason,

On Mon, Dec 07, 2020 at 04:43:16PM -0400, Jason Gunthorpe wrote:
...
>
> The thing that was confusing was whether it was appropriate to use a
> seqcount in a case where write side preemption was not disabled - which
> is safe only if the read side doesn't spin.
>

No, that's not correct.

What was confusing was that *everyone* in that mm/ thread, including
yourself, thought that write_seqcount_begin() would automatically disable
preemption for plain seqcount_t:

https://lkml.kernel.org/r/[email protected]

Quoting Peter Xu: "My understanding is that we used
raw_write_seqcount_t_begin() because we're with spin lock so assuming we
disabled preemption already."

Quoting you: "write_seqcount_begin - Enforces preemption off".

And that's why this patch explicitly adds to the kernel-doc: "Preemption
will be automatically disabled if and only if the seqcount write
serialization lock is associated, and preemptible."

Honestly, I should've added that statement anyway in my original
"sequence counters with associated locks" submission. I take the blame
for the resulting confusion, and I'm sorry...

~~~~~~~~~~~~~~~~~~~~

Now, regarding the other point: the seqcount_t write side does not need
to disable preemption if the reader does not spin. We've already
discussed that (at length) last time.

There comes a point where you decide what level of documentation to add,
and what level to skip.

Because in the end, you don't want to confuse "Joe, the generic driver
developer" with too much details that's not relevant to their task at
hand.

You want to keep the general case simple, but the special case doable.
And you also want to encourage people to use the standard write side
critical section entry and exit functions, as much as possible.

Because, oh, look at that:

[PATCH v2 0/6] seqlock: seqcount_t call sites bugfixes
https://lkml.kernel.org/r/[email protected]

Sequence counters are already hard to get right. And driver developers
get things wrong, a lot -- you don't want to confuse them even further.

Developers who're advanced enough to know the difference don't need the
kernel-doc anyway. And that's why I've kindly asked to add the
following to your mm/ patch (which you did, thanks):

/*
* Disabling preemption is not needed for the write side, as
* the read side does not spin, but goes to mmap_lock.
* ...
*/

And IMHO, that should be enough. Developers of such special cases are
already assumed to know what they're doing.

Thanks a lot,

--
Ahmed S. Darwish
Linutronix GmbH

2020-12-08 15:51:19

by Jason Gunthorpe

Subject: Re: [PATCH -tip v1 3/3] seqlock: kernel-doc: Specify when preemption is automatically altered

On Tue, Dec 08, 2020 at 03:31:39PM +0100, Ahmed S. Darwish wrote:
> Hi Jason,
>
> On Mon, Dec 07, 2020 at 04:43:16PM -0400, Jason Gunthorpe wrote:
> ...
> >
> > The thing that was confusing was whether it was appropriate to use a
> > seqcount in a case where write side preemption was not disabled - which
> > is safe only if the read side doesn't spin.
> >
>
> No, that's not correct.

Well, that is where I started from... seqcount in the normal
preemption-disabled case was well understood; I needed a case that does
not disable preemption.

> For developers who're advanced enough to know the difference, they don't
> need the kernel-doc anyway. And that's why I've kindly asked to add the
> following to your mm/ patch (which you did, thanks):

That is probably overstating things quite a lot. If there are valid
locking patterns then I think we should document them; otherwise
people simply do something crazy and get it wrong.

It was not entirely easy to figure out why disabling preemption is
necessary here, though in hindsight it is obvious:
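
  /*
   * A rough sketch of the failure mode, with the reader pinned to the
   * writer's CPU:
   *
   * Writer on CPU0:                    Reader later on CPU0:
   *   write_seqcount_begin();            read_seqcount_begin();
   *   <preempted; sequence now odd>        -> spins until the count turns
   *                                           even again, which the
   *                                           preempted writer can never do
   */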

Jason