Subject: [PATCH v2 0/2] seqlock,mm: lockdep annotation + write_seqlock_irqsave()

Hi,

this has been a single patch (2/2) but then it was pointed out that the
lockdep annotation in seqlock needs to be adjusted to fully close the
printk window so that there is no printing after the seq-lock has been
acquired and before printk_deferred_enter() takes effect.

I'm sending both patches in this series so both sides (locking and mm)
are aware of the situation.
I hope that both patches can be applied independently via their subsystem
tree (the lockdep splat + deadlock is low risk).

The original thread starts at
https://lore.kernel.org/[email protected]

Sebastian




2023-06-25 03:18:50

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] seqlock,mm: lockdep annotation + write_seqlock_irqsave()

On 2023/06/24 2:12, Sebastian Andrzej Siewior wrote:
> Hi,
>
> this has been a single patch (2/2) but then it was pointed out that the
> lockdep annotation in seqlock needs to be adjusted to fully close the
> printk window so that there is no printing after the seq-lock has been
> acquired and before printk_deferred_enter() takes effect.
>
> I'm sending both patches in this series so both sides (locking and mm)
> are aware of the situation.
> I hope that both patches can be applied independently via their subsystem
> tree (the lockdep splat + deadlock is low risk).
>
> The original thread starts at
> https://lore.kernel.org/[email protected]
>
> Sebastian

The original thread is too long to read. Below is a full summary for locking
maintainers to accept [PATCH 1/2].



When commit 1ca7d67cf5d5 ("seqcount: Add lockdep functionality to
seqcount/seqlock structures") added seqcount_acquire(&s->dep_map) check to
write_seqcount_begin_nested() and seqcount_release(&s->dep_map) check to
write_seqcount_end(), the ordering of updating s->sequence and doing lockdep
annotation was not important.

But since commit 3d36424b3b58 ("mm/page_alloc: fix race condition between
build_all_zonelists and page allocation") started calling
read_seqbegin(&zonelist_update_seq)/read_seqretry(&zonelist_update_seq) from
kmalloc(GFP_ATOMIC) path, commit 1007843a9190 ("mm/page_alloc: fix potential
deadlock on zonelist_update_seq seqlock") tried to close the race window using

__build_all_zonelists() {
+ local_irq_save(flags);
+ printk_deferred_enter();
write_seqlock(&zonelist_update_seq);
(...snipped...)
write_sequnlock(&zonelist_update_seq);
+ printk_deferred_exit();
+ local_irq_restore(flags);
}

pattern. The reason behind this ordering was to

satisfy "printk_deferred_enter() depends on local IRQs being disabled"

and

make sure that "no synchronous printk() (for whatever reasons, not only
printk() from build_zonelists() from __build_all_zonelists(), but also
including printk() from lockdep, soft-lockup, KCSAN etc.) happens between
write_seqlock() and write_sequnlock()

. However, Sebastian Andrzej Siewior mentioned that this ordering is
problematic if CONFIG_PREEMPT_RT=y, for disabling local IRQs conflicts with
"spin_lock(&zonelist_update_seq.lock) from write_seqlock(&zonelist_update_seq)
needs to be able to sleep", and Sebastian is proposing

__build_all_zonelists() {
- local_irq_save(flags);
- printk_deferred_enter();
- write_seqlock(&zonelist_update_seq);
+ write_seqlock_irqsave(&zonelist_update_seq, flags);
+ printk_deferred_enter();
(...snipped...)
+ printk_deferred_exit();
+ write_sequnlock_irqrestore(&zonelist_update_seq, flags);
- write_sequnlock(&zonelist_update_seq);
- printk_deferred_exit();
- local_irq_restore(flags);
}

change as [PATCH 2/2]. Since write_seqlock_irqsave() becomes write_seqlock()
if CONFIG_PREEMPT_RT=y, this change can solve the conflict.

In order to accept this proposal, we need to make sure that

no synchronous printk() happens between

write_seqlock_irqsave(&zonelist_update_seq, flags) made
zonelist_update_seq.seqcount odd

and

printk_deferred_enter() takes effect

and

no synchronous printk() happens between

printk_deferred_exit() took effect

and

write_sequnlock_irqrestore(&zonelist_update_seq, flags) makes
zonelist_update_seq.seqcount even

, and Sebastian is proposing

static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
{
- do_raw_write_seqcount_begin(s);
seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
+ do_raw_write_seqcount_begin(s);
}

static inline void do_write_seqcount_end(seqcount_t *s)
{
- seqcount_release(&s->dep_map, _RET_IP_);
do_raw_write_seqcount_end(s);
+ seqcount_release(&s->dep_map, _RET_IP_);
}

as [PATCH 1/2].

With [PATCH 1/2] and [PATCH 2/2], possibility of synchronous printk() changes like below.

__build_all_zonelists() {
write_seqlock_irqsave(&zonelist_update_seq, flags) {
__write_seqlock_irqsave(&zonelist_update_seq) {
spin_lock_irqsave(&zonelist_update_seq.lock, flags); // local IRQs disabled = synchronous printk() from IRQs is disabled here
do_write_seqcount_begin(&zonelist_update_seq.seqcount.seqcount) {
do_write_seqcount_begin_nested(&zonelist_update_seq.seqcount.seqcount, 0) {
seqcount_acquire(&zonelist_update_seq.seqcount.seqcount.dep_map, 0, 0, _RET_IP_); // synchronous printk() from lockdep might happen here
do_raw_write_seqcount_begin(&zonelist_update_seq.seqcount.seqcount) {
seqcount_acquire(&zonelist_update_seq.seqcount.seqcount.dep_map, 0, 0, _RET_IP_); // zonelist_update_seq.seqcount.seqcount.sequence is guaranteed to be even = kmalloc(GFP_ATOMIC) with port lock held is safe = synchronous printk() is safe
kcsan_nestable_atomic_begin(); // KCSAN is diabled = synchronous printk() from KCSAN is disabled here
zonelist_update_seq.seqcount.seqcount.sequence++; // zonelist_update_seq.seqcount.seqcount.sequence is now odd = kmalloc(GFP_ATOMIC) with port lock held is not safe = synchronous printk() is not safe
}
}
}
}
}
printk_deferred_enter(); // synchronous printk() from whatever reason is disabled here
(...snipped...)
printk_deferred_exit(); // synchronous printk() from whatever reason is enabled here
write_sequnlock_irqrestore(&zonelist_update_seq, flags) {
do_write_seqcount_end(&zonelist_update_seq.seqcount.seqcount) {
do_raw_write_seqcount_end(&zonelist_update_seq.seqcount.seqcount) {
zonelist_update_seq.seqcount.seqcount.sequence++; // zonelist_update_seq.seqcount.seqcount.sequence is now even = kmalloc(GFP_ATOMIC) with port lock held is safe = synchronous printk() is safe
kcsan_nestable_atomic_end(); // KCSAN is enabled = synchronous printk() from KCSAN is enabled here
}
seqcount_release(&zonelist_update_seq.seqcount.seqcount.dep_map, _RET_IP_); // synchronous printk() from lockdep might happen here
}
spin_unlock_irqrestore(&zonelist_update_seq.lock, flags); // local IRQs enabled = synchronous printk() from IRQs is enabled here
}
}