LinuxLists.cc - [PATCH next v2 2/3] printk: change @clear_seq to atomic64

2020-12-01 20:56:50

Subject: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

Currently @clear_seq access is protected by @logbuf_lock. Once
@logbuf_lock is removed some other form of synchronization will be
required. Change the type of @clear_seq to atomic64_t to provide the
synchronization.

Signed-off-by: John Ogness <[email protected]>
---
kernel/printk/printk.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fc5e3a7d6d89..e9018c4e1b66 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -403,7 +403,7 @@ static u64 exclusive_console_stop_seq;
static unsigned long console_dropped;

/* the next printk record to read after the last 'clear' command */
-static u64 clear_seq;
+static atomic64_t clear_seq = ATOMIC64_INIT(0);

#ifdef CONFIG_PRINTK_CALLER
#define PREFIX_MAX 48
@@ -797,7 +797,7 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
* like issued by 'dmesg -c'. Reading /dev/kmsg itself
* changes no global state, and does not clear anything.
*/
- user->seq = clear_seq;
+ user->seq = atomic64_read(&clear_seq);
break;
case SEEK_END:
/* after the last record */
@@ -914,6 +914,9 @@ void log_buf_vmcoreinfo_setup(void)
* parse it and detect any changes to structure down the line.
*/

+ VMCOREINFO_SIZE(atomic64_t);
+ VMCOREINFO_TYPE_OFFSET(atomic64_t, counter);
+
VMCOREINFO_STRUCT_SIZE(printk_ringbuffer);
VMCOREINFO_OFFSET(printk_ringbuffer, desc_ring);
VMCOREINFO_OFFSET(printk_ringbuffer, text_data_ring);
@@ -1476,6 +1479,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
struct printk_info info;
unsigned int line_count;
struct printk_record r;
+ u64 clr_seq;
char *text;
int len = 0;
u64 seq;
@@ -1487,15 +1491,17 @@ static int syslog_print_all(char __user *buf, int size, bool clear)

time = printk_time;
logbuf_lock_irq();
+ clr_seq = atomic64_read(&clear_seq);
+
/*
* Find first record that fits, including all following records,
* into the user-provided buffer for this dump.
*/
- prb_for_each_info(clear_seq, prb, seq, &info, &line_count)
+ prb_for_each_info(clr_seq, prb, seq, &info, &line_count)
len += get_record_print_text_size(&info, line_count, true, time);

/* move first record forward until length fits into the buffer */
- prb_for_each_info(clear_seq, prb, seq, &info, &line_count) {
+ prb_for_each_info(clr_seq, prb, seq, &info, &line_count) {
if (len <= size)
break;
len -= get_record_print_text_size(&info, line_count, true, time);
@@ -1526,7 +1532,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
}

if (clear)
- clear_seq = seq;
+ atomic64_set(&clear_seq, seq);
logbuf_unlock_irq();

kfree(text);
@@ -1536,7 +1542,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
static void syslog_clear(void)
{
logbuf_lock_irq();
- clear_seq = prb_next_seq(prb);
+ atomic64_set(&clear_seq, prb_next_seq(prb));
logbuf_unlock_irq();
}

@@ -3205,7 +3211,7 @@ void kmsg_dump(enum kmsg_dump_reason reason)
dumper->active = true;

logbuf_lock_irqsave(flags);
- dumper->cur_seq = clear_seq;
+ dumper->cur_seq = atomic64_read(&clear_seq);
dumper->next_seq = prb_next_seq(prb);
logbuf_unlock_irqrestore(flags);

@@ -3412,7 +3418,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
*/
void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
{
- dumper->cur_seq = clear_seq;
+ dumper->cur_seq = atomic64_read(&clear_seq);
dumper->next_seq = prb_next_seq(prb);
}

--
2.20.1

2020-12-04 09:15:24

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Tue 2020-12-01 21:59:40, John Ogness wrote:
> Currently @clear_seq access is protected by @logbuf_lock. Once
> @logbuf_lock is removed some other form of synchronization will be
> required. Change the type of @clear_seq to atomic64_t to provide the
> synchronization.
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fc5e3a7d6d89..e9018c4e1b66 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3412,7 +3418,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
> */
> void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
> {
> - dumper->cur_seq = clear_seq;
> + dumper->cur_seq = atomic64_read(&clear_seq);

Sigh, atomic64_read() uses a spin lock in the generic implementation
that is used on some architectures.

Hmm, this seems to be the only location where the lock must not be
used. At the same time, a best effort might be acceptable here.

I am not super-happy with the following hack but it might work:

/*
* Use the best effort to avoid locks. In the worst case,
* the bottom and upper halves will be inconsistent. Then
* the value will be far too big or far too low. Fallback
* to the first available sequence number when it is
* too big.
*/
if (IS_ENABLED(CONFIG_GENERIC_ATOMIC64)) {
u64 first_seq = prb_first_seq(prb);

dumper->cur_seq = READ_ONCE(&clear_seq->counter);
if (dumper->cur_seq > first_seq)
dumper->cur_seq = first_seq;
} else {
dumper->cur_seq = atomic64_read(&clear_seq);
}

Alternative solution would to always fallback to the first_seq
on these architectures. Few people would complain when they see
more messages. We could always improve it when it causes problems.

Adding Peter Zijstra for his opinion [*].

> dumper->next_seq = prb_next_seq(prb);

[*] I am going to hide under a stone for the above hack.

Best Regards,
Petr

2020-12-06 20:29:01

by John Ogness

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On 2020-12-04, Petr Mladek <[email protected]> wrote:
> On Tue 2020-12-01 21:59:40, John Ogness wrote:
>> Currently @clear_seq access is protected by @logbuf_lock. Once
>> @logbuf_lock is removed some other form of synchronization will be
>> required. Change the type of @clear_seq to atomic64_t to provide the
>> synchronization.
>>
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index fc5e3a7d6d89..e9018c4e1b66 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3412,7 +3418,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
>> */
>> void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
>> {
>> - dumper->cur_seq = clear_seq;
>> + dumper->cur_seq = atomic64_read(&clear_seq);
>
> Sigh, atomic64_read() uses a spin lock in the generic implementation
> that is used on some architectures.
>
> Hmm, this seems to be the only location where the lock must not be
> used.

Yes, and it is read-only access. Perhaps atomic64_t is the wrong thing
to use here. We could use a seqcount_latch and a shadow variable so that
if a writer has been preempted, we can use the previous value. (Only
kmsg_dump would need to use the lockless variant to read the value.)

void clear_seq_set(u64 val)
{
spin_lock_irq(&clear_lock);
raw_write_seqcount_latch(&clear_latch);
clear_seq[0] = val;
raw_write_seqcount_latch(&clear_latch);
clear_seq[1] = val;
spin_unlock_irq(&clear_lock);
}

u64 clear_seq_get_nolock(void)
{
unsigned int seq, idx;
u64 val;

do {
seq = raw_read_seqcount_latch(&clear_latch);
idx = seq & 0x1;
val = clear_seq[idx];
} while (read_seqcount_latch_retry(&clear_latch, seq));

return val;
}

u64 clear_seq_get(void)
{
u64 val;

spin_lock_irq(&clear_lock);
val = clear_seq[0];
spin_unlock_irq(&clear_lock);
return val;
}

> Alternative solution would to always fallback to the first_seq on
> these architectures. Few people would complain when they see more
> messages. We could always improve it when it causes problems.

I am also OK with this solution.

John Ogness

2020-12-07 09:38:49

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Sun, Dec 06, 2020 at 09:29:59PM +0106, John Ogness wrote:
> Yes, and it is read-only access. Perhaps atomic64_t is the wrong thing
> to use here. We could use a seqcount_latch and a shadow variable so that
> if a writer has been preempted, we can use the previous value. (Only
> kmsg_dump would need to use the lockless variant to read the value.)
>
> void clear_seq_set(u64 val)
> {
> spin_lock_irq(&clear_lock);
> raw_write_seqcount_latch(&clear_latch);
> clear_seq[0] = val;
> raw_write_seqcount_latch(&clear_latch);
> clear_seq[1] = val;
> spin_unlock_irq(&clear_lock);
> }
>
> u64 clear_seq_get_nolock(void)
> {
> unsigned int seq, idx;
> u64 val;
>
> do {
> seq = raw_read_seqcount_latch(&clear_latch);
> idx = seq & 0x1;
> val = clear_seq[idx];
> } while (read_seqcount_latch_retry(&clear_latch, seq));
>
> return val;
> }

That's overly complicated.

If you're going to double the storage you can simply do:

seq = val
smp_wmb();
seq_copy = val;

vs

do {
tmp = seq_copy;
smp_rmb();
val = seq;
} while (val != tmp);

Also look for CONFIG_64_BIT in kernel/sched/fair.c.

2020-12-07 10:08:08

by John Ogness

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On 2020-12-07, Peter Zijlstra <[email protected]> wrote:
>> Yes, and it is read-only access. Perhaps atomic64_t is the wrong thing
>> to use here. We could use a seqcount_latch and a shadow variable so that
>> if a writer has been preempted, we can use the previous value. (Only
>> kmsg_dump would need to use the lockless variant to read the value.)
>>
>> void clear_seq_set(u64 val)
>> {
>> spin_lock_irq(&clear_lock);
>> raw_write_seqcount_latch(&clear_latch);
>> clear_seq[0] = val;
>> raw_write_seqcount_latch(&clear_latch);
>> clear_seq[1] = val;
>> spin_unlock_irq(&clear_lock);
>> }
>>
>> u64 clear_seq_get_nolock(void)
>> {
>> unsigned int seq, idx;
>> u64 val;
>>
>> do {
>> seq = raw_read_seqcount_latch(&clear_latch);
>> idx = seq & 0x1;
>> val = clear_seq[idx];
>> } while (read_seqcount_latch_retry(&clear_latch, seq));
>>
>> return val;
>> }
>
> That's overly complicated.
>
> If you're going to double the storage you can simply do:
>
>
> seq = val
> smp_wmb();
> seq_copy = val;
>
> vs
>
> do {
> tmp = seq_copy;
> smp_rmb();
> val = seq;
> } while (val != tmp);

That will not work. We are talking about a situation where the writer is
preempted. So seq will never equal seq_copy in that situation. I expect
that the seqcount_latch is necessary.

John Ogness

2020-12-07 12:58:51

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Mon, Dec 07, 2020 at 11:09:39AM +0106, John Ogness wrote:
> That will not work. We are talking about a situation where the writer is
> preempted. So seq will never equal seq_copy in that situation. I expect
> that the seqcount_latch is necessary.

Interrupted rather, I would think, specifically NMIs? Then yes, latch
should work. Gets you either the old or new, but never something in
between.

2020-12-07 13:01:41

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Mon 2020-12-07 11:09:39, John Ogness wrote:
> On 2020-12-07, Peter Zijlstra <[email protected]> wrote:
> >> Yes, and it is read-only access. Perhaps atomic64_t is the wrong thing
> >> to use here. We could use a seqcount_latch and a shadow variable so that
> >> if a writer has been preempted, we can use the previous value. (Only
> >> kmsg_dump would need to use the lockless variant to read the value.)
> >>
> >> void clear_seq_set(u64 val)
> >> {
> >> spin_lock_irq(&clear_lock);
> >> raw_write_seqcount_latch(&clear_latch);
> >> clear_seq[0] = val;
> >> raw_write_seqcount_latch(&clear_latch);
> >> clear_seq[1] = val;
> >> spin_unlock_irq(&clear_lock);
> >> }
> >>
> >> u64 clear_seq_get_nolock(void)
> >> {
> >> unsigned int seq, idx;
> >> u64 val;
> >>
> >> do {
> >> seq = raw_read_seqcount_latch(&clear_latch);
> >> idx = seq & 0x1;
> >> val = clear_seq[idx];
> >> } while (read_seqcount_latch_retry(&clear_latch, seq));
> >>
> >> return val;
> >> }
> >
> > That's overly complicated.
> >
> > If you're going to double the storage you can simply do:
> >
> >
> > seq = val
> > smp_wmb();
> > seq_copy = val;
> >
> > vs
> >
> > do {
> > tmp = seq_copy;
> > smp_rmb();
> > val = seq;
> > } while (val != tmp);
>
> That will not work. We are talking about a situation where the writer is
> preempted. So seq will never equal seq_copy in that situation. I expect
> that the seqcount_latch is necessary.

Or we could disable interrupts around the writer.

But seqcount_latch will actually be need so that it works in panic().
The writer might be on a CPU that has been stopped using NMI. And this
code is used by dumpers() that are called during panic().

Sigh, I have to take a coffee and try to really understand the latch code ;-)

Best Regards,
Petr

2020-12-07 16:51:07

by David Laight

[permalink] [raw]

Subject: RE: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

From: John Ogness
> Sent: 07 December 2020 10:04
>
> On 2020-12-07, Peter Zijlstra <[email protected]> wrote:
> >> Yes, and it is read-only access. Perhaps atomic64_t is the wrong thing
> >> to use here. We could use a seqcount_latch and a shadow variable so that
> >> if a writer has been preempted, we can use the previous value. (Only
> >> kmsg_dump would need to use the lockless variant to read the value.)
> >>
> >> void clear_seq_set(u64 val)
> >> {
> >> spin_lock_irq(&clear_lock);
> >> raw_write_seqcount_latch(&clear_latch);
> >> clear_seq[0] = val;
> >> raw_write_seqcount_latch(&clear_latch);
> >> clear_seq[1] = val;
> >> spin_unlock_irq(&clear_lock);
> >> }
> >>
> >> u64 clear_seq_get_nolock(void)
> >> {
> >> unsigned int seq, idx;
> >> u64 val;
> >>
> >> do {
> >> seq = raw_read_seqcount_latch(&clear_latch);
> >> idx = seq & 0x1;
> >> val = clear_seq[idx];
> >> } while (read_seqcount_latch_retry(&clear_latch, seq));
> >>
> >> return val;
> >> }
> >
> > That's overly complicated.
> >
> > If you're going to double the storage you can simply do:
> >
> >
> > seq = val
> > smp_wmb();
> > seq_copy = val;
> >
> > vs
> >
> > do {
> > tmp = seq_copy;
> > smp_rmb();
> > val = seq;
> > } while (val != tmp);
>
> That will not work. We are talking about a situation where the writer is
> preempted. So seq will never equal seq_copy in that situation. I expect
> that the seqcount_latch is necessary.

Is the value just being incremented??
If so you can do:
seq_hi_0 = val >> 32;
smp_wmb();
seq_lo = val;
smp_wmb();
seq_hi_1 = val >> 32;

Then the reader can assume that seq_lo is zero if seq_h1_0 and
seq_hi_1 differ.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-12-08 21:03:22

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On (20/12/04 10:12), Petr Mladek wrote:
> On Tue 2020-12-01 21:59:40, John Ogness wrote:
> > Currently @clear_seq access is protected by @logbuf_lock. Once
> > @logbuf_lock is removed some other form of synchronization will be
> > required. Change the type of @clear_seq to atomic64_t to provide the
> > synchronization.
> >
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index fc5e3a7d6d89..e9018c4e1b66 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -3412,7 +3418,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
> > */
> > void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
> > {
> > - dumper->cur_seq = clear_seq;
> > + dumper->cur_seq = atomic64_read(&clear_seq);
>
> Sigh, atomic64_read() uses a spin lock in the generic implementation
> that is used on some architectures.

Oh... So on those archs prb is not lockless in fact, it actually
takes the spin_lock each time we read the descriptor state?

desc_read()
atomic_long_read(state_var)
atomic64_read()
raw_spin_lock_irqsave(lock, flags)
<< NMI panic >>

Am I missing something?

-ss

2020-12-08 22:34:51

by John Ogness

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On 2020-12-09, Sergey Senozhatsky <[email protected]> wrote:
>> Sigh, atomic64_read() uses a spin lock in the generic implementation
>> that is used on some architectures.
>
> Oh... So on those archs prb is not lockless in fact, it actually
> takes the spin_lock each time we read the descriptor state?
>
> desc_read()
> atomic_long_read(state_var)
> atomic64_read()
> raw_spin_lock_irqsave(lock, flags)
> << NMI panic >>
>
> Am I missing something?

For the state variable we chose atomic_long_t instead of atomic64_t for
this reason. atomic_long_t operations are available atomically on all
architectures. However, for clear_seq we need 64-bit (even on 32-bit
machines). The seqcount_latch is an excellent solution here since
clear_seq does not require lockless writers.

John Ogness

2020-12-09 01:18:24

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On (20/12/08 23:36), John Ogness wrote:
> On 2020-12-09, Sergey Senozhatsky <[email protected]> wrote:
> >> Sigh, atomic64_read() uses a spin lock in the generic implementation
> >> that is used on some architectures.
> >
> > Oh... So on those archs prb is not lockless in fact, it actually
> > takes the spin_lock each time we read the descriptor state?
> >
> > desc_read()
> > atomic_long_read(state_var)
> > atomic64_read()
> > raw_spin_lock_irqsave(lock, flags)
> > << NMI panic >>
> >
> > Am I missing something?
>
> For the state variable we chose atomic_long_t instead of atomic64_t for
> this reason. atomic_long_t operations are available atomically on all
> architectures.

Right. Looking more at Kconfigs, it seems that when atomic_long_t is
atomic64 (64BIT) then GENERIC_ATOMIC64 is not selected. Those archs
that select GENERIC_ATOMIC64 unconditionally all seem to be 32-bit.

Thanks.

-ss

2020-12-09 08:12:36

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Wed, Dec 09, 2020 at 05:34:19AM +0900, Sergey Senozhatsky wrote:
> On (20/12/04 10:12), Petr Mladek wrote:
> > On Tue 2020-12-01 21:59:40, John Ogness wrote:
> > > Currently @clear_seq access is protected by @logbuf_lock. Once
> > > @logbuf_lock is removed some other form of synchronization will be
> > > required. Change the type of @clear_seq to atomic64_t to provide the
> > > synchronization.
> > >
> > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > > index fc5e3a7d6d89..e9018c4e1b66 100644
> > > --- a/kernel/printk/printk.c
> > > +++ b/kernel/printk/printk.c
> > > @@ -3412,7 +3418,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
> > > */
> > > void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
> > > {
> > > - dumper->cur_seq = clear_seq;
> > > + dumper->cur_seq = atomic64_read(&clear_seq);
> >
> > Sigh, atomic64_read() uses a spin lock in the generic implementation
> > that is used on some architectures.
>
> Oh... So on those archs prb is not lockless in fact, it actually
> takes the spin_lock each time we read the descriptor state?

Yeah, many 32bit archs cannot natively do 64bit atomics and get to use
the horrible hashed spinlock crap.

But it gets even worse, we have a few architectures that cannot do
atomics _at_all_ and _always_ use the horrible hashed spinlock crap for
all atomics, even native word length ones.

I consider these architectures broken crap, and they work mostly by
accident than anything else, but we have them :/ The good new is that
they don't have NMIs either, so that helps.

2020-12-09 09:26:18

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On (20/12/09 09:16), Peter Zijlstra wrote:
> On Tue, Dec 08, 2020 at 11:36:44PM +0106, John Ogness wrote:
> > For the state variable we chose atomic_long_t instead of atomic64_t for
> > this reason. atomic_long_t operations are available atomically on all
> > architectures.
>
> Please put on your eye cancer gear and inspect the atomic implementation
> of PA-RISC, Sparc32, feh, I forgot who else.
>
> Those SMP capable architectures are gifted with just one XCHG like
> atomic instruction :/ Anyway, as said in the other email, they also
> don't have NMIs so it mostly works.

Hmm, wow. OK, I definitely want to look further.

When some CONFIG_DEBUG_FOO_BAR code wants to pr_err from prb->atomic_op
on those archs then we deadlock in printk once again?

-ss

2020-12-09 10:50:31

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On (20/12/09 18:22), Sergey Senozhatsky wrote:
> >
> > Please put on your eye cancer gear and inspect the atomic implementation
> > of PA-RISC, Sparc32, feh, I forgot who else.
> >
> > Those SMP capable architectures are gifted with just one XCHG like
> > atomic instruction :/ Anyway, as said in the other email, they also
> > don't have NMIs so it mostly works.

PeterZ, thanks for the pointers!

> Hmm, wow. OK, I definitely want to look further.
>
> When some CONFIG_DEBUG_FOO_BAR code wants to pr_err from prb->atomic_op
> on those archs then we deadlock in printk once again?

E.g. arch/sparc/lib/atomic32.c

spinlock_t __atomic_hash[ATOMIC_HASH_SIZE];
atomic_foo()
{
spin_lock_irqsave(ATOMIC_HASH(v), flags)
...
spin_unlock_irqrestore(ATOMIC_HASH(v), flags);
}

So another potential re-entry path is

atomic_foo()
spin_lock_irqsave(ATOMIC_HASH(v), flags)
printk()
prb()
atomic_foo()
spin_lock_irqsave(ATOMIC_HASH(v), flags)

which can deadlock, in theory, if both atomics HASH to the same
key (same spin_lock).

I wonder what else am I missing.

-ss

2020-12-09 11:05:02

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Wed, Dec 09, 2020 at 07:46:13PM +0900, Sergey Senozhatsky wrote:
> On (20/12/09 18:22), Sergey Senozhatsky wrote:
> > >
> > > Please put on your eye cancer gear and inspect the atomic implementation
> > > of PA-RISC, Sparc32, feh, I forgot who else.
> > >
> > > Those SMP capable architectures are gifted with just one XCHG like
> > > atomic instruction :/ Anyway, as said in the other email, they also
> > > don't have NMIs so it mostly works.
>
> PeterZ, thanks for the pointers!
>
>
> > Hmm, wow. OK, I definitely want to look further.
> >
> > When some CONFIG_DEBUG_FOO_BAR code wants to pr_err from prb->atomic_op
> > on those archs then we deadlock in printk once again?
>
> E.g. arch/sparc/lib/atomic32.c
>
> spinlock_t __atomic_hash[ATOMIC_HASH_SIZE];
> atomic_foo()
> {
> spin_lock_irqsave(ATOMIC_HASH(v), flags)
> ...
> spin_unlock_irqrestore(ATOMIC_HASH(v), flags);
> }
>
> So another potential re-entry path is
>
> atomic_foo()
> spin_lock_irqsave(ATOMIC_HASH(v), flags)
> printk()
> prb()
> atomic_foo()
> spin_lock_irqsave(ATOMIC_HASH(v), flags)
>
> which can deadlock, in theory, if both atomics HASH to the same
> key (same spin_lock).

Yep, but see the 'mostly' in the 'they mostly work'. Given the
limitiations of these architectures there's really only so much you can
do.

2020-12-09 11:34:30

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On (20/12/09 12:00), Peter Zijlstra wrote:
> > So another potential re-entry path is
> >
> > atomic_foo()
> > spin_lock_irqsave(ATOMIC_HASH(v), flags)
> > printk()
> > prb()
> > atomic_foo()
> > spin_lock_irqsave(ATOMIC_HASH(v), flags)
> >
> > which can deadlock, in theory, if both atomics HASH to the same
> > key (same spin_lock).
>
> Yep, but see the 'mostly' in the 'they mostly work'. Given the
> limitiations of these architectures there's really only so much you can
> do.

Right, agreed.

Nevertheless TIL that lockless printk buffer is not always lockless.
Perhaps, people that work with those archs need to also know this.
I haven't checked all the archs, but if, somehow, (IF) some of them
can panic the system with the atomic hash entries locked, then on
those archs new printk may not be able to flush-on-panic. Because
while printk iterates logbuf it may HASH to the atomic hash table
entry, that will never be unlocked. So there are some changes in
atomic/printk department on those archs.

-ss

2020-12-09 12:32:13

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Wed, Dec 09, 2020 at 08:28:52PM +0900, Sergey Senozhatsky wrote:
> On (20/12/09 12:00), Peter Zijlstra wrote:
> > > So another potential re-entry path is
> > >
> > > atomic_foo()
> > > spin_lock_irqsave(ATOMIC_HASH(v), flags)
> > > printk()
> > > prb()
> > > atomic_foo()
> > > spin_lock_irqsave(ATOMIC_HASH(v), flags)
> > >
> > > which can deadlock, in theory, if both atomics HASH to the same
> > > key (same spin_lock).
> >
> > Yep, but see the 'mostly' in the 'they mostly work'. Given the
> > limitiations of these architectures there's really only so much you can
> > do.
>
> Right, agreed.
>
> Nevertheless TIL that lockless printk buffer is not always lockless.
> Perhaps, people that work with those archs need to also know this.

Last time I broke them, they were aware they're 'special' and IIRC
they're mostly just limping along on prayers.

> I haven't checked all the archs, but if, somehow, (IF) some of them
> can panic the system with the atomic hash entries locked, then on
> those archs new printk may not be able to flush-on-panic. Because
> while printk iterates logbuf it may HASH to the atomic hash table
> entry, that will never be unlocked. So there are some changes in
> atomic/printk department on those archs.

Yeah, so I wouldn't put too much effort into thinking about it.
Hopefully we can eventually delete these architectures and really forget
they exist.

2020-12-09 14:02:11

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH next v2 2/3] printk: change @clear_seq to atomic64_t

On Tue, Dec 08, 2020 at 11:36:44PM +0106, John Ogness wrote:
> For the state variable we chose atomic_long_t instead of atomic64_t for
> this reason. atomic_long_t operations are available atomically on all
> architectures.

Please put on your eye cancer gear and inspect the atomic implementation
of PA-RISC, Sparc32, feh, I forgot who else.

Those SMP capable architectures are gifted with just one XCHG like
atomic instruction :/ Anyway, as said in the other email, they also
don't have NMIs so it mostly works.