2020-10-13 05:29:31

by Petr Mladek

[permalink] [raw]
Subject: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

Linus,

please pull the latest printk changes from

git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux tags/printk-for-5.10

============================

- Fully lockless ringbuffer implementation, including the support for
continuous lines. It will allow to store and read messages in any
situation wihtout the risk of deadlocks and without the need
of temporary per-CPU buffers.

The access is still serialized by logbuf_lock. It synchronizes few more
operations, for example, temporary buffer for formatting the message,
syslog and kmsg_dump operations. The lock removal is being discussed
and should be ready for the next release.

The continuous lines are handled exactly the same way as before
to avoid regressions in user space. It means that they are appended
to the last message when the caller is the same. Only the last
message can be extended.

The data ring includes plain text of the messages. Except for
an integer at the beginning of each message that points back
to the descriptor ring with other metadata.

The dictionary has to stay. journalctl uses it to filter the log.
It allows to show messages related to a given device. The dictionary
values are stored in the descriptor ring with the other metadata.

This is the first part of the printk rework as discussed at Plumbers 2019,
see https://lore.kernel.org/r/[email protected]. The next
big step will be handling consoles by kthreads during the normal
system operation. It will require special handling of situations
when the kthreads could not get scheduled, for example, early boot,
suspend, panic.


Other changes:

+ Add John Ogness as a reviewer for printk subsystem. He is author of
the rework and is familiar with the code and history.

+ Fix locking in serial8250_do_startup() to prevent lockdep report.

+ Few code cleanups.


=========================

The fully lockless ringbuffer code has been tested in linux-next
since Jul 10. The pr_cont() handling has been pushed in Sep 15.
I am not aware of any regression at the moment. But of course,
there will be some. And we are ready to work on the fixes.

----------------------------------------------------------------
Andy Shevchenko (1):
kernel.h: Move oops_in_progress to printk.h

Gustavo A. R. Silva (1):
printk: Use fallthrough pseudo-keyword

John Ogness (22):
crash: add VMCOREINFO macro to define offset in a struct declared by typedef
printk: add lockless ringbuffer
Revert "printk: lock/unlock console only for new logbuf entries"
printk: use the lockless ringbuffer
printk: ringbuffer: support dataless records
printk: reduce LOG_BUF_SHIFT range for H8300
docs: vmcoreinfo: add lockless printk ringbuffer vmcoreinfo
scripts/gdb: add utils.read_ulong()
scripts/gdb: update for lockless printk ringbuffer
printk: ringbuffer: fix setting state in desc_read()
printk: ringbuffer: avoid memcpy() on state_var
printk: ringbuffer: relocate get_data()
printk: ringbuffer: add BLK_DATALESS() macro
printk: ringbuffer: clear initial reserved fields
printk: ringbuffer: change representation of states
printk: ringbuffer: add finalization/extension support
printk: reimplement log_cont using record extension
printk: move printk_info into separate array
printk: move dictionary keys to dev_printk_info
printk: remove dict ring
printk: avoid and/or handle record truncation
printk: reduce setup_text_buf size to LOG_LINE_MAX

Petr Mladek (2):
MAINTAIERS: Add John Ogness as printk reviewer
Merge branch 'printk-rework' into for-linus

Randy Dunlap (1):
kernel: printk: delete repeated words in comments

Sergey Senozhatsky (1):
serial: 8250: change lock order in serial8250_do_startup()

Documentation/admin-guide/kdump/gdbmacros.txt | 159 +-
Documentation/admin-guide/kdump/vmcoreinfo.rst | 131 +-
MAINTAINERS | 1 +
drivers/base/core.c | 46 +-
drivers/tty/serial/8250/8250_port.c | 9 +-
include/linux/crash_core.h | 3 +
include/linux/debug_locks.h | 2 +-
include/linux/dev_printk.h | 8 +
include/linux/kernel.h | 1 -
include/linux/printk.h | 8 +-
init/Kconfig | 3 +-
kernel/printk/Makefile | 1 +
kernel/printk/internal.h | 4 +-
kernel/printk/printk.c | 1153 +++++++------
kernel/printk/printk_ringbuffer.c | 2083 ++++++++++++++++++++++++
kernel/printk/printk_ringbuffer.h | 382 +++++
kernel/printk/printk_safe.c | 2 +-
scripts/gdb/linux/dmesg.py | 147 +-
scripts/gdb/linux/utils.py | 7 +
19 files changed, 3416 insertions(+), 734 deletions(-)
create mode 100644 kernel/printk/printk_ringbuffer.c
create mode 100644 kernel/printk/printk_ringbuffer.h


2020-10-14 10:16:30

by pr-tracker-bot

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

The pull request you sent on Mon, 12 Oct 2020 16:49:16 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux tags/printk-for-5.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/d594d8f411d47bf7b583ec3474b11fec348c88bb

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2020-10-14 17:08:12

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

Hi Petr,

On Mon, Oct 12, 2020 at 4:50 PM Petr Mladek <[email protected]> wrote:
> - Fully lockless ringbuffer implementation, including the support for
> continuous lines. It will allow to store and read messages in any
> situation wihtout the risk of deadlocks and without the need
> of temporary per-CPU buffers.

linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old
vmlinux.lockless_ringbuffer
add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
Function old new delta
_printk_rb_static_infos - 180224 +180224
_printk_rb_static_descs - 24576 +24576
[...]

Seriously?!? Or am I being misled by the tools?

linux-m68k-atari_defconfig$ size vmlinux.old vmlinux.lockless_ringbuffer
text data bss dec hex filename
3559108 941716 177772 4678596 4763c4 vmlinux.old
3563922 1152496 175276 4891694 4aa42e vmlinux.lockless_ringbuffer

Apparently not...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2020-10-14 17:38:47

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

Hi Rasmus,

On Wed, Oct 14, 2020 at 4:58 PM Rasmus Villemoes
<[email protected]> wrote:
> On 14/10/2020 16.16, Geert Uytterhoeven wrote:
> > On Mon, Oct 12, 2020 at 4:50 PM Petr Mladek <[email protected]> wrote:
> >> - Fully lockless ringbuffer implementation, including the support for
> >> continuous lines. It will allow to store and read messages in any
> >> situation wihtout the risk of deadlocks and without the need
> >> of temporary per-CPU buffers.
> >
> > linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old
> > vmlinux.lockless_ringbuffer
> > add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
> > Function old new delta
> > _printk_rb_static_infos - 180224 +180224
> > _printk_rb_static_descs - 24576 +24576
> > [...]
> >
> > Seriously?!? Or am I being misled by the tools?
> >
> > linux-m68k-atari_defconfig$ size vmlinux.old vmlinux.lockless_ringbuffer
> > text data bss dec hex filename
> > 3559108 941716 177772 4678596 4763c4 vmlinux.old
> > 3563922 1152496 175276 4891694 4aa42e vmlinux.lockless_ringbuffer
> >
> > Apparently not...
>
> Hm, that's quite a lot. And the only reason the buffers don't live
> entirely in .bss is because a few of their entries have non-zero
> initializers.

Even if this would live in BSS, it would still consume 200 KiB of RAM.
Or am I missing something?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2020-10-14 23:05:08

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

On 14/10/2020 16.16, Geert Uytterhoeven wrote:
> Hi Petr,
>
> On Mon, Oct 12, 2020 at 4:50 PM Petr Mladek <[email protected]> wrote:
>> - Fully lockless ringbuffer implementation, including the support for
>> continuous lines. It will allow to store and read messages in any
>> situation wihtout the risk of deadlocks and without the need
>> of temporary per-CPU buffers.
>
> linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old
> vmlinux.lockless_ringbuffer
> add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
> Function old new delta
> _printk_rb_static_infos - 180224 +180224
> _printk_rb_static_descs - 24576 +24576
> [...]
>
> Seriously?!? Or am I being misled by the tools?
>
> linux-m68k-atari_defconfig$ size vmlinux.old vmlinux.lockless_ringbuffer
> text data bss dec hex filename
> 3559108 941716 177772 4678596 4763c4 vmlinux.old
> 3563922 1152496 175276 4891694 4aa42e vmlinux.lockless_ringbuffer
>
> Apparently not...

Hm, that's quite a lot. And the only reason the buffers don't live
entirely in .bss is because a few of their entries have non-zero
initializers.

Perhaps one could add a .init.text.initialize_static_data section of
function pointers, with the _DEFINE_PRINTKRB macro growing something like

static void __init __initialize_printkrb_##name(void) { \
_##name##_descs[_DESCS_COUNT(descbits) - 1] = ...; \
_##name##_infos[0] = ...; \
_##name##_infos[_DESCS_COUNT(descbits) - 1] = ...; \
} \
static_data_initializer(__initialize_printkrb_##name);

with static_data_initalizer being the obvious yoga for putting a
function pointer in the .init.text.initialize_static_data section. Then
very early in start_kernel(), probably first thing, iterate that section
and call all the functions. But maybe that's not even early enough?

Rasmus

2020-10-15 04:01:38

by John Ogness

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

On 2020-10-14, Geert Uytterhoeven <[email protected]> wrote:
>> - Fully lockless ringbuffer implementation, including the support for
>> continuous lines. It will allow to store and read messages in any
>> situation wihtout the risk of deadlocks and without the need
>> of temporary per-CPU buffers.
>
> linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old vmlinux.lockless_ringbuffer
> add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
> Function old new delta
> _printk_rb_static_infos - 180224 +180224
> _printk_rb_static_descs - 24576 +24576

131072 of _printk_rb_static_infos is reserved for dictionary usage. The
rest (49152) is for the record meta data. Previously any dictionary
usage and record meta data was embedded in the message buffer (log_buf,
65536).

Since the meta data is now separate, setting CONFIG_LOG_BUF_SHIFT=15
would provide roughly the same amount of record storage as
CONFIG_LOG_BUF_SHIFT=16 did with vmlinux.old. Then there would be:

32768: message buffer
24576: meta data
65536: dictionary data
12288: descriptor data

Excluding the dictionary data, the total is 65536.

(With vmlinux.old there is 65536 total with CONFIG_LOG_BUF_SHIFT=16.)

It is the reserved dictionary data that is hurting us here.

Should we provide a config option to kill the dictionary data?

Should CONFIG_LOG_BUF_SHIFT do an implicit -1 so that the sizes (without
dictionary data) are about the same as before?

Maybe dictionaries should only exist in the dynamically allocated
ringbuffer?

John Ogness

2020-10-15 11:15:23

by Petr Mladek

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

On Wed 2020-10-14 23:01:30, John Ogness wrote:
> On 2020-10-14, Geert Uytterhoeven <[email protected]> wrote:
> >> - Fully lockless ringbuffer implementation, including the support for
> >> continuous lines. It will allow to store and read messages in any
> >> situation wihtout the risk of deadlocks and without the need
> >> of temporary per-CPU buffers.
> >
> > linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old vmlinux.lockless_ringbuffer
> > add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
> > Function old new delta
> > _printk_rb_static_infos - 180224 +180224
> > _printk_rb_static_descs - 24576 +24576
>
> 131072 of _printk_rb_static_infos is reserved for dictionary usage. The
> rest (49152) is for the record meta data. Previously any dictionary
> usage and record meta data was embedded in the message buffer (log_buf,
> 65536).
>
> Since the meta data is now separate, setting CONFIG_LOG_BUF_SHIFT=15
> would provide roughly the same amount of record storage as
> CONFIG_LOG_BUF_SHIFT=16 did with vmlinux.old. Then there would be:
>
> 32768: message buffer
> 24576: meta data
> 65536: dictionary data
> 12288: descriptor data
>
> Excluding the dictionary data, the total is 65536.
>
> (With vmlinux.old there is 65536 total with CONFIG_LOG_BUF_SHIFT=16.)
>
> It is the reserved dictionary data that is hurting us here.
>
> Should we provide a config option to kill the dictionary data?

This might be the best solution. AFAIK, they are currently
used only by journalctl. IMHO, most people are not aware of it
and it is not a widely used feature.


> Should CONFIG_LOG_BUF_SHIFT do an implicit -1 so that the sizes (without
> dictionary data) are about the same as before?

Beware that this might result in slightly reduced amount of stored
messages. The storage of the dictionary meta data is less effective
because they are not longer stored together with the messages and
we need to guarantee a space for them.

I would prefer to keep the current global default. In average,
the size of the systems is growing. They produce more messages
and have more memory available.

Yeah, the space hurts on small systems. We might modify the default
for them. Or we could allow to disable the dictionary.

Note that this is not the full picture of the printk memory usage.
I believe that we will be able to get rid of printk_safe and printk_nmi.
It would safe 16kB per-CPU at runtime.

If we make the dictionary optional and remove the per-CPU buffers
then we might end-up with less memory needs than before.

Best Regards,
Petr

2020-10-15 11:16:13

by Petr Mladek

[permalink] [raw]
Subject: Re: [GIT PULL] printk for 5.10 (includes lockless ringbuffer)

On Wed 2020-10-14 16:58:27, Rasmus Villemoes wrote:
> On 14/10/2020 16.16, Geert Uytterhoeven wrote:
> > Hi Petr,
> >
> > On Mon, Oct 12, 2020 at 4:50 PM Petr Mladek <[email protected]> wrote:
> >> - Fully lockless ringbuffer implementation, including the support for
> >> continuous lines. It will allow to store and read messages in any
> >> situation wihtout the risk of deadlocks and without the need
> >> of temporary per-CPU buffers.
> >
> > linux-m68k-atari_defconfig$ bloat-o-meter vmlinux.old
> > vmlinux.lockless_ringbuffer
> > add/remove: 39/16 grow/shrink: 9/15 up/down: 214075/-4362 (209713)
> > Function old new delta
> > _printk_rb_static_infos - 180224 +180224
> > _printk_rb_static_descs - 24576 +24576
> > [...]
> >
> > Seriously?!? Or am I being misled by the tools?
> >
> > linux-m68k-atari_defconfig$ size vmlinux.old vmlinux.lockless_ringbuffer
> > text data bss dec hex filename
> > 3559108 941716 177772 4678596 4763c4 vmlinux.old
> > 3563922 1152496 175276 4891694 4aa42e vmlinux.lockless_ringbuffer
> >
> > Apparently not...
>
> Hm, that's quite a lot. And the only reason the buffers don't live
> entirely in .bss is because a few of their entries have non-zero
> initializers.
>
> Perhaps one could add a .init.text.initialize_static_data section of
> function pointers, with the _DEFINE_PRINTKRB macro growing something like
>
> static void __init __initialize_printkrb_##name(void) { \
> _##name##_descs[_DESCS_COUNT(descbits) - 1] = ...; \
> _##name##_infos[0] = ...; \
> _##name##_infos[_DESCS_COUNT(descbits) - 1] = ...; \
> } \
> static_data_initializer(__initialize_printkrb_##name);
>
> with static_data_initalizer being the obvious yoga for putting a
> function pointer in the .init.text.initialize_static_data section. Then
> very early in start_kernel(), probably first thing, iterate that section
> and call all the functions. But maybe that's not even early enough?

A solution might be to initialize the buffer during the first
printk() call. We could make sure that it is done in
setup_log_buf() at latest. It is called when only one CPU is
running so it should be safe. The only problem might be NMI.

Best Regards,
Petr