2024-03-06 18:42:00

by Steven Rostedt

[permalink] [raw]
Subject: [for-linus][PATCH 0/3] tracing: Fixes for v6.8


Tracing fixes for v6.8-rc7:

- The size of a string written into trace_marker was determined by
the size of the sub-buffer in the ring buffer. That size is
dependent on the PAGE_SIZE of the architecture as it can be mapped
into user space. But on PowerPC, where PAGE_SIZE is 64K, that made
the limit of the string of writing into trace_marker 64K.

One of the selftests looks at the size of the ring buffer sub-buffers
and writes that plus more into the trace_marker. The write will take
what it can and report back what it consumed so that the user space
application (like echo) will write the rest of the string. The string
is stored in the ring buffer and can be read via the "trace" or
"trace_pipe" files.

The reading of the ring buffer uses vsnprintf(), which uses a precision
"%.*s" to make sure it only reads what is stored in the buffer, as
a bug could cause the string to be non terminated.

With the combination of the precision change and the PAGE_SIZE of 64K
allowing huge strings to be added into the ring buffer, plus the test
that would actually stress that limit, a bug was reported that
the precision used was too big for "%.*s" as the string was close to
64K in size and the max precision of vsnprintf is 32K.

Linus suggested not to have that precision as it could hide a bug
if the string was again stored without a nul byte.

Another issue that was brought up is that the trace_seq buffer is
also based on PAGE_SIZE even though it is not tied to the architecture
limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
simply just too big and wasting memory on systems with 64K page sizes.
It is now hardcoded to 8K which is what all other architectures with
4K PAGE_SIZE has.

Finally, the write to trace_marker is now limited to 4K as there is no
reason to write larger strings into trace_marker.

Steven Rostedt (Google) (3):
tracing: Remove precision vsnprintf() check from print event
tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE
tracing: Limit trace_marker writes to just 4K

----
include/linux/trace_seq.h | 8 +++++++-
kernel/trace/trace.c | 10 +++++-----
kernel/trace/trace_output.c | 6 ++----
3 files changed, 14 insertions(+), 10 deletions(-)


2024-03-10 18:16:11

by David Laight

[permalink] [raw]
Subject: RE: [for-linus][PATCH 0/3] tracing: Fixes for v6.8

..
> Another issue that was brought up is that the trace_seq buffer is
> also based on PAGE_SIZE even though it is not tied to the architecture
> limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
> simply just too big and wasting memory on systems with 64K page sizes.
> It is now hardcoded to 8K which is what all other architectures with
> 4K PAGE_SIZE has.

Does Linux use a 2k PAGE_SIZE on any architectures?
IIRC m68k hardware has a 2k page, but Linux might always pair them.
A 2k page might (or might not) cause grief.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2024-03-10 18:36:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: [for-linus][PATCH 0/3] tracing: Fixes for v6.8

On Sun, 10 Mar 2024 18:16:06 +0000
David Laight <[email protected]> wrote:

> ...
> > Another issue that was brought up is that the trace_seq buffer is
> > also based on PAGE_SIZE even though it is not tied to the architecture
> > limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
> > simply just too big and wasting memory on systems with 64K page sizes.
> > It is now hardcoded to 8K which is what all other architectures with
> > 4K PAGE_SIZE has.
>
> Does Linux use a 2k PAGE_SIZE on any architectures?
> IIRC m68k hardware has a 2k page, but Linux might always pair them.
> A 2k page might (or might not) cause grief.
>

The trace_seq is just a buffer to build up the event output string. The
ring buffer sub-buffer is set to page size. For trace_marker, it is
still limited to the size of the ring buffer sub-buffer. If the
sub-buffer is only 2K, the trace_marker write will be broken up by less
than 2K.

The problem that is being fixed here had nothing to do with the limited
size of the resources. The issue was actually the opposite. On PowerPC,
the PAGE_SIZE being 64K allowed the strings to be that big too. And
what broke was that it was passed to a vsprintf(s, "%.*s", len, str);
where the len was greater than 32K and that caused a warning as the
precision of "%.*s" has a max of signed short.

2K PAGE_SIZE will still just "work".

-- Steve

2024-03-10 19:12:25

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [for-linus][PATCH 0/3] tracing: Fixes for v6.8

Hi David,

On Sun, Mar 10, 2024 at 7:16 PM David Laight <[email protected]> wrote:
> > Another issue that was brought up is that the trace_seq buffer is
> > also based on PAGE_SIZE even though it is not tied to the architecture
> > limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
> > simply just too big and wasting memory on systems with 64K page sizes.
> > It is now hardcoded to 8K which is what all other architectures with
> > 4K PAGE_SIZE has.
>
> Does Linux use a 2k PAGE_SIZE on any architectures?
> IIRC m68k hardware has a 2k page, but Linux might always pair them.
> A 2k page might (or might not) cause grief.

Linux/m68k supports only 4 or 8 KiB page sizes, depending on the
MMU hardware, cfr. [1]. While the MC68851 MMU also supports page sizes
of 256 and 512 bytes, and 1, 2, 8, 16, and 32 KiB, that is not yet
supported by Linux.

I really doubt Linux will ever support pages smaller than 4 KiB...

[1] https://lore.kernel.org/all/[email protected]/#Z2e.:20240306141453.3900574-4-arnd::40kernel.org:1arch:m68k:Kconfig

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68korg

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds