2022-10-06 23:02:41

by Guilherme G. Piccoli

[permalink] [raw]
Subject: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

Currently this tuning is only exposed as a filesystem option,
but most Linux distros automatically mount pstore, hence changing
this setting requires remounting it. Also, if that mount option
wasn't explicitly set it doesn't show up in mount information,
so users cannot check what is the current value of kmsg_bytes.

Let's then expose it as a module parameter, allowing both user
visibility at all times (even if not manually set) and also the
possibility of setting that as a boot/module parameter.

Signed-off-by: Guilherme G. Piccoli <[email protected]>
---
fs/pstore/platform.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index c32957e4b256..be05090076ce 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -89,6 +89,11 @@ static char *compress =
module_param(compress, charp, 0444);
MODULE_PARM_DESC(compress, "compression to use");

+/* How much of the kernel log to snapshot */
+unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES;
+module_param(kmsg_bytes, ulong, 0444);
+MODULE_PARM_DESC(kmsg_bytes, "amount of kernel log to snapshot (in bytes)");
+
/* Compression parameters */
static struct crypto_comp *tfm;

@@ -100,9 +105,6 @@ struct pstore_zbackend {
static char *big_oops_buf;
static size_t big_oops_buf_sz;

-/* How much of the console log to snapshot */
-unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES;
-
void pstore_set_kmsg_bytes(int bytes)
{
kmsg_bytes = bytes;
--
2.38.0


2022-10-06 23:47:08

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

On Thu, Oct 06, 2022 at 07:42:06PM -0300, Guilherme G. Piccoli wrote:
> Currently this tuning is only exposed as a filesystem option,
> but most Linux distros automatically mount pstore, hence changing
> this setting requires remounting it. Also, if that mount option
> wasn't explicitly set it doesn't show up in mount information,
> so users cannot check what is the current value of kmsg_bytes.
>
> Let's then expose it as a module parameter, allowing both user
> visibility at all times (even if not manually set) and also the
> possibility of setting that as a boot/module parameter.

I've been meaning to do this too. :)

>
> Signed-off-by: Guilherme G. Piccoli <[email protected]>
> ---
> fs/pstore/platform.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index c32957e4b256..be05090076ce 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -89,6 +89,11 @@ static char *compress =
> module_param(compress, charp, 0444);
> MODULE_PARM_DESC(compress, "compression to use");
>
> +/* How much of the kernel log to snapshot */
> +unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES;
> +module_param(kmsg_bytes, ulong, 0444);
> +MODULE_PARM_DESC(kmsg_bytes, "amount of kernel log to snapshot (in bytes)");
> +
> /* Compression parameters */
> static struct crypto_comp *tfm;
>
> @@ -100,9 +105,6 @@ struct pstore_zbackend {
> static char *big_oops_buf;
> static size_t big_oops_buf_sz;
>
> -/* How much of the console log to snapshot */
> -unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES;
> -
> void pstore_set_kmsg_bytes(int bytes)
> {
> kmsg_bytes = bytes;
> --
> 2.38.0

Doing a mount will override the result, so I wonder if there should be
two variables, etc... not a concern for the normal use case.

Also, I've kind of wanted to get rid of a "default" for this and instead
use a value based on the compression vs record sizes, etc. But I didn't
explore it.

--
Kees Cook

2022-10-12 16:17:06

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

On 06/10/2022 20:32, Kees Cook wrote:
> [...]
> Doing a mount will override the result, so I wonder if there should be
> two variables, etc... not a concern for the normal use case.
>
> Also, I've kind of wanted to get rid of a "default" for this and instead
> use a value based on the compression vs record sizes, etc. But I didn't
> explore it.
>

For some reason I forgot to respond that, sorry!

I didn't understand exactly how the mount would override things; I've
done some tests:

(1) booted with the new kmsg_bytes module parameter set to 64k, and it
was preserved across multiple mount/umount cycles.

(2) When I manually had "-o kmsg_bytes=16k" set during the mount
operation, it worked as expected, setting the thing to 16k (and
reflecting in the module parameter, as observed in /sys/modules).

Maybe I'm missing something?

Now, regarding the idea of setting that as a function of
compression/record_sizes, I feel it makes sense and could be worked,
like a heuristic right?

In the end, if you think properly, what is the purpose of kmsg_bytes?
Wouldn't make sense to just fill the record_size with the maximum amount
of data it can handle? Of course there is the partitioning thing, but in
the end kmsg_bytes seems a mechanism to _restrict_ the data collection,
so maybe the default would be a value that means "save whatever you can
handle" (maybe 0), and if the parameter/mount option is set, then pstore
would restrict the saved size.
Thoughts?

Thanks,


Guilherme

2022-10-12 18:00:42

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

On Wed, Oct 12, 2022 at 12:33:36PM -0300, Guilherme G. Piccoli wrote:
> On 06/10/2022 20:32, Kees Cook wrote:
> > [...]
> > Doing a mount will override the result, so I wonder if there should be
> > two variables, etc... not a concern for the normal use case.
> >
> > Also, I've kind of wanted to get rid of a "default" for this and instead
> > use a value based on the compression vs record sizes, etc. But I didn't
> > explore it.
> >
>
> For some reason I forgot to respond that, sorry!
>
> I didn't understand exactly how the mount would override things; I've
> done some tests:
>
> (1) booted with the new kmsg_bytes module parameter set to 64k, and it
> was preserved across multiple mount/umount cycles.
>
> (2) When I manually had "-o kmsg_bytes=16k" set during the mount
> operation, it worked as expected, setting the thing to 16k (and
> reflecting in the module parameter, as observed in /sys/modules).

What I was imagining was the next step:

(3) umount, unload the backend, load a new backend, and mount it
without kmsg_bytes specified -- kmsg_bytes will be 16k, not 64k.

It's a pretty extreme corner-case, I realize. :) However, see below...

> In the end, if you think properly, what is the purpose of kmsg_bytes?
> Wouldn't make sense to just fill the record_size with the maximum amount
> of data it can handle? Of course there is the partitioning thing, but in
> the end kmsg_bytes seems a mechanism to _restrict_ the data collection,
> so maybe the default would be a value that means "save whatever you can
> handle" (maybe 0), and if the parameter/mount option is set, then pstore
> would restrict the saved size.

Right, kmsg_bytes is the maximum size to save from the console on a
crash. The design of the ram backend was to handle really small amounts
of persistent RAM -- if a single crash would eat all of it and possibly
wrap around, it could write over useful parts at the end (since it's
written from the end to the front). However, I think somewhere along
the way, stricter logic was added to the ram backend:

/*
* Explicitly only take the first part of any new crash.
* If our buffer is larger than kmsg_bytes, this can never happen,
* and if our buffer is smaller than kmsg_bytes, we don't want the
* report split across multiple records.
*/
if (record->part != 1)
return -ENOSPC;

This limits it to just a single record.

However, this does _not_ exist for other backends, so they will see up
to kmsg_bytes-size dumps split across psinfo->bufsize many records. For
the backends, this record size is not always fixed:

- efi uses 1024, even though it allocates 4096 (as was pointed out earlier)
- zone uses kmsg_bytes
- acpi-erst uses some ACPI value from ACPI_ERST_GET_ERROR_LENGTH
- ppc-nvram uses the configured size of nvram partition

Honestly, it seems like the 64k default is huge, but I don't think it
should be "unlimited" given the behaviors of ppc-nvram, and acpi-erst.
For ram and efi, it's effectively unlimited because of the small bufsizes
(and the "only 1 record" logic in ram).

Existing documentation I can find online seem to imply making it smaller
(8000 bytes[1], 16000 bytes), but without justification. Even the "main"
documentation[2] doesn't mention it.

-Kees

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/pstore
[2] https://docs.kernel.org/admin-guide/ramoops.html

--
Kees Cook

2022-11-01 19:32:55

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

Hi Kees, my apologies for the (big) delay in answering that! I kept it
marked to respond after tests, ended-up forgetting, but now did all the
tests and finally I'm able to respond.


On 12/10/2022 14:58, Kees Cook wrote:
> [...]
>> I didn't understand exactly how the mount would override things; I've
>> done some tests:
>>
>> (1) booted with the new kmsg_bytes module parameter set to 64k, and it
>> was preserved across multiple mount/umount cycles.
>>
>> (2) When I manually had "-o kmsg_bytes=16k" set during the mount
>> operation, it worked as expected, setting the thing to 16k (and
>> reflecting in the module parameter, as observed in /sys/modules).
>
> What I was imagining was the next step:
>
> (3) umount, unload the backend, load a new backend, and mount it
> without kmsg_bytes specified -- kmsg_bytes will be 16k, not 64k.
>
> It's a pretty extreme corner-case, I realize. :) However, see below...

Oh okay, thanks for pointing that! Indeed, in your test-case I've faced
the issue of the retained kmsg_bytes...although, I agree it's an extreme
corner-case heheh



> [...]
> Right, kmsg_bytes is the maximum size to save from the console on a
> crash. The design of the ram backend was to handle really small amounts
> of persistent RAM -- if a single crash would eat all of it and possibly
> wrap around, it could write over useful parts at the end (since it's
> written from the end to the front). However, I think somewhere along
> the way, stricter logic was added to the ram backend:
>
> /*
> * Explicitly only take the first part of any new crash.
> * If our buffer is larger than kmsg_bytes, this can never happen,
> * and if our buffer is smaller than kmsg_bytes, we don't want the
> * report split across multiple records.
> */
> if (record->part != 1)
> return -ENOSPC;
>
> This limits it to just a single record.

Indeed, and I already considered that in the past...why was that
restricted to a single record, right? I had plans to change it, lemme
know your thoughts. (Reference:
https://lore.kernel.org/linux-fsdevel/[email protected]/)



> However, this does _not_ exist for other backends, so they will see up
> to kmsg_bytes-size dumps split across psinfo->bufsize many records. For
> the backends, this record size is not always fixed:
>
> - efi uses 1024, even though it allocates 4096 (as was pointed out earlier)
> - zone uses kmsg_bytes
> - acpi-erst uses some ACPI value from ACPI_ERST_GET_ERROR_LENGTH
> - ppc-nvram uses the configured size of nvram partition
>
> Honestly, it seems like the 64k default is huge, but I don't think it
> should be "unlimited" given the behaviors of ppc-nvram, and acpi-erst.
> For ram and efi, it's effectively unlimited because of the small bufsizes
> (and the "only 1 record" logic in ram).
>
> Existing documentation I can find online seem to imply making it smaller
> (8000 bytes[1], 16000 bytes), but without justification. Even the "main"
> documentation[2] doesn't mention it.

Right! Also, on top of that, there is a kind of "tricky" logic in which
this value is not always respected. For example, in the Steam Deck case
we have a region of ~10MB, and set record size of the ramoops backend to
2MB. This is the amount collected, it doesn't respect kmsg_bytes, since
it checks the amount dumped vs kmsg_bytes effectively _after_ the first
part is written (which in the ramoops case, it's a single write), hence
this check is never "respected" there. I don't consider that as a bug,
more a flexibility for the ramoops case heh

In any way, lemme know if you want to have a "revamp" in the meaning of
kmsg_bytes, I'd be glad in discuss/work on that =)

Thanks,


Guilherme