2023-08-29 02:00:34

by Kees Cook

[permalink] [raw]
Subject: Re: [GIT PULL] pstore updates for v6.6-rc1

On August 28, 2023 4:56:00 PM PDT, Linus Torvalds <[email protected]> wrote:
>On Mon, 28 Aug 2023 at 11:21, Kees Cook <[email protected]> wrote:
>>
>> Please pull these pstore updates for v6.6-rc1. This contains a fair bit
>> of code _removal_ which is always nice.
>
>Hmm. The diffstat certainly looks good, but the end result isn't great..
>
>I now get 124 lines of
>
> pstore: zlib_inflate() failed, ret = -5!
>
>in my bootup dmesg.
>
>Considering that there's no reason for pstore to even be active on
>this machine, I think it's because pstore now goes and tries to
>uncompress something entirely invalid.
>
>The message itself does not seem to be new, but with the switch from
>the crypto code, it apparently used to be
>
> crypto_comp_decompress failed, ret = %d!
>
>but the key word here is *apparently*. I never got that message
>before. So something else has changed, and I'm thinking that the old
>code probably didn't even try to decompress the bogus data it found?
>
>I dunno. But 124 lines of insane garbage in the kernel messages is not
>a good thing.

Oh dear! That's obviously unexpected. I have so many questions. :P

- does this happen at every boot? (I assume yes.)
- what CONFIG are you built with?
- what was the prior CONFIG?
- what backend is in use? (Or better yet, what does "dmesg | grep pstore" report?)
- are you using systemd?

Decompression is only attempted if it's a valid record. If the records aren't being removed after boot (i.e. unlinked from /sys/fs/pstore) they won't get cleared. Normally systemd-pstore moves everything to /var/lib/systemd/pstore. But that must not be happening since you keep seeing the warnings.

That you have 124 of these makes me think you've got the EFI backend (CONFIG_EFI_VARS_PSTORE) built and it's default enabled (CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=n). The latter config was created to keep the EFI backend from filling the EFI variable space. I think distros started setting it to "n" once systemd-pstore was added, which keeps the EFI variables from piling up...

So, I assume either systemd-pstore isn't running for you or something has gone sideways with it. And since I did testing of "changed compression type" without systemd-pstore, I bet systemd-pstore ignores the failed records...
https://github.com/systemd/systemd/blob/599a3124849819ba5af0a71b7572e87256814881/src/pstore/pstore.c#L225
Yup. Ugh. (Though I still find it odd that you have 124 records...)

Let me think about the best way to deal with this. I expect I'll have pstore wipe the failed records as it is expressly not expected to work across differing configs/kernel versions. And permanently spewing errors is not ok.

In the meantime, you can make the warnings go away with:

rm /sys/fs/pstore/*enc.z


--
Kees Cook