2024-06-06 15:28:02

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

Reserve unspecified location of physical memory from kernel command line

Background:

In ChromeOS, we have 1 MB of pstore ramoops reserved so that we can extract
dmesg output and some other information when a crash happens in the field.
(This is only done when the user selects "Allow Google to collect data for
improving the system"). But there are cases when there's a bug that
requires more data to be retrieved to figure out what is happening. We would
like to increase the pstore size, either temporarily, or maybe even
permanently. The pstore on these devices are at a fixed location in RAM (as
the RAM is not cleared on soft reboots nor crashes). The location is chosen
by the BIOS (coreboot) and passed to the kernel via ACPI tables on x86.
There's a driver that queries for this to initialize the pstore for
ChromeOS:

See drivers/platform/chrome/chromeos_pstore.c

Problem:

The problem is that, even though there's a process to change the kernel on
these systems, and is done regularly to install updates, the firmware is
updated much less frequently. Choosing the place in RAM also takes special
care, and may be in a different address for different boards. Updating the
size via firmware is a large effort and not something that many are willing
to do for a temporary pstore size change.

Requirement:

Need a way to reserve memory that will be at a consistent location for
every boot, if the kernel and system are the same. Does not need to work
if rebooting to a different kernel, or if the system can change the
memory layout between boots.

The reserved memory can not be an hard coded address, as the same kernel /
command line needs to run on several different machines. The picked memory
reservation just needs to be the same for a given machine, but may be
different for different machines.

Solution:

The solution I have come up with is to introduce a new "reserve_mem=" kernel
command line. This parameter takes the following format:

reserve_mem=nn:align:label

Where nn is the size of memory to reserve, the align is the alignment of
that memory, and label is the way for other sub-systems to find that memory.
This way the kernel command line could have:

reserve_mem=12M:4096:oops ramoops.mem_name=oops

At boot up, the kernel will search for 12 megabytes in usable memory regions
with an alignment of 4096. It will start at the highest regions and work its
way down (for those old devices that want access to lower address DMA). When
it finds a region, it will save it off in a small table and mark it with the
"oops" label. Then the pstore ramoops sub-system could ask for that memory
and location, and it will map itself there.

This prototype allows for 8 different mappings (which may be overkill, 4 is
probably plenty) with 16 byte size to store the label.

I have tested this and it works for us to solve the above problem. We can
update the kernel and command line and increase the size of pstore without
needing to update the firmware, or knowing every memory layout of each
board. I only tested this locally, it has not been tested in the field.

Changes since the v1: https://lore.kernel.org/all/[email protected]/

- Updated the change log of the first patch as well as added an entry
into kernel-parameters.txt about how reserve_mem is for soft reboots
and may not be reliable.


git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
reserve-mem

Head SHA1: 94c7d2d9093e9a7a899215c65adf28180d44a247


Steven Rostedt (Google) (2):
mm/memblock: Add "reserve_mem" to reserved named memory at boot up
pstore/ramoops: Add ramoops.mem_name= command line option

----
Documentation/admin-guide/kernel-parameters.txt | 20 +++++
fs/pstore/ram.c | 15 ++++
include/linux/mm.h | 2 +
mm/memblock.c | 97 +++++++++++++++++++++++++
4 files changed, 134 insertions(+)


2024-06-07 19:55:30

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

On 06/06/2024 12:01, Steven Rostedt wrote:
> Reserve unspecified location of physical memory from kernel command line
> [...]
> Solution:
>
> The solution I have come up with is to introduce a new "reserve_mem=" kernel
> command line. This parameter takes the following format:
>
> reserve_mem=nn:align:label
>
> Where nn is the size of memory to reserve, the align is the alignment of
> that memory, and label is the way for other sub-systems to find that memory.
> This way the kernel command line could have:
>
> reserve_mem=12M:4096:oops ramoops.mem_name=oops
>
> At boot up, the kernel will search for 12 megabytes in usable memory regions
> with an alignment of 4096. It will start at the highest regions and work its
> way down (for those old devices that want access to lower address DMA). When
> it finds a region, it will save it off in a small table and mark it with the
> "oops" label. Then the pstore ramoops sub-system could ask for that memory
> and location, and it will map itself there.
>
> This prototype allows for 8 different mappings (which may be overkill, 4 is
> probably plenty) with 16 byte size to store the label.
>
> I have tested this and it works for us to solve the above problem. We can
> update the kernel and command line and increase the size of pstore without
> needing to update the firmware, or knowing every memory layout of each
> board. I only tested this locally, it has not been tested in the field.
>

Hi Steve, first of all, thanks for this work! This is much appreciated.
The kdumpst tooling (Arch Linux) makes use of pstore when available, and
the recommendation so far was to reserve memory somehow, like "mem=" or
use kdump instead, if no free RAM area was available.

With your solution, things get way more "elegant". Also, I think we all
know pstore is not 100% reliable, specially the RAM backend due to
already mentioned reasons (like FW memory retraining, ECC memory, etc),
but it's great we have a mechanism to **try it**. If it works, awesome -
for statistical analysis, this is very useful; pstore has been used with
success in the Steam Deck, for example.

With all that said, I've tested your patches on top of 6.10-rc2 in 2
qemu VMs (one running legacy BIOS - seabios - and the other UEFI - using
ovmf) and on Steam Deck, and it's working flawlessly. I've tested only
using ramoops as module.

Some code review in the patches themselves (like a missing
EXPORT_SYMBOL_GPL), but all in all, that's a great addition! Feel free
to add my:

Tested-by: Guilherme G. Piccoli <[email protected]>

Thanks,


Guilherme

2024-06-10 17:02:15

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

On Fri, Jun 07, 2024 at 04:54:41PM -0300, Guilherme G. Piccoli wrote:
> On 06/06/2024 12:01, Steven Rostedt wrote:
> > Reserve unspecified location of physical memory from kernel command line
> > [...]
> > Solution:
> >
> > The solution I have come up with is to introduce a new "reserve_mem=" kernel
> > command line. This parameter takes the following format:
> >
> > reserve_mem=nn:align:label
> >
> > Where nn is the size of memory to reserve, the align is the alignment of
> > that memory, and label is the way for other sub-systems to find that memory.
> > This way the kernel command line could have:
> >
> > reserve_mem=12M:4096:oops ramoops.mem_name=oops
> >
> > At boot up, the kernel will search for 12 megabytes in usable memory regions
> > with an alignment of 4096. It will start at the highest regions and work its
> > way down (for those old devices that want access to lower address DMA). When
> > it finds a region, it will save it off in a small table and mark it with the
> > "oops" label. Then the pstore ramoops sub-system could ask for that memory
> > and location, and it will map itself there.
> >
> > This prototype allows for 8 different mappings (which may be overkill, 4 is
> > probably plenty) with 16 byte size to store the label.
> >
> > I have tested this and it works for us to solve the above problem. We can
> > update the kernel and command line and increase the size of pstore without
> > needing to update the firmware, or knowing every memory layout of each
> > board. I only tested this locally, it has not been tested in the field.
> >
>
> Hi Steve, first of all, thanks for this work! This is much appreciated.
> The kdumpst tooling (Arch Linux) makes use of pstore when available, and
> the recommendation so far was to reserve memory somehow, like "mem=" or
> use kdump instead, if no free RAM area was available.
>
> With your solution, things get way more "elegant". Also, I think we all
> know pstore is not 100% reliable, specially the RAM backend due to
> already mentioned reasons (like FW memory retraining, ECC memory, etc),
> but it's great we have a mechanism to **try it**. If it works, awesome -
> for statistical analysis, this is very useful; pstore has been used with
> success in the Steam Deck, for example.
>
> With all that said, I've tested your patches on top of 6.10-rc2 in 2
> qemu VMs (one running legacy BIOS - seabios - and the other UEFI - using
> ovmf) and on Steam Deck, and it's working flawlessly. I've tested only
> using ramoops as module.
>
> Some code review in the patches themselves (like a missing
> EXPORT_SYMBOL_GPL), but all in all, that's a great addition! Feel free
> to add my:
>
> Tested-by: Guilherme G. Piccoli <[email protected]>

Yeah, I think this looks good as long as it's understood to be a "best
effort", and will radically simplify doing qemu testing, etc. I expect I
can take v3 into -next with the fixes Guilherme noted.

-Kees

--
Kees Cook

2024-06-10 17:09:29

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

On Fri, 7 Jun 2024 16:54:41 -0300
"Guilherme G. Piccoli" <[email protected]> wrote:

> Some code review in the patches themselves (like a missing
> EXPORT_SYMBOL_GPL), but all in all, that's a great addition! Feel free
> to add my:
>
> Tested-by: Guilherme G. Piccoli <[email protected]>

Thanks a lot Guilherme! Much appreciated.

I'll send out a v3 with your comments addressed. And may even add parts
of this email in the change logs on how it does work in various cases.

-- Steve