2022-01-22 00:47:18

by Amadeusz Sławiński

[permalink] [raw]
Subject: [PATCH] x86: Preserve ACPI memory area during hibernation

When overriding NHLT ACPI-table tests show that on some platforms
there is problem that NHLT contains garbage after hibernation/resume
cycle.

Problem stems from the fact that ACPI override performs early memory
allocation using memblock_phys_alloc_range() in
memblock_phys_alloc_range(). This memory block is later being marked as
ACPI memory block in arch_reserve_mem_area(). Later when memory areas
are considered for hibernation it is being marked as nosave in
e820__register_nosave_regions().

Fix this by skipping ACPI memory area altogether when considering areas
to mark as nosave.

Signed-off-by: Amadeusz Sławiński <[email protected]>
Reviewed-by: Cezary Rojewski <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
arch/x86/kernel/e820.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index bc0657f0deed..88c1b785ffe4 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -758,6 +758,18 @@ void __init e820__register_nosave_regions(unsigned long limit_pfn)
for (i = 0; i < e820_table->nr_entries; i++) {
struct e820_entry *entry = &e820_table->entries[i];

+ /*
+ * Areas containing ACPI tables should be preserved during
+ * hibernation to prevent potential problems caused by BIOS
+ * upgrades when offline, as well as to preserve initrd
+ * ACPI tables overrides which are applied on boot.
+ * See also acpi_table_upgrade() & arch_reserve_mem_area()
+ */
+ if (entry->type == E820_TYPE_ACPI) {
+ pfn = PFN_UP(entry->addr + entry->size);
+ continue;
+ }
+
if (pfn < PFN_UP(entry->addr))
register_nosave_region(pfn, PFN_UP(entry->addr));

--
2.25.1


2022-02-14 21:24:03

by Wysocki, Rafael J

[permalink] [raw]
Subject: Re: [PATCH] x86: Preserve ACPI memory area during hibernation

On 1/21/2022 11:39 AM, Amadeusz Sławiński wrote:
> When overriding NHLT ACPI-table tests show that on some platforms
> there is problem that NHLT contains garbage after hibernation/resume
> cycle.
>
> Problem stems from the fact that ACPI override performs early memory
> allocation using memblock_phys_alloc_range() in
> memblock_phys_alloc_range(). This memory block is later being marked as
> ACPI memory block in arch_reserve_mem_area(). Later when memory areas
> are considered for hibernation it is being marked as nosave in
> e820__register_nosave_regions().
>
> Fix this by skipping ACPI memory area altogether when considering areas
> to mark as nosave.

This patch looks correct to me and I'm going to apply it as 5.18
material unless there are any objections or concerns (in which case
please let me know).


> Signed-off-by: Amadeusz Sławiński <[email protected]>
> Reviewed-by: Cezary Rojewski <[email protected]>
> Acked-by: Rafael J. Wysocki <[email protected]>
> ---
> arch/x86/kernel/e820.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index bc0657f0deed..88c1b785ffe4 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -758,6 +758,18 @@ void __init e820__register_nosave_regions(unsigned long limit_pfn)
> for (i = 0; i < e820_table->nr_entries; i++) {
> struct e820_entry *entry = &e820_table->entries[i];
>
> + /*
> + * Areas containing ACPI tables should be preserved during
> + * hibernation to prevent potential problems caused by BIOS
> + * upgrades when offline, as well as to preserve initrd
> + * ACPI tables overrides which are applied on boot.
> + * See also acpi_table_upgrade() & arch_reserve_mem_area()
> + */
> + if (entry->type == E820_TYPE_ACPI) {
> + pfn = PFN_UP(entry->addr + entry->size);
> + continue;
> + }
> +
> if (pfn < PFN_UP(entry->addr))
> register_nosave_region(pfn, PFN_UP(entry->addr));
>


2022-02-15 16:00:53

by Amadeusz Sławiński

[permalink] [raw]
Subject: Re: [PATCH] x86: Preserve ACPI memory area during hibernation

On 2/14/2022 8:34 PM, Rafael J. Wysocki wrote:
> On 1/21/2022 11:39 AM, Amadeusz Sławiński wrote:
>> When overriding NHLT ACPI-table tests show that on some platforms
>> there is problem that NHLT contains garbage after hibernation/resume
>> cycle.
>>
>> Problem stems from the fact that ACPI override performs early memory
>> allocation using memblock_phys_alloc_range() in
>> memblock_phys_alloc_range(). This memory block is later being marked as
>> ACPI memory block in arch_reserve_mem_area(). Later when memory areas
>> are considered for hibernation it is being marked as nosave in
>> e820__register_nosave_regions().
>>
>> Fix this by skipping ACPI memory area altogether when considering areas
>> to mark as nosave.
>
> This patch looks correct to me and I'm going to apply it as 5.18
> material unless there are any objections or concerns (in which case
> please let me know).
>
>

Well, what do you know? I've checked with validation team to make sure
that it works as expected and while it causes no problem on almost all
platforms and fixes problem with NHLT ACPI-table override, there is this
one platform where it causes oops on hibernation which of course is gone
after reverting the patch.

? set_direct_map_default_noflush+0x130/0x130
? memory_bm_test_bit+0x29/0x60
saveable_page+0xce/0xf2
count_data_pages+0x50/0x76
hibernate_preallocate_memory+0x9c/0x377
? __mutex_lock_slowpath+0x20/0x20
hibernation_snapshot+0x1cf/0x610
snapshot_ioctl+0x3d2/0x690
? snapshot_release+0xd0/0xd0
? new_sync_write+0x36b/0x390
__x64_sys_ioctl+0x6dc/0xe20
? vfs_fileattr_set+0x520/0x520
? _raw_read_unlock+0x2a/0x50
? __kasan_check_read+0x11/0x20
? vfs_write+0x131/0x3d0
? ksys_write+0x13b/0x170
? debug_smp_processor_id+0x17/0x20
? fpregs_assert_state_consistent+0x5f/0x70
? exit_to_user_mode_prepare+0x3e/0x170
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

Above trace points at functions using pfn, so I suspect there may be
need for some additional checks, but I will need to investigate.
I guess you can skip this patch for now, until I figure what exactly is
going on.

2022-02-15 20:13:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Preserve ACPI memory area during hibernation

On Tue, Feb 15, 2022 at 1:22 PM Amadeusz Sławiński
<[email protected]> wrote:
>
> On 2/14/2022 8:34 PM, Rafael J. Wysocki wrote:
> > On 1/21/2022 11:39 AM, Amadeusz Sławiński wrote:
> >> When overriding NHLT ACPI-table tests show that on some platforms
> >> there is problem that NHLT contains garbage after hibernation/resume
> >> cycle.
> >>
> >> Problem stems from the fact that ACPI override performs early memory
> >> allocation using memblock_phys_alloc_range() in
> >> memblock_phys_alloc_range(). This memory block is later being marked as
> >> ACPI memory block in arch_reserve_mem_area(). Later when memory areas
> >> are considered for hibernation it is being marked as nosave in
> >> e820__register_nosave_regions().
> >>
> >> Fix this by skipping ACPI memory area altogether when considering areas
> >> to mark as nosave.
> >
> > This patch looks correct to me and I'm going to apply it as 5.18
> > material unless there are any objections or concerns (in which case
> > please let me know).
> >
> >
>
> Well, what do you know? I've checked with validation team to make sure
> that it works as expected and while it causes no problem on almost all
> platforms and fixes problem with NHLT ACPI-table override, there is this
> one platform where it causes oops on hibernation which of course is gone
> after reverting the patch.
>
> ? set_direct_map_default_noflush+0x130/0x130
> ? memory_bm_test_bit+0x29/0x60
> saveable_page+0xce/0xf2
> count_data_pages+0x50/0x76
> hibernate_preallocate_memory+0x9c/0x377
> ? __mutex_lock_slowpath+0x20/0x20
> hibernation_snapshot+0x1cf/0x610
> snapshot_ioctl+0x3d2/0x690
> ? snapshot_release+0xd0/0xd0
> ? new_sync_write+0x36b/0x390
> __x64_sys_ioctl+0x6dc/0xe20
> ? vfs_fileattr_set+0x520/0x520
> ? _raw_read_unlock+0x2a/0x50
> ? __kasan_check_read+0x11/0x20
> ? vfs_write+0x131/0x3d0
> ? ksys_write+0x13b/0x170
> ? debug_smp_processor_id+0x17/0x20
> ? fpregs_assert_state_consistent+0x5f/0x70
> ? exit_to_user_mode_prepare+0x3e/0x170
> do_syscall_64+0x43/0x90
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Above trace points at functions using pfn, so I suspect there may be
> need for some additional checks, but I will need to investigate.
> I guess you can skip this patch for now, until I figure what exactly is
> going on.

I'll do that, thanks!