2020-01-30 17:34:04

by Jörg Otte

[permalink] [raw]
Subject: 5.6-### doesn't boot

Hi,
my notebook doesn't boot with current kernel. Booting stops right after
displaying "loading initial ramdisk..". No further displays.
Also nothing is wriiten to the logs.

last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d

Thanks, Jörg


2020-01-30 18:10:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: 5.6-### doesn't boot

On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
>
> my notebook doesn't boot with current kernel. Booting stops right after
> displaying "loading initial ramdisk..". No further displays.
> Also nothing is wriiten to the logs.
>
> last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d

It would be lovely if you can bisect a bit. But my merges in that
range are all from Ingo:

Ingo Molnar (7):
header cleanup
objtool updates
RCU updates
EFI updates
locking updates
perf updates
scheduler updates

but not having any messages at all makes it hard to guess where it would be.

A few bisect runs would narrow it down a fair amount. Bisecting all
the way would be even better, of course,

Linus

2020-01-30 19:08:44

by Andy Lutomirski

[permalink] [raw]
Subject: Re: 5.6-### doesn't boot



> On Jan 30, 2020, at 10:08 AM, Linus Torvalds <[email protected]> wrote:
>
> On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
>>
>> my notebook doesn't boot with current kernel. Booting stops right after
>> displaying "loading initial ramdisk..". No further displays.
>> Also nothing is wriiten to the logs.
>>
>> last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
>> first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
>
> It would be lovely if you can bisect a bit. But my merges in that
> range are all from Ingo:
>
> Ingo Molnar (7):
> header cleanup
> objtool updates
> RCU updates
> EFI updates
> locking updates
> perf updates
> scheduler updates
>
> but not having any messages at all makes it hard to guess where it would be.
>
> A few bisect runs would narrow it down a fair amount. Bisecting all
> the way would be even better, of course,
>
>

It would also be nice to know: are you EFI-booting or BIOS-booting? And are you using EFI mixed mode? (That is, are you booting a 64-bit kernel using 32-bit EFI?)

2020-01-31 06:45:40

by Ingo Molnar

[permalink] [raw]
Subject: Re: 5.6-### doesn't boot


* Linus Torvalds <[email protected]> wrote:

> On Thu, Jan 30, 2020 at 9:32 AM J?rg Otte <[email protected]> wrote:
> >
> > my notebook doesn't boot with current kernel. Booting stops right after
> > displaying "loading initial ramdisk..". No further displays.
> > Also nothing is wriiten to the logs.
> >
> > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
>
> It would be lovely if you can bisect a bit. But my merges in that
> range are all from Ingo:
>
> Ingo Molnar (7):
> header cleanup
> objtool updates
> RCU updates
> EFI updates
> locking updates
> perf updates
> scheduler updates

If I had to guess then perhaps the EFI changes look the most dangerous
ones from these trees - but in principle most of these trees could
contain a boot crasher/hang bug.

> but not having any messages at all makes it hard to guess where it
> would be.

To improve debug output:

Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
the boot options might improve the debug output:

earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn

So for example if the relevant kernel boot entry in grub.cfg looks like
this:

linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff

Then editing it to the following could in principle produce (much) more
verbose boot output:

linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff

If this produces more output than just "loading initial ramdisk..' then a
photo of the hung screen would be sufficient, no need to transcribe it.

> A few bisect runs would narrow it down a fair amount. Bisecting all the
> way would be even better, of course,

Agreed!

If compiling full kernels for bisections takes too long (for example
because the .config is from a distro kernel) then running "make
localmodconfig" to create a config tailored to the currently active
modules will cut down significantly on build time.

Also, a warning: if the normal boot log contains spurious warnings then
the new 'panic_on_warn' option will cause additional trouble on good
kernels.

Thanks,

Ingo

2020-01-31 10:37:39

by Jörg Otte

[permalink] [raw]
Subject: Re: 5.6-### doesn't boot

Am Fr., 31. Jan. 2020 um 07:43 Uhr schrieb Ingo Molnar <[email protected]>:
>
>
> * Linus Torvalds <[email protected]> wrote:
>
> > On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
> > >
> > > my notebook doesn't boot with current kernel. Booting stops right after
> > > displaying "loading initial ramdisk..". No further displays.
> > > Also nothing is wriiten to the logs.
> > >
> > > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
> >
> > It would be lovely if you can bisect a bit. But my merges in that
> > range are all from Ingo:
> >
> > Ingo Molnar (7):
> > header cleanup
> > objtool updates
> > RCU updates
> > EFI updates
> > locking updates
> > perf updates
> > scheduler updates
>
> If I had to guess then perhaps the EFI changes look the most dangerous
> ones from these trees - but in principle most of these trees could
> contain a boot crasher/hang bug.
>
> > but not having any messages at all makes it hard to guess where it
> > would be.
>
> To improve debug output:
>
> Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
> the boot options might improve the debug output:
>
> earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn
>
> So for example if the relevant kernel boot entry in grub.cfg looks like
> this:
>
> linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff
>
> Then editing it to the following could in principle produce (much) more
> verbose boot output:
>
> linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff
>
> If this produces more output than just "loading initial ramdisk..' then a
> photo of the hung screen would be sufficient, no need to transcribe it.
>
> > A few bisect runs would narrow it down a fair amount. Bisecting all the
> > way would be even better, of course,
>
> Agreed!
>
> If compiling full kernels for bisections takes too long (for example
> because the .config is from a distro kernel) then running "make
> localmodconfig" to create a config tailored to the currently active
> modules will cut down significantly on build time.
>
> Also, a warning: if the normal boot log contains spurious warnings then
> the new 'panic_on_warn' option will cause additional trouble on good
> kernels.

It's bisected.
The first bad commit is :
1db91035d01aa8bfa2350c00ccb63d629b4041ad
efi: Add tracking for dynamically allocated memmaps

Unfortunately I can not revert because of compile errors!

In file included from /media/jojo/deftoshiba/kernel/linux/init/main.c:48:
/media/jojo/deftoshiba/kernel/linux/include/linux/efi.h:975:1: error:
version control conflict marker in file
<<<<<<< HEAD
^~~~~~~
/media/jojo/deftoshiba/kernel/linux/include/linux/efi.h:980:1: error:
version control conflict marker in file
=======
^~~~~~~
/media/jojo/deftoshiba/kernel/linux/include/linux/efi.h:982:1: error:
version control conflict marker in file
>>>>>>> parent of 1db91035d01a... efi: Add tracking for dynamically allocated memmaps


....

2020-01-31 18:39:07

by Ingo Molnar

[permalink] [raw]
Subject: EFI boot crash regression (was: Re: 5.6-### doesn't boot)


(Cc:ed Dan and Ard)

* J?rg Otte <[email protected]> wrote:

> Am Fr., 31. Jan. 2020 um 07:43 Uhr schrieb Ingo Molnar <[email protected]>:
> >
> >
> > * Linus Torvalds <[email protected]> wrote:
> >
> > > On Thu, Jan 30, 2020 at 9:32 AM J?rg Otte <[email protected]> wrote:
> > > >
> > > > my notebook doesn't boot with current kernel. Booting stops right after
> > > > displaying "loading initial ramdisk..". No further displays.
> > > > Also nothing is wriiten to the logs.
> > > >
> > > > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > > > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
> > >
> > > It would be lovely if you can bisect a bit. But my merges in that
> > > range are all from Ingo:
> > >
> > > Ingo Molnar (7):
> > > header cleanup
> > > objtool updates
> > > RCU updates
> > > EFI updates
> > > locking updates
> > > perf updates
> > > scheduler updates
> >
> > If I had to guess then perhaps the EFI changes look the most dangerous
> > ones from these trees - but in principle most of these trees could
> > contain a boot crasher/hang bug.
> >
> > > but not having any messages at all makes it hard to guess where it
> > > would be.
> >
> > To improve debug output:
> >
> > Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
> > the boot options might improve the debug output:
> >
> > earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn
> >
> > So for example if the relevant kernel boot entry in grub.cfg looks like
> > this:
> >
> > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff
> >
> > Then editing it to the following could in principle produce (much) more
> > verbose boot output:
> >
> > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff
> >
> > If this produces more output than just "loading initial ramdisk..' then a
> > photo of the hung screen would be sufficient, no need to transcribe it.
> >
> > > A few bisect runs would narrow it down a fair amount. Bisecting all the
> > > way would be even better, of course,
> >
> > Agreed!
> >
> > If compiling full kernels for bisections takes too long (for example
> > because the .config is from a distro kernel) then running "make
> > localmodconfig" to create a config tailored to the currently active
> > modules will cut down significantly on build time.
> >
> > Also, a warning: if the normal boot log contains spurious warnings then
> > the new 'panic_on_warn' option will cause additional trouble on good
> > kernels.
>
> It's bisected.
> The first bad commit is :
> 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> efi: Add tracking for dynamically allocated memmaps

Thanks a ton, that's very useful!

I've Cc:-ed the EFI gents who are developing this code, maybe they'll
spot the bug.

Thanks,

Ingo

2020-01-31 18:50:28

by Dan Williams

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Fri, Jan 31, 2020 at 10:37 AM Ingo Molnar <[email protected]> wrote:
>
>
> (Cc:ed Dan and Ard)
>
> * Jörg Otte <[email protected]> wrote:
>
> > Am Fr., 31. Jan. 2020 um 07:43 Uhr schrieb Ingo Molnar <[email protected]>:
> > >
> > >
> > > * Linus Torvalds <[email protected]> wrote:
> > >
> > > > On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
> > > > >
> > > > > my notebook doesn't boot with current kernel. Booting stops right after
> > > > > displaying "loading initial ramdisk..". No further displays.
> > > > > Also nothing is wriiten to the logs.
> > > > >
> > > > > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > > > > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
> > > >
> > > > It would be lovely if you can bisect a bit. But my merges in that
> > > > range are all from Ingo:
> > > >
> > > > Ingo Molnar (7):
> > > > header cleanup
> > > > objtool updates
> > > > RCU updates
> > > > EFI updates
> > > > locking updates
> > > > perf updates
> > > > scheduler updates
> > >
> > > If I had to guess then perhaps the EFI changes look the most dangerous
> > > ones from these trees - but in principle most of these trees could
> > > contain a boot crasher/hang bug.
> > >
> > > > but not having any messages at all makes it hard to guess where it
> > > > would be.
> > >
> > > To improve debug output:
> > >
> > > Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
> > > the boot options might improve the debug output:
> > >
> > > earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn
> > >
> > > So for example if the relevant kernel boot entry in grub.cfg looks like
> > > this:
> > >
> > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff
> > >
> > > Then editing it to the following could in principle produce (much) more
> > > verbose boot output:
> > >
> > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff
> > >
> > > If this produces more output than just "loading initial ramdisk..' then a
> > > photo of the hung screen would be sufficient, no need to transcribe it.
> > >
> > > > A few bisect runs would narrow it down a fair amount. Bisecting all the
> > > > way would be even better, of course,
> > >
> > > Agreed!
> > >
> > > If compiling full kernels for bisections takes too long (for example
> > > because the .config is from a distro kernel) then running "make
> > > localmodconfig" to create a config tailored to the currently active
> > > modules will cut down significantly on build time.
> > >
> > > Also, a warning: if the normal boot log contains spurious warnings then
> > > the new 'panic_on_warn' option will cause additional trouble on good
> > > kernels.
> >
> > It's bisected.
> > The first bad commit is :
> > 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> > efi: Add tracking for dynamically allocated memmaps
>
> Thanks a ton, that's very useful!
>
> I've Cc:-ed the EFI gents who are developing this code, maybe they'll
> spot the bug.

I'll take a look. Jörg, can you paste a full dmesg from a good boot?

2020-02-01 09:48:19

by Ingo Molnar

[permalink] [raw]
Subject: [TEST PATCH RFC] Revert the EFI leak fixes for now (was: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot))



J?rg Otte wrote:

> It's bisected.
> The first bad commit is :
> 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> efi: Add tracking for dynamically allocated memmaps

> Unfortunately I can not revert because of compile errors!
>
> In file included from /media/jojo/deftoshiba/kernel/linux/init/main.c:48:
> /media/jojo/deftoshiba/kernel/linux/include/linux/efi.h:975:1: error:
> version control conflict marker in file
> <<<<<<< HEAD

So 1db91035d0 doesn't revert cleanly, because 484a418d0754 depends on it,
plus there a third commit (f0ef6523475f) that has a semantic dependency
on 1db91035d01a.

But you can revert them all, if done in reverse chronological order:

git revert f0ef6523475f # ("efi: Fix efi_memmap_alloc() leaks")
git revert 484a418d0754 # ("efi: Fix handling of multiple efi_fake_mem= entries")
git revert 1db91035d01a # ("efi: Add tracking for dynamically allocated memmaps")

You can paste those three lines into a shell as-is, or you can apply the
patch below which is a combination of these three reverts.

J?rg, can you confirm that doing these reverts fixes booting on your
system? If it does then a full dmesg from that kernel would be useful.

FWIW I reviewed the bisected commit and didn't find the bug but I also
couldn't convince myself it's a 1:1 identity transformation and cleanup
of the existing logic.

The size arithmethics transformation looks correct at first sight, but
the data->flags handling in particular looks rather interwoven.

Thanks,

Ingo

============================>

arch/x86/platform/efi/efi.c | 10 ++----
arch/x86/platform/efi/quirks.c | 23 +++++++------
drivers/firmware/efi/fake_mem.c | 43 ++++++++++++------------
drivers/firmware/efi/memmap.c | 72 ++++++++++++-----------------------------
include/linux/efi.h | 13 +++-----
5 files changed, 63 insertions(+), 98 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 59f7f6d60cf6..4e46d2d24352 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -304,16 +304,10 @@ static void __init efi_clean_memmap(void)
}

if (n_removal > 0) {
- struct efi_memory_map_data data = {
- .phys_map = efi.memmap.phys_map,
- .desc_version = efi.memmap.desc_version,
- .desc_size = efi.memmap.desc_size,
- .size = data.desc_size * (efi.memmap.nr_map - n_removal),
- .flags = 0,
- };
+ u64 size = efi.memmap.nr_map - n_removal;

pr_warn("Removing %d invalid memory map entries.\n", n_removal);
- efi_memmap_install(&data);
+ efi_memmap_install(efi.memmap.phys_map, size);
}
}

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 88d32c06cffa..86b44289ac55 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -244,7 +244,7 @@ EXPORT_SYMBOL_GPL(efi_query_variable_store);
*/
void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
{
- struct efi_memory_map_data data = { 0 };
+ phys_addr_t new_phys, new_size;
struct efi_mem_range mr;
efi_memory_desc_t md;
int num_entries;
@@ -272,21 +272,24 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
num_entries = efi_memmap_split_count(&md, &mr.range);
num_entries += efi.memmap.nr_map;

- if (efi_memmap_alloc(num_entries, &data) != 0) {
+ new_size = efi.memmap.desc_size * num_entries;
+
+ new_phys = efi_memmap_alloc(num_entries);
+ if (!new_phys) {
pr_err("Could not allocate boot services memmap\n");
return;
}

- new = early_memremap(data.phys_map, data.size);
+ new = early_memremap(new_phys, new_size);
if (!new) {
pr_err("Failed to map new boot services memmap\n");
return;
}

efi_memmap_insert(&efi.memmap, new, &mr);
- early_memunmap(new, data.size);
+ early_memunmap(new, new_size);

- efi_memmap_install(&data);
+ efi_memmap_install(new_phys, num_entries);
e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
e820__update_table(e820_table);
}
@@ -405,7 +408,7 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)

void __init efi_free_boot_services(void)
{
- struct efi_memory_map_data data = { 0 };
+ phys_addr_t new_phys, new_size;
efi_memory_desc_t *md;
int num_entries = 0;
void *new, *new_md;
@@ -460,12 +463,14 @@ void __init efi_free_boot_services(void)
if (!num_entries)
return;

- if (efi_memmap_alloc(num_entries, &data) != 0) {
+ new_size = efi.memmap.desc_size * num_entries;
+ new_phys = efi_memmap_alloc(num_entries);
+ if (!new_phys) {
pr_err("Failed to allocate new EFI memmap\n");
return;
}

- new = memremap(data.phys_map, data.size, MEMREMAP_WB);
+ new = memremap(new_phys, new_size, MEMREMAP_WB);
if (!new) {
pr_err("Failed to map new EFI memmap\n");
return;
@@ -489,7 +494,7 @@ void __init efi_free_boot_services(void)

memunmap(new);

- if (efi_memmap_install(&data) != 0) {
+ if (efi_memmap_install(new_phys, num_entries)) {
pr_err("Could not install new EFI memmap\n");
return;
}
diff --git a/drivers/firmware/efi/fake_mem.c b/drivers/firmware/efi/fake_mem.c
index 6e0f34a38171..bb9fc70d0cfa 100644
--- a/drivers/firmware/efi/fake_mem.c
+++ b/drivers/firmware/efi/fake_mem.c
@@ -34,45 +34,46 @@ static int __init cmp_fake_mem(const void *x1, const void *x2)
return 0;
}

-static void __init efi_fake_range(struct efi_mem_range *efi_range)
+void __init efi_fake_memmap(void)
{
- struct efi_memory_map_data data = { 0 };
int new_nr_map = efi.memmap.nr_map;
efi_memory_desc_t *md;
+ phys_addr_t new_memmap_phy;
void *new_memmap;
+ int i;
+
+ if (!efi_enabled(EFI_MEMMAP) || !nr_fake_mem)
+ return;

/* count up the number of EFI memory descriptor */
- for_each_efi_memory_desc(md)
- new_nr_map += efi_memmap_split_count(md, &efi_range->range);
+ for (i = 0; i < nr_fake_mem; i++) {
+ for_each_efi_memory_desc(md) {
+ struct range *r = &efi_fake_mems[i].range;
+
+ new_nr_map += efi_memmap_split_count(md, r);
+ }
+ }

/* allocate memory for new EFI memmap */
- if (efi_memmap_alloc(new_nr_map, &data) != 0)
+ new_memmap_phy = efi_memmap_alloc(new_nr_map);
+ if (!new_memmap_phy)
return;

/* create new EFI memmap */
- new_memmap = early_memremap(data.phys_map, data.size);
+ new_memmap = early_memremap(new_memmap_phy,
+ efi.memmap.desc_size * new_nr_map);
if (!new_memmap) {
- __efi_memmap_free(data.phys_map, data.size, data.flags);
+ memblock_free(new_memmap_phy, efi.memmap.desc_size * new_nr_map);
return;
}

- efi_memmap_insert(&efi.memmap, new_memmap, efi_range);
+ for (i = 0; i < nr_fake_mem; i++)
+ efi_memmap_insert(&efi.memmap, new_memmap, &efi_fake_mems[i]);

/* swap into new EFI memmap */
- early_memunmap(new_memmap, data.size);
-
- efi_memmap_install(&data);
-}
-
-void __init efi_fake_memmap(void)
-{
- int i;
+ early_memunmap(new_memmap, efi.memmap.desc_size * new_nr_map);

- if (!efi_enabled(EFI_MEMMAP) || !nr_fake_mem)
- return;
-
- for (i = 0; i < nr_fake_mem; i++)
- efi_fake_range(&efi_fake_mems[i]);
+ efi_memmap_install(new_memmap_phy, new_nr_map);

/* print new EFI memmap */
efi_print_memmap();
diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index 2ff1883dc788..4f98eb12c172 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -29,32 +29,9 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
return PFN_PHYS(page_to_pfn(p));
}

-void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
-{
- if (flags & EFI_MEMMAP_MEMBLOCK) {
- if (slab_is_available())
- memblock_free_late(phys, size);
- else
- memblock_free(phys, size);
- } else if (flags & EFI_MEMMAP_SLAB) {
- struct page *p = pfn_to_page(PHYS_PFN(phys));
- unsigned int order = get_order(size);
-
- free_pages((unsigned long) page_address(p), order);
- }
-}
-
-static void __init efi_memmap_free(void)
-{
- __efi_memmap_free(efi.memmap.phys_map,
- efi.memmap.desc_size * efi.memmap.nr_map,
- efi.memmap.flags);
-}
-
/**
* efi_memmap_alloc - Allocate memory for the EFI memory map
* @num_entries: Number of entries in the allocated map.
- * @data: efi memmap installation parameters
*
* Depending on whether mm_init() has already been invoked or not,
* either memblock or "normal" page allocation is used.
@@ -62,29 +39,14 @@ static void __init efi_memmap_free(void)
* Returns the physical address of the allocated memory map on
* success, zero on failure.
*/
-int __init efi_memmap_alloc(unsigned int num_entries,
- struct efi_memory_map_data *data)
+phys_addr_t __init efi_memmap_alloc(unsigned int num_entries)
{
- /* Expect allocation parameters are zero initialized */
- WARN_ON(data->phys_map || data->size);
-
- data->size = num_entries * efi.memmap.desc_size;
- data->desc_version = efi.memmap.desc_version;
- data->desc_size = efi.memmap.desc_size;
- data->flags &= ~(EFI_MEMMAP_SLAB | EFI_MEMMAP_MEMBLOCK);
- data->flags |= efi.memmap.flags & EFI_MEMMAP_LATE;
-
- if (slab_is_available()) {
- data->flags |= EFI_MEMMAP_SLAB;
- data->phys_map = __efi_memmap_alloc_late(data->size);
- } else {
- data->flags |= EFI_MEMMAP_MEMBLOCK;
- data->phys_map = __efi_memmap_alloc_early(data->size);
- }
+ unsigned long size = num_entries * efi.memmap.desc_size;

- if (!data->phys_map)
- return -ENOMEM;
- return 0;
+ if (slab_is_available())
+ return __efi_memmap_alloc_late(size);
+
+ return __efi_memmap_alloc_early(size);
}

/**
@@ -102,7 +64,8 @@ int __init efi_memmap_alloc(unsigned int num_entries,
*
* Returns zero on success, a negative error code on failure.
*/
-static int __init __efi_memmap_init(struct efi_memory_map_data *data)
+static int __init
+__efi_memmap_init(struct efi_memory_map_data *data)
{
struct efi_memory_map map;
phys_addr_t phys_map;
@@ -122,9 +85,6 @@ static int __init __efi_memmap_init(struct efi_memory_map_data *data)
return -ENOMEM;
}

- /* NOP if data->flags & (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB) == 0 */
- efi_memmap_free();
-
map.phys_map = data->phys_map;
map.nr_map = data->size / data->desc_size;
map.map_end = map.map + data->size;
@@ -224,7 +184,8 @@ int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size)

/**
* efi_memmap_install - Install a new EFI memory map in efi.memmap
- * @ctx: map allocation parameters (address, size, flags)
+ * @addr: Physical address of the memory map
+ * @nr_map: Number of entries in the memory map
*
* Unlike efi_memmap_init_*(), this function does not allow the caller
* to switch from early to late mappings. It simply uses the existing
@@ -232,11 +193,20 @@ int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size)
*
* Returns zero on success, a negative error code on failure.
*/
-int __init efi_memmap_install(struct efi_memory_map_data *data)
+int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map)
{
+ struct efi_memory_map_data data;
+ unsigned long flags;
+
efi_memmap_unmap();

- return __efi_memmap_init(data);
+ data.phys_map = addr;
+ data.size = efi.memmap.desc_size * nr_map;
+ data.desc_version = efi.memmap.desc_version;
+ data.desc_size = efi.memmap.desc_size;
+ data.flags = efi.memmap.flags & EFI_MEMMAP_LATE;
+
+ return __efi_memmap_init(&data);
}

/**
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 7efd7072cca5..f117d68c314e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -759,8 +759,8 @@ typedef union {

/*
* Architecture independent structure for describing a memory map for the
- * benefit of efi_memmap_init_early(), and for passing context between
- * efi_memmap_alloc() and efi_memmap_install().
+ * benefit of efi_memmap_init_early(), saving us the need to pass four
+ * parameters.
*/
struct efi_memory_map_data {
phys_addr_t phys_map;
@@ -778,8 +778,6 @@ struct efi_memory_map {
unsigned long desc_version;
unsigned long desc_size;
#define EFI_MEMMAP_LATE (1UL << 0)
-#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
-#define EFI_MEMMAP_SLAB (1UL << 2)
unsigned long flags;
};

@@ -974,14 +972,11 @@ static inline efi_status_t efi_query_variable_store(u32 attributes,
#endif
extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);

-extern int __init efi_memmap_alloc(unsigned int num_entries,
- struct efi_memory_map_data *data);
-extern void __efi_memmap_free(u64 phys, unsigned long size,
- unsigned long flags);
+extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries);
extern int __init efi_memmap_init_early(struct efi_memory_map_data *data);
extern int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size);
extern void __init efi_memmap_unmap(void);
-extern int __init efi_memmap_install(struct efi_memory_map_data *data);
+extern int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map);
extern int __init efi_memmap_split_count(efi_memory_desc_t *md,
struct range *range);
extern void __init efi_memmap_insert(struct efi_memory_map *old_memmap,

2020-02-01 15:37:57

by Jörg Otte

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

Am Fr., 31. Jan. 2020 um 19:48 Uhr schrieb Dan Williams
<[email protected]>:
>
> On Fri, Jan 31, 2020 at 10:37 AM Ingo Molnar <[email protected]> wrote:
> >
> >
> > (Cc:ed Dan and Ard)
> >
> > * Jörg Otte <[email protected]> wrote:
> >
> > > Am Fr., 31. Jan. 2020 um 07:43 Uhr schrieb Ingo Molnar <[email protected]>:
> > > >
> > > >
> > > > * Linus Torvalds <[email protected]> wrote:
> > > >
> > > > > On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
> > > > > >
> > > > > > my notebook doesn't boot with current kernel. Booting stops right after
> > > > > > displaying "loading initial ramdisk..". No further displays.
> > > > > > Also nothing is wriiten to the logs.
> > > > > >
> > > > > > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > > > > > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
> > > > >
> > > > > It would be lovely if you can bisect a bit. But my merges in that
> > > > > range are all from Ingo:
> > > > >
> > > > > Ingo Molnar (7):
> > > > > header cleanup
> > > > > objtool updates
> > > > > RCU updates
> > > > > EFI updates
> > > > > locking updates
> > > > > perf updates
> > > > > scheduler updates
> > > >
> > > > If I had to guess then perhaps the EFI changes look the most dangerous
> > > > ones from these trees - but in principle most of these trees could
> > > > contain a boot crasher/hang bug.
> > > >
> > > > > but not having any messages at all makes it hard to guess where it
> > > > > would be.
> > > >
> > > > To improve debug output:
> > > >
> > > > Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
> > > > the boot options might improve the debug output:
> > > >
> > > > earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn
> > > >
> > > > So for example if the relevant kernel boot entry in grub.cfg looks like
> > > > this:
> > > >
> > > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff
> > > >
> > > > Then editing it to the following could in principle produce (much) more
> > > > verbose boot output:
> > > >
> > > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff
> > > >
> > > > If this produces more output than just "loading initial ramdisk..' then a
> > > > photo of the hung screen would be sufficient, no need to transcribe it.
> > > >
> > > > > A few bisect runs would narrow it down a fair amount. Bisecting all the
> > > > > way would be even better, of course,
> > > >
> > > > Agreed!
> > > >
> > > > If compiling full kernels for bisections takes too long (for example
> > > > because the .config is from a distro kernel) then running "make
> > > > localmodconfig" to create a config tailored to the currently active
> > > > modules will cut down significantly on build time.
> > > >
> > > > Also, a warning: if the normal boot log contains spurious warnings then
> > > > the new 'panic_on_warn' option will cause additional trouble on good
> > > > kernels.
> > >
> > > It's bisected.
> > > The first bad commit is :
> > > 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> > > efi: Add tracking for dynamically allocated memmaps
> >
> > Thanks a ton, that's very useful!
> >
> > I've Cc:-ed the EFI gents who are developing this code, maybe they'll
> > spot the bug.
>
> I'll take a look. Jörg, can you paste a full dmesg from a good boot?

Here it is.

Thanks, Jörg


Attachments:
good-dmesg (49.03 kB)

2020-02-01 16:11:35

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sat, 1 Feb 2020 at 16:35, Jörg Otte <[email protected]> wrote:
>
> Am Fr., 31. Jan. 2020 um 19:48 Uhr schrieb Dan Williams
> <[email protected]>:
> >
> > On Fri, Jan 31, 2020 at 10:37 AM Ingo Molnar <[email protected]> wrote:
> > >
> > >
> > > (Cc:ed Dan and Ard)
> > >
> > > * Jörg Otte <[email protected]> wrote:
> > >
> > > > Am Fr., 31. Jan. 2020 um 07:43 Uhr schrieb Ingo Molnar <[email protected]>:
> > > > >
> > > > >
> > > > > * Linus Torvalds <[email protected]> wrote:
> > > > >
> > > > > > On Thu, Jan 30, 2020 at 9:32 AM Jörg Otte <[email protected]> wrote:
> > > > > > >
> > > > > > > my notebook doesn't boot with current kernel. Booting stops right after
> > > > > > > displaying "loading initial ramdisk..". No further displays.
> > > > > > > Also nothing is wriiten to the logs.
> > > > > > >
> > > > > > > last known good kernel is : vmlinuz-5.5.0-00849-gb0be0eff1a5a
> > > > > > > first known bad kernel is : vmlinuz-5.5.0-01154-gc677124e631d
> > > > > >
> > > > > > It would be lovely if you can bisect a bit. But my merges in that
> > > > > > range are all from Ingo:
> > > > > >
> > > > > > Ingo Molnar (7):
> > > > > > header cleanup
> > > > > > objtool updates
> > > > > > RCU updates
> > > > > > EFI updates
> > > > > > locking updates
> > > > > > perf updates
> > > > > > scheduler updates
> > > > >
> > > > > If I had to guess then perhaps the EFI changes look the most dangerous
> > > > > ones from these trees - but in principle most of these trees could
> > > > > contain a boot crasher/hang bug.
> > > > >
> > > > > > but not having any messages at all makes it hard to guess where it
> > > > > > would be.
> > > > >
> > > > > To improve debug output:
> > > > >
> > > > > Removing any 'fbcon' options in /boot/grub/grub.cfg and adding this to
> > > > > the boot options might improve the debug output:
> > > > >
> > > > > earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn
> > > > >
> > > > > So for example if the relevant kernel boot entry in grub.cfg looks like
> > > > > this:
> > > > >
> > > > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro splash $vt_handoff
> > > > >
> > > > > Then editing it to the following could in principle produce (much) more
> > > > > verbose boot output:
> > > > >
> > > > > linux /vmlinuz-5.3.0-26-generic root=UUID=1bcxabe3-0b62-4x04-b456-47cd90c0e6x4 ro earlyprintk=vga initcall_debug ignore_loglevel debug panic_on_warn $vt_handoff
> > > > >
> > > > > If this produces more output than just "loading initial ramdisk..' then a
> > > > > photo of the hung screen would be sufficient, no need to transcribe it.
> > > > >
> > > > > > A few bisect runs would narrow it down a fair amount. Bisecting all the
> > > > > > way would be even better, of course,
> > > > >
> > > > > Agreed!
> > > > >
> > > > > If compiling full kernels for bisections takes too long (for example
> > > > > because the .config is from a distro kernel) then running "make
> > > > > localmodconfig" to create a config tailored to the currently active
> > > > > modules will cut down significantly on build time.
> > > > >
> > > > > Also, a warning: if the normal boot log contains spurious warnings then
> > > > > the new 'panic_on_warn' option will cause additional trouble on good
> > > > > kernels.
> > > >
> > > > It's bisected.
> > > > The first bad commit is :
> > > > 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> > > > efi: Add tracking for dynamically allocated memmaps
> > >
> > > Thanks a ton, that's very useful!
> > >
> > > I've Cc:-ed the EFI gents who are developing this code, maybe they'll
> > > spot the bug.
> >
> > I'll take a look. Jörg, can you paste a full dmesg from a good boot?
>
> Here it is.
>

Hello Jörg,

Could you please try whether the change below fixes the issue?

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 59f7f6d60cf6..ae923ee8e2b4 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -308,7 +308,7 @@ static void __init efi_clean_memmap(void)
.phys_map = efi.memmap.phys_map,
.desc_version = efi.memmap.desc_version,
.desc_size = efi.memmap.desc_size,
- .size = data.desc_size * (efi.memmap.nr_map -
n_removal),
+ .size = efi.memmap.desc_size *
(efi.memmap.nr_map - n_removal),
.flags = 0,
};

2020-02-01 16:24:59

by Dan Williams

[permalink] [raw]
Subject: Re: [TEST PATCH RFC] Revert the EFI leak fixes for now (was: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot))

On Sat, Feb 1, 2020 at 1:45 AM Ingo Molnar <[email protected]> wrote:
>
>
>
> Jörg Otte wrote:
>
> > It's bisected.
> > The first bad commit is :
> > 1db91035d01aa8bfa2350c00ccb63d629b4041ad
> > efi: Add tracking for dynamically allocated memmaps
>
> > Unfortunately I can not revert because of compile errors!
> >
> > In file included from /media/jojo/deftoshiba/kernel/linux/init/main.c:48:
> > /media/jojo/deftoshiba/kernel/linux/include/linux/efi.h:975:1: error:
> > version control conflict marker in file
> > <<<<<<< HEAD
>
> So 1db91035d0 doesn't revert cleanly, because 484a418d0754 depends on it,
> plus there a third commit (f0ef6523475f) that has a semantic dependency
> on 1db91035d01a.
>
> But you can revert them all, if done in reverse chronological order:
>
> git revert f0ef6523475f # ("efi: Fix efi_memmap_alloc() leaks")
> git revert 484a418d0754 # ("efi: Fix handling of multiple efi_fake_mem= entries")
> git revert 1db91035d01a # ("efi: Add tracking for dynamically allocated memmaps")
>
> You can paste those three lines into a shell as-is, or you can apply the
> patch below which is a combination of these three reverts.
>
> Jörg, can you confirm that doing these reverts fixes booting on your
> system? If it does then a full dmesg from that kernel would be useful.
>
> FWIW I reviewed the bisected commit and didn't find the bug but I also
> couldn't convince myself it's a 1:1 identity transformation and cleanup
> of the existing logic.
>
> The size arithmethics transformation looks correct at first sight, but
> the data->flags handling in particular looks rather interwoven.

Agreed, but the only flags change that I couldn't convince myself was
correct is this:

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 59f7f6d60cf6..314b36ac2a08 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -309,7 +309,7 @@ static void __init efi_clean_memmap(void)
.desc_version = efi.memmap.desc_version,
.desc_size = efi.memmap.desc_size,
.size = data.desc_size * (efi.memmap.nr_map -
n_removal),
- .flags = 0,
+ .flags = efi.memmap.flags & EFI_MEMMAP_LATE,
};

pr_warn("Removing %d invalid memory map entries.\n", n_removal);

...but efi_clean_memmap() should "early".

2020-02-01 21:31:50

by Dan Williams

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sat, Feb 1, 2020 at 7:35 AM Jörg Otte <[email protected]> wrote:
[..]
> > I'll take a look. Jörg, can you paste a full dmesg from a good boot?
>
> Here it is.

Much appreciated, I found an old Haswell laptop and am able to
reproduce the boot hang.

2020-02-01 21:45:34

by Dan Williams

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Fri, Jan 31, 2020 at 10:45 AM Ard Biesheuvel
<[email protected]> wrote:
>
> earlycon=efifb may help to get better diagnostics, but you will need to use a camera to capture the output

https://photos.app.goo.gl/uA3Wkaxc8x6A4gK47

2020-02-01 21:51:45

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sat, 1 Feb 2020 at 22:44, Dan Williams <[email protected]> wrote:
>
> On Fri, Jan 31, 2020 at 10:45 AM Ard Biesheuvel
> <[email protected]> wrote:
> >
> > earlycon=efifb may help to get better diagnostics, but you will need to use a camera to capture the output
>
> https://photos.app.goo.gl/uA3Wkaxc8x6A4gK47

Yeah, so it definitely has to do with the n_removal > 0 path.

Did you try the change I suggested to Joerg?

2020-02-01 22:32:37

by Dan Williams

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sat, Feb 1, 2020 at 1:49 PM Ard Biesheuvel <[email protected]> wrote:
>
> On Sat, 1 Feb 2020 at 22:44, Dan Williams <[email protected]> wrote:
> >
> > On Fri, Jan 31, 2020 at 10:45 AM Ard Biesheuvel
> > <[email protected]> wrote:
> > >
> > > earlycon=efifb may help to get better diagnostics, but you will need to use a camera to capture the output
> >
> > https://photos.app.goo.gl/uA3Wkaxc8x6A4gK47
>
> Yeah, so it definitely has to do with the n_removal > 0 path.
>
> Did you try the change I suggested to Joerg?

Nice, it worked.

Tested-by: Dan Williams <[email protected]>

2020-02-01 22:37:31

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sat, 1 Feb 2020 at 23:31, Dan Williams <[email protected]> wrote:
>
> On Sat, Feb 1, 2020 at 1:49 PM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Sat, 1 Feb 2020 at 22:44, Dan Williams <[email protected]> wrote:
> > >
> > > On Fri, Jan 31, 2020 at 10:45 AM Ard Biesheuvel
> > > <[email protected]> wrote:
> > > >
> > > > earlycon=efifb may help to get better diagnostics, but you will need to use a camera to capture the output
> > >
> > > https://photos.app.goo.gl/uA3Wkaxc8x6A4gK47
> >
> > Yeah, so it definitely has to do with the n_removal > 0 path.
> >
> > Did you try the change I suggested to Joerg?
>
> Nice, it worked.
>
> Tested-by: Dan Williams <[email protected]>

Thanks for confirming. I'll spin a proper patch.

2020-02-02 09:25:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)


* Ard Biesheuvel <[email protected]> wrote:

> Hello J?rg,
>
> Could you please try whether the change below fixes the issue?
>
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 59f7f6d60cf6..ae923ee8e2b4 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -308,7 +308,7 @@ static void __init efi_clean_memmap(void)
> .phys_map = efi.memmap.phys_map,
> .desc_version = efi.memmap.desc_version,
> .desc_size = efi.memmap.desc_size,
> - .size = data.desc_size * (efi.memmap.nr_map - n_removal),
> + .size = efi.memmap.desc_size * (efi.memmap.nr_map - n_removal),
> .flags = 0,

Oh, I actually noticed this one, but convinced myself that it's correct,
because GCC didn't warn about uninitialized data.

But maybe in this weird case data.desc_size as used within its own
initializer is zero?

Thanks,

Ingo

2020-02-02 09:35:47

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

On Sun, 2 Feb 2020 at 10:22, Ingo Molnar <[email protected]> wrote:
>
>
> * Ard Biesheuvel <[email protected]> wrote:
>
> > Hello Jörg,
> >
> > Could you please try whether the change below fixes the issue?
> >
> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 59f7f6d60cf6..ae923ee8e2b4 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -308,7 +308,7 @@ static void __init efi_clean_memmap(void)
> > .phys_map = efi.memmap.phys_map,
> > .desc_version = efi.memmap.desc_version,
> > .desc_size = efi.memmap.desc_size,
> > - .size = data.desc_size * (efi.memmap.nr_map - n_removal),
> > + .size = efi.memmap.desc_size * (efi.memmap.nr_map - n_removal),
> > .flags = 0,
>
> Oh, I actually noticed this one, but convinced myself that it's correct,
> because GCC didn't warn about uninitialized data.
>
> But maybe in this weird case data.desc_size as used within its own
> initializer is zero?
>

Something like that, yes. Note that size and desc_size appear in
opposite order in the struct definition, and this may also affect how
the compiler handles this.

2020-02-02 11:12:34

by Jörg Otte

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)

Am So., 2. Feb. 2020 um 10:22 Uhr schrieb Ingo Molnar <[email protected]>:
>
>
> * Ard Biesheuvel <[email protected]> wrote:
>
> > Hello Jörg,
> >
> > Could you please try whether the change below fixes the issue?
> >
> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 59f7f6d60cf6..ae923ee8e2b4 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -308,7 +308,7 @@ static void __init efi_clean_memmap(void)
> > .phys_map = efi.memmap.phys_map,
> > .desc_version = efi.memmap.desc_version,
> > .desc_size = efi.memmap.desc_size,
> > - .size = data.desc_size * (efi.memmap.nr_map - n_removal),
> > + .size = efi.memmap.desc_size * (efi.memmap.nr_map - n_removal),
> > .flags = 0,
>
> Oh, I actually noticed this one, but convinced myself that it's correct,
> because GCC didn't warn about uninitialized data.
>
> But maybe in this weird case data.desc_size as used within its own
> initializer is zero?
>
Patch makes my kernel booting again :)

Thanks, Jörg

2020-02-03 09:36:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: EFI boot crash regression (was: Re: 5.6-### doesn't boot)


* J?rg Otte <[email protected]> wrote:

> Am So., 2. Feb. 2020 um 10:22 Uhr schrieb Ingo Molnar <[email protected]>:
> >
> >
> > * Ard Biesheuvel <[email protected]> wrote:
> >
> > > Hello J?rg,
> > >
> > > Could you please try whether the change below fixes the issue?
> > >
> > > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > > index 59f7f6d60cf6..ae923ee8e2b4 100644
> > > --- a/arch/x86/platform/efi/efi.c
> > > +++ b/arch/x86/platform/efi/efi.c
> > > @@ -308,7 +308,7 @@ static void __init efi_clean_memmap(void)
> > > .phys_map = efi.memmap.phys_map,
> > > .desc_version = efi.memmap.desc_version,
> > > .desc_size = efi.memmap.desc_size,
> > > - .size = data.desc_size * (efi.memmap.nr_map - n_removal),
> > > + .size = efi.memmap.desc_size * (efi.memmap.nr_map - n_removal),
> > > .flags = 0,
> >
> > Oh, I actually noticed this one, but convinced myself that it's correct,
> > because GCC didn't warn about uninitialized data.
> >
> > But maybe in this weird case data.desc_size as used within its own
> > initializer is zero?
>
> Patch makes my kernel booting again :)

Thank you! I'll send the fix to Linus later today.

Ingo