2023-05-27 18:36:31

by Akihiro Suda

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

[Resending as a plain text email]

It turned out that this is a mix of an ACPICA issue and an EFISTUB issue.

Kernel v6.2 boots after reverting *both* of the following two commits:
- 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 "ACPICA: Events: Support
fixed PCIe wake event"
- e346bebbd36b1576a3335331fed61bb48c6d8823 "efi: libstub: Always
enable initrd command line loader and bump version"

Kernel v6.3 boots after reverting just e346bebb, as 5c62d5a has already
been reverted in 8e41e0a575664d26bb87e012c39435c4c3914ed9.
The situation is the same for v6.4-rc3 too.

Note that in my test I let Virtualization.framework directly load
bzImage without GRUB (akin to `qemu-system-x86_64 -kernel bzImage`).
Apparently, reverting e346bebb is not necessary for loading bzImage via GRUB.


> Also, the reporter can't provide a dmesg log (forgot to attach a serial console?).

I uploaded the v6.1 dmesg to the bugzilla.
The v6.2 dmesg can't be provided, as the kernel hangs before printing
anything to console=hvc0.
(IIUC, console=ttyS0 (RS-232C) is not implemented in Virtualization.framework.)


> On Thu, May 25, 2023 at 21:46, Bagas Sanjaya <[email protected]> wrote:
>>
>> Hi,
>>
>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>
>> > Linux kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64).
>> >
>> > It is reported that the issue is not reproducible on ARM64: https://github.com/lima-vm/lima/issues/1577#issuecomment-1561577694
>> >
>> >
>> > ## Reproduction
>> > - Check out the kernel repo, and run `make defconfig bzImage`.
>> >
>> > - Create an initrd (see the attached `initrd-example.txt`)
>> >
>> > - Transfer the bzImage and initrd to an Intel Mac.
>> >
>> > - On Mac, download `RunningLinuxInAVirtualMachine.zip` from https://developer.apple.com/documentation/virtualization/running_linux_in_a_virtual_machine , and build the `LinuxVirtualMachine` binary with Xcode.
>> > Building this binary with Xcode requires logging in to Apple.
>> > If you do not like logging in, a third party equivalent such as https://github.com/Code-Hex/vz/blob/v3.0.6/example/linux/main.go can be used.
>> >
>> > - Run `LinuxVirtualMachine /tmp/bzImage /tmp/initrd.img`.
>> > v6.1 successfully boots into the busybox shell.
>> > v6.2 just hangs before printing anything to the console.
>> >
>> >
>> > ## Tested versions
>> > ```
>> > v6.1: OK
>> > ...
>> > v6.1.0-rc2-00002-g60f2096b59bc (included in v6.2-rc1): OK
>> > v6.1.0-rc2-00003-g5c62d5aab875 (included in v6.2-rc1): NG <-- This commit caused a regression
>> > ...
>> > v6.2-rc1: NG
>> > ...
>> > v6.2: NG
>> > ...
>> > v6.3.0-rc7-00181-g8e41e0a57566 (included in v6.3): NG <-- Reverts 5c62d5aab875 but still NG
>> > ...
>> > v6.3: NG
>> > v6.4-rc3: NG
>> > ```
>> >
>> > Tested on MacBookPro 2020 (Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz) running macOS 13.4.
>> >
>> >
>> > The issue seems to be a regression in [5c62d5aab8752e5ee7bfbe75ed6060db1c787f98](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c62d5aab8752e5ee7bfbe75ed6060db1c787f98) "ACPICA: Events: Support fixed PCIe wake event".
>> >
>> > This commit was introduced in v6.2-rc1, and apparently reverted in v6.3 ([8e41e0a575664d26bb87e012c39435c4c3914ed9](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e41e0a575664d26bb87e012c39435c4c3914ed9)).
>> > However, v6.3 and the latest v6.4-rc3 still don't boot.
>>
>> See bugzilla for the full thread.
>>
>> Interestingly, this regression still occurs even though the culprit was
>> reverted in 8e41e0a575664d ("Revert "ACPICA: Events: Support fixed
>> PCIe wake event""), so this (obviously) isn't a wake-on-LAN regression,
>> but rather an early boot one.
>>
>> Also, the reporter can't provide a dmesg log (forgot to attach a serial
>> console?).
>>
>> Anyway, I'm adding it to regzbot:
>>
>> #regzbot introduced: 5c62d5aab8752e https://bugzilla.kernel.org/show_bug.cgi?id=217485
>> #regzbot title: Linux v6.2+ (x86_64) no longer boots on Apple's Virtualization framework (ACPICA issue)
>>
>> Thanks.
>>
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217485
>>
>> --
>> An old man doll... just what I always wanted! - Clara


2023-05-27 18:37:57

by Ard Biesheuvel

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

On Sat, 27 May 2023 at 20:00, Akihiro Suda <[email protected]> wrote:
>
> [Resending as a plain text email]
>
> It turned out that this is a mix of an ACPICA issue and an EFISTUB issue.
>
> Kernel v6.2 boots after reverting *both* of the following two commits:
> - 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 "ACPICA: Events: Support
> fixed PCIe wake event"
> - e346bebbd36b1576a3335331fed61bb48c6d8823 "efi: libstub: Always
> enable initrd command line loader and bump version"
>
> Kernel v6.3 boots after reverting just e346bebb, as 5c62d5a has already
> been reverted in 8e41e0a575664d26bb87e012c39435c4c3914ed9.
> The situation is the same for v6.4-rc3 too.
>
> Note that in my test I let Virtualization.framework directly load
> bzImage without GRUB (akin to `qemu-system-x86_64 -kernel bzImage`).
> Apparently, reverting e346bebb is not necessary for loading bzImage via GRUB.
>

Are you using OVMF? Which versions of qemu and OVMF are you using?

2023-05-27 18:55:53

by Akihiro Suda

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

> Are you using OVMF? Which versions of qemu and OVMF are you using?

I'm using Apple's Virtualization.framework, not QEMU.

It doesn't use UEFI when it directly loads bzImage.
( dmesg: https://bugzilla.kernel.org/attachment.cgi?id=304323 )

Despite that, it still expects LINUX_EFISTUB_MINOR_VERSION
(include/linux/pe.h), which is referenced from arch/x86/boot/header.S,
to be 0x0.
I confirmed that the kernel can boot by just setting
LINUX_EFISTUB_MINOR_VERSION to 0x0.

Would it be possible to revert the LINUX_EFISTUB_MINOR_VERSION value
(not the actual code) to 0x0?
Or will it break something else?

Anyway, I'll try to make a request to Apple to remove the
LINUX_EFISTUB_MINOR_VERSION check.
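
For anyone who wants to check which version a given bzImage advertises: the value ends up in the MajorImageVersion/MinorImageVersion fields of the PE/COFF optional header, which sit at byte offsets 44 and 46 of that header for both PE32 and PE32+. Below is a minimal Python sketch of reading them. It is an illustration only, not code from the kernel or from Virtualization.framework; `read_pe_image_version` and `make_fake_image` are hypothetical helper names, and the synthetic buffer merely stands in for a real image.

```python
import struct


def read_pe_image_version(image):
    """Return (MajorImageVersion, MinorImageVersion) from a PE/COFF image."""
    if image[:2] != b"MZ":
        raise ValueError("no MZ signature")
    # e_lfanew at offset 0x3c points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("no PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    major, minor = struct.unpack_from("<HH", image, optional_header + 44)
    return major, minor


def make_fake_image(major, minor):
    """Build a tiny synthetic buffer with just enough PE structure (test aid)."""
    buf = bytearray(0x200)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)
    buf[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<HH", buf, 0x80 + 4 + 20 + 44, major, minor)
    return bytes(buf)


print(read_pe_image_version(make_fake_image(1, 1)))
```

On a real kernel one would pass the bytes of the bzImage file instead of the synthetic buffer, assuming the image was built with the EFI stub enabled, since otherwise there is no PE header to read.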

2023-05-27 18:56:08

by Ard Biesheuvel

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

On Sat, 27 May 2023 at 20:34, Akihiro Suda <[email protected]> wrote:
>
> > Are you using OVMF? Which versions of qemu and OVMF are you using?
>
> I'm using Apple's Virtualization.framework, not QEMU.
>
> It doesn't use UEFI when it directly loads bzImage.
> ( dmesg: https://bugzilla.kernel.org/attachment.cgi?id=304323 )
>
> Despite that, it still expects LINUX_EFISTUB_MINOR_VERSION
> (include/linux/pe.h), which is referenced from arch/x86/boot/header.S,
> to be 0x0.
> I confirmed that the kernel can boot by just setting
> LINUX_EFISTUB_MINOR_VERSION to 0x0.
>

Thanks for checking that, that is very helpful.

> Would it be possible to revert the LINUX_EFISTUB_MINOR_VERSION value
> (not the actual code) to 0x0?
> Or will it break something else?
>
> Anyway, I'll try to make a request to Apple to remove the
> LINUX_EFISTUB_MINOR_VERSION check.
>

Yes, that makes the most sense. If the existing virtual machine BIOS
has a hardcoded check that the EFI stub version is 1.0 even if it does
not boot via EFI to begin with, I don't see how we can reasonably
treat this as a regression that needs fixing on the Linux side.

The version bump to PE image version v1.1 sets a baseline across all
Linux architectures that can boot via EFI: initrd loading is
supported via the command line as well as via the LoadFile2 protocol.
Reverting that would substantially reduce the value of having this
identification embedded into the image.

2023-05-27 20:09:38

by Linus Torvalds

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

On Sat, May 27, 2023 at 11:42 AM Ard Biesheuvel <[email protected]> wrote:
>
> Yes, that makes the most sense. If the existing virtual machine BIOS
> has a hardcoded check that the EFI stub version is 1.0 even if it does
> not boot via EFI to begin with, I don't see how we can reasonably
> treat this as a regression that needs fixing on the Linux side.

Well, we consider firmware issues to be the same as any hardware
issue. If firmware has a bug that requires us to do things certain
ways, that's really no different from hardware that requires some
insane init sequence.

So why not just say that LINUX_EFISTUB_MINOR_VERSION should be 0, and
just add the comment that versioning doesn't work?

I'm not sure why this was tied into always enabling the initrd command
line loader.

Numbered version checks are a fundamentally broken and stupid concept
anyway. Don't do them. Just leave it at zero, and maybe some day there
is a sane model that actually has a bitfield of capabilities and
requirements.

Linus

2023-05-27 22:29:29

by Ard Biesheuvel

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

On Sat, 27 May 2023 at 21:40, Linus Torvalds
<[email protected]> wrote:
>
> On Sat, May 27, 2023 at 11:42 AM Ard Biesheuvel <[email protected]> wrote:
> >
> > Yes, that makes the most sense. If the existing virtual machine BIOS
> > has a hardcoded check that the EFI stub version is 1.0 even if it does
> > not boot via EFI to begin with, I don't see how we can reasonably
> > treat this as a regression that needs fixing on the Linux side.
>
> Well, we consider firmware issues to be the same as any hardware
> issue. If firmware has a bug that requires us to do things certain
> ways, that's really no different from hardware that requires some
> insane init sequence.
>
> So why not just say that LINUX_EFISTUB_MINOR_VERSION should be 0, and
> just add the comment that versioning doesn't work?
>

Fair enough. Or we could try bumping it from v1.1 to v2.0 (or v3.0 if
we make it a bit mask).

Akihiro, would you mind checking if changing the major/minor to any of
these values results in the same problem?

Unfortunately, the only data point we have is that a non-EFI
bootloader (which is unlikely to carry a PE/COFF loader) needs the
byte at that specific offset to be 0x0, and we really have no idea
why, or whether we could hit this in other ways (e.g., by changing the
PE/COFF header to comply with new MS requirements for secure boot,
which is another thing that is in progress).

> I'm not sure why this was tied into always enabling the initrd command
> line loader.
>

For x86, it doesn't actually make a difference, but on other
architectures, the command line initrd= loader could previously be
disabled, and that possibility was removed. The idea was that by
bumping the version to v1.1 at the same time, generic EFI loaders would
be able to identify this capability without arch specific conditionals
in the logic.

Currently, GRUB and systemd-stub check this version field, but only
for v1.0 or higher. Upstream GRUB switched to this generic version of
the EFI loader just this week, but does not actually use initrd= at
all for EFI boot (on any architecture).

> Numbered version checks are a fundamentally broken and stupid concept
> anyway. Don't do them. Just leave it at zero, and maybe some day there
> is a sane model that actually has a bitfield of capabilities and
> requirements.
>

Yeah, maybe you're right. Currently, only a single feature is tied to
LINUX_EFISTUB_MAJOR_VERSION==1 (LoadFile2 support for initrd loading),
and this PE/COFF version field has no meaning to UEFI firmware itself,
so we could simply treat these fields as bit masks if we wanted to
(and setting the initrd command line loader bit for x86 would be
redundant anyway).

But not being able to freely set such a bit because some rarely used
non-EFI BIOS implementation imposes requirements on the contents of
the EFI specific image header is rather disappointing.
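
To make the two models discussed above concrete, here is a rough Python sketch contrasting a numbered version check with a capability bit mask. The bit assignments and the names `CAP_INITRD_LOADFILE2`, `CAP_INITRD_CMDLINE`, `version_at_least` and `has_caps` are made up for illustration; nothing here is defined by the kernel, GRUB or systemd-stub.

```python
# Hypothetical capability bits; these assignments are illustrative only.
CAP_INITRD_LOADFILE2 = 1 << 0  # initrd loading via the LoadFile2 protocol
CAP_INITRD_CMDLINE = 1 << 1    # initrd loading via initrd= on the command line


def version_at_least(version, minimum):
    """Numbered check: compare (major, minor) tuples lexicographically."""
    return tuple(version) >= tuple(minimum)


def has_caps(advertised, needed):
    """Bit mask check: every bit in `needed` must be set in `advertised`."""
    return (advertised & needed) == needed


# A loader that only depends on LoadFile2 is satisfied in both models,
# whether or not the image also advertises the command line loader.
print(version_at_least((1, 1), (1, 0)))
print(has_caps(CAP_INITRD_LOADFILE2 | CAP_INITRD_CMDLINE, CAP_INITRD_LOADFILE2))
```

In the bit mask model a loader tests only the bits it depends on, so advertising an additional feature does not change the outcome of existing checks. A loader that tests a numbered field for exact equality breaks on any bump.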

2023-05-28 07:29:07

by Akihiro Suda

Subject: Re: mix of ACPICA regression and EFISTUB regression (Was: kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64); likely to be related to ACPICA)

> Fair enough. Or we could try bumping it from v1.1 to v2.0 (or v3.0 if
> we make it a bit mask).
>
> Akihiro, would you mind checking if changing the major/minor to any of
> these values results in the same problem?

Surprisingly, v2.0 and v3.0 boot, although v1.1, v2.1, v2.2, v3.1,
etc. do not boot.

Looks like Apple's vmlinuz loader only requires
LINUX_EFISTUB_MINOR_VERSION to be 0x0
and does not care about LINUX_EFISTUB_MAJOR_VERSION.
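
As a small sanity check of that reading, the Python sketch below encodes the combinations tested so far and compares them against the rule "minor must be zero, major is ignored". `loader_accepts` is a hypothetical stand-in for Apple's check, whose actual logic is not known; v1.0 is included on the basis of the earlier test with LINUX_EFISTUB_MINOR_VERSION set to 0x0.

```python
# (major, minor) -> whether the kernel booted, as reported in this thread.
OBSERVED = {
    (1, 0): True,
    (1, 1): False,
    (2, 0): True,
    (2, 1): False,
    (2, 2): False,
    (3, 0): True,
    (3, 1): False,
}


def loader_accepts(major, minor):
    """Hypothesis only: the loader ignores major and requires minor == 0."""
    return minor == 0


# Collect every tested version where the hypothesis disagrees with the result.
mismatches = [v for v, booted in OBSERVED.items() if loader_accepts(*v) != booted]
print("hypothesis matches all observations:", not mismatches)
```

This only shows that the rule is consistent with the data points above; it says nothing about why the loader looks at that field.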
