2019-01-07 20:06:29

by Wei Huang

Subject: Re: [PATCH 1/1] x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline code before returning to long mode

[adding lkml and linux-x86_64]

On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
> On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
>> In some old AMD KVM implementations, the guest's EFER.LME bit is cleared by
>> KVM when the hypervisor detects that the guest sets CR0.PG to 0. This causes
>> the guest OS to reboot when it tries to return from the 32-bit trampoline
>> code, because the CPU is in an incorrect state: CR4.PAE=1, CR0.PG=1,
>> CS.L=1, but EFER.LME=0. As a precaution, this patch sets EFER.LME=1 as part
>> of the long mode activation procedure. This extra step won't cause any harm
>> when Linux is booting on a bare-metal machine.
>>
>> Signed-off-by: Wei Huang <[email protected]>
>
> Thanks for tracking this down.

BTW I think this patch _might_ be related to the recent reboot issue
reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
exactly the same.

>
> Acked-by: Kirill A. Shutemov <[email protected]>
> Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage")
>


2019-01-07 21:55:41

by Benjamin Gilbert

Subject: Re: [PATCH 1/1] x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline code before returning to long mode

On Mon, Jan 07, 2019 at 02:03:15PM -0600, Wei Huang wrote:
> On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
> > On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
> >> In some old AMD KVM implementations, the guest's EFER.LME bit is cleared by
> >> KVM when the hypervisor detects that the guest sets CR0.PG to 0. This causes
> >> the guest OS to reboot when it tries to return from the 32-bit trampoline
> >> code, because the CPU is in an incorrect state: CR4.PAE=1, CR0.PG=1,
> >> CS.L=1, but EFER.LME=0. As a precaution, this patch sets EFER.LME=1 as part
> >> of the long mode activation procedure. This extra step won't cause any harm
> >> when Linux is booting on a bare-metal machine.
> >>
> >> Signed-off-by: Wei Huang <[email protected]>
> >
> > Thanks for tracking this down.
>
> BTW I think this patch _might_ be related to the recent reboot issue
> reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
> exactly the same.

The problem in that case turned out to be https://lkml.org/lkml/2018/7/4/723
which was fixed by d503ac531a.

--Benjamin Gilbert

2019-01-07 22:03:48

by Wei Huang

Subject: Re: [PATCH 1/1] x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline code before returning to long mode



On 1/7/19 3:53 PM, Benjamin Gilbert wrote:
> On Mon, Jan 07, 2019 at 02:03:15PM -0600, Wei Huang wrote:
>> On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
>>> On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
>>>> In some old AMD KVM implementations, the guest's EFER.LME bit is cleared by
>>>> KVM when the hypervisor detects that the guest sets CR0.PG to 0. This causes
>>>> the guest OS to reboot when it tries to return from the 32-bit trampoline
>>>> code, because the CPU is in an incorrect state: CR4.PAE=1, CR0.PG=1,
>>>> CS.L=1, but EFER.LME=0. As a precaution, this patch sets EFER.LME=1 as part
>>>> of the long mode activation procedure. This extra step won't cause any harm
>>>> when Linux is booting on a bare-metal machine.
>>>>
>>>> Signed-off-by: Wei Huang <[email protected]>
>>>
>>> Thanks for tracking this down.
>>
>> BTW I think this patch _might_ be related to the recent reboot issue
>> reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
>> exactly the same.
>
> The problem in that case turned out to be https://lkml.org/lkml/2018/7/4/723
> which was fixed by d503ac531a.

OK, then it is a different problem. As for this specific patch: without
it, the latest kernel can't boot as a guest VM on RHEL6 (and other distros
with old KVM) on an AMD box.
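
For readers following along, the failure mode from the patch description can
be sketched as a toy C model. This is not kernel or KVM code: the struct, the
toy_* function names, and the clears_lme_on_pg_off flag are hypothetical and
exist only to illustrate the state check (CR4.PAE=1, CR0.PG=1, EFER.LME=1)
and why setting EFER.LME again before re-enabling paging should be harmless
where the bit was never cleared. Only the bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions on x86-64. */
#define X86_CR0_PG   (1ULL << 31)  /* CR0.PG: paging enable */
#define X86_CR4_PAE  (1ULL << 5)   /* CR4.PAE: physical address extension */
#define EFER_LME     (1ULL << 8)   /* EFER.LME: long mode enable */

/* Hypothetical toy vCPU, for illustration only. */
struct toy_vcpu {
    uint64_t cr0;
    uint64_t cr4;
    uint64_t efer;
    bool clears_lme_on_pg_off;  /* assumption: models the old AMD KVM behavior */
};

/* Toy CR0 write: with the quirk enabled, a PG 1->0 transition drops EFER.LME. */
static void toy_write_cr0(struct toy_vcpu *v, uint64_t val)
{
    bool pg_cleared = (v->cr0 & X86_CR0_PG) && !(val & X86_CR0_PG);

    if (v->clears_lme_on_pg_off && pg_cleared)
        v->efer &= ~EFER_LME;
    v->cr0 = val;
}

/* Returning to long mode needs PAE, PG and LME all set. */
static bool toy_long_mode_ok(const struct toy_vcpu *v)
{
    return (v->cr4 & X86_CR4_PAE) &&
           (v->cr0 & X86_CR0_PG) &&
           (v->efer & EFER_LME);
}

/*
 * Toy version of the trampoline sequence: disable paging, optionally
 * set EFER.LME again (the precaution the patch adds), re-enable paging.
 */
static void toy_trampoline(struct toy_vcpu *v, bool set_lme)
{
    toy_write_cr0(v, v->cr0 & ~X86_CR0_PG);
    if (set_lme)
        v->efer |= EFER_LME;
    toy_write_cr0(v, v->cr0 | X86_CR0_PG);
}
```

In this model, calling toy_trampoline() with set_lme=false on a vCPU that has
the quirk leaves toy_long_mode_ok() false, which corresponds to the reboot
described above. With set_lme=true the check passes whether or not the quirk
is present, since OR-ing in a bit that is already set changes nothing.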

>
> --Benjamin Gilbert
>