[adding lkml and linux-x86_64]
On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
> On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
>> In some old AMD KVM implementation, guest's EFER.LME bit is cleared by KVM
>> when the hypervisor detects guest sets CR0.PG to 0. This causes guest OS
>> to reboot when it tries to return from 32-bit trampoline code because CPU
>> is in incorrect state: CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0.
>> As a precaution, this patch sets EFER.LME=1 as part of long mode
>> activation procedure. This extra step won't cause any harm when Linux is
>> booting on bare-metal machine.
>>
>> Signed-off-by: Wei Huang <[email protected]>
>
> Thanks for tracking this down.

BTW I think this patch _might_ be related to the recent reboot issue
reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
exactly the same.

>
> Acked-by: Kirill A. Shutemov <[email protected]>
> Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage")
>
On Mon, Jan 07, 2019 at 02:03:15PM -0600, Wei Huang wrote:
> On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
> > On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
> >> In some old AMD KVM implementation, guest's EFER.LME bit is cleared by KVM
> >> when the hypervisor detects guest sets CR0.PG to 0. This causes guest OS
> >> to reboot when it tries to return from 32-bit trampoline code because CPU
> >> is in incorrect state: CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0.
> >> As a precaution, this patch sets EFER.LME=1 as part of long mode
> >> activation procedure. This extra step won't cause any harm when Linux is
> >> booting on bare-metal machine.
> >>
> >> Signed-off-by: Wei Huang <[email protected]>
> >
> > Thanks for tracking this down.
>
> BTW I think this patch _might_ be related to the recent reboot issue
> reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
> exactly the same.

The problem in that case turned out to be https://lkml.org/lkml/2018/7/4/723
which was fixed by d503ac531a.

--Benjamin Gilbert

On 1/7/19 3:53 PM, Benjamin Gilbert wrote:
> On Mon, Jan 07, 2019 at 02:03:15PM -0600, Wei Huang wrote:
>> On 1/7/19 2:25 AM, Kirill A. Shutemov wrote:
>>> On Fri, Jan 04, 2019 at 05:44:11AM +0000, Wei Huang wrote:
>>>> In some old AMD KVM implementation, guest's EFER.LME bit is cleared by KVM
>>>> when the hypervisor detects guest sets CR0.PG to 0. This causes guest OS
>>>> to reboot when it tries to return from 32-bit trampoline code because CPU
>>>> is in incorrect state: CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0.
>>>> As a precaution, this patch sets EFER.LME=1 as part of long mode
>>>> activation procedure. This extra step won't cause any harm when Linux is
>>>> booting on bare-metal machine.
>>>>
>>>> Signed-off-by: Wei Huang <[email protected]>
>>>
>>> Thanks for tracking this down.
>>
>> BTW I think this patch _might_ be related to the recent reboot issue
>> reported in https://lkml.org/lkml/2018/7/1/836 since the symptoms are
>> exactly the same.
>
> The problem in that case turned out to be https://lkml.org/lkml/2018/7/4/723
> which was fixed by d503ac531a.

OK, then it is a different problem. As for this specific patch: without it,
the latest kernel can't boot as a guest VM on RHEL6 (and other old KVM
distros) on an AMD box.

>
> --Benjamin Gilbert
>
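
For readers following the thread, the inconsistent state from the patch
description (CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0) can be modeled
with a small user-space C sketch. This is not the patch itself, which
lives in the 32-bit trampoline code. The bit positions are the
architectural ones; struct cpu_state, state_is_consistent() and set_lme()
are hypothetical names used only for illustration, and the check is a
deliberate simplification of what the CPU or hypervisor actually enforces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions (names mirror the kernel's, values are fixed by the ISA). */
#define X86_CR0_PG   (1u << 31)   /* CR0.PG: paging enabled */
#define X86_CR4_PAE  (1u << 5)    /* CR4.PAE: physical address extension */
#define EFER_LME     (1ull << 8)  /* EFER.LME: long mode enable */

/* Hypothetical snapshot of the state discussed in the patch description. */
struct cpu_state {
    uint32_t cr0;
    uint32_t cr4;
    uint64_t efer;
    bool     cs_l;   /* CS.L: running from a 64-bit code segment */
};

/*
 * Simplified model: returning to a 64-bit code segment with paging on
 * only makes sense if both CR4.PAE and EFER.LME are set.
 */
bool state_is_consistent(const struct cpu_state *s)
{
    bool paging = (s->cr0 & X86_CR0_PG) != 0;
    bool pae    = (s->cr4 & X86_CR4_PAE) != 0;
    bool lme    = (s->efer & EFER_LME) != 0;

    if (s->cs_l && paging)
        return pae && lme;
    return true;
}

/* The precaution described in the patch: set EFER.LME unconditionally. */
void set_lme(struct cpu_state *s)
{
    s->efer |= EFER_LME;
}
```

In this model, a state where the hypervisor has cleared EFER.LME is
reported as inconsistent, and becomes consistent again once set_lme() is
applied. Setting the bit when it is already set changes nothing, which
matches the claim that the extra step is harmless on bare metal.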