2017-12-17 09:13:13

by Andrew Randrianasulu

Subject: AMD erratum 665 on f15h processor?

Hello!

I was trying to find out why none of my old kernels boot on my
relatively new machine. Kernels 4.10+ boot fine (I use 4.14.3 right now),
but older kernels die early ...

After some digging I found this
https://patchwork.kernel.org/patch/9311567/

The patch talks about family 12h, but my machine has this CPU:

[ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
[ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.
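
For readers comparing "family 12h" in the patch with "family: 0x15" in the log: on AMD CPUs the displayed family is the CPUID base family plus the extended family. The sketch below shows that arithmetic. The function name is made up for illustration, and the input 0x00600f20 is an assumed example value chosen to decode to the numbers in the log above; it was not read from this machine.

```python
def decode_amd_signature(eax):
    """Split a CPUID Fn0000_0001 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    model = base_model
    if base_family == 0xF:
        # The extended fields only count once the base family is saturated at 0xF.
        family += ext_family
        model |= ext_model << 4
    return family, model, stepping


# Illustrative input only (assumption), not a value taken from the report.
family, model, stepping = decode_amd_signature(0x00600F20)
print(f"family: {family:#x}, model: {model:#x}, stepping: {stepping:#x}")
```

Under the same arithmetic, a family 12h part has base family 0xF and extended family 0x3, so the two families are not the same thing under different names.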


Because the fix is applied unconditionally, it probably helps me too, so please
don't remove it.

The failure log from qemu running kernel 4.2 is below.


.text : 0xc0100000 - 0xc046ceb7 (3507 kB)
Checking if this processor honours the WP bit even in supervisor mode...Ok.
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
Build-time adjustment of leaf fanout to 32.
RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=1
NR_IRQS:2304 nr_irqs:256 16
Console: colour VGA+ 80x60
console [tty0] enabled
clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
tsc: Fast TSC calibration failed
tsc: Unable to calibrate against PIT
tsc: HPET/PMTIMER calibration failed
tsc: Marking TSC unstable due to could not calculate TSC khz
Calibrating delay loop... 1253.37 BogoMIPS (lpj=2506752)
pid_max: default: 32768 minimum: 301
ACPI: Core revision 20150619
ACPI: All ACPI Tables successfully acquired
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
Initializing cgroup subsys net_cls
general protection fault: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486 #7
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
task: c05dba40 ti: c05d4000 task.ti: c05d4000
EIP: 0060:[<c010ec47>] EFLAGS: 00210202 CPU: 0
EIP is at cpu_has_amd_erratum+0x23/0xb2
EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c0470b2c
ESI: c0630d00 EDI: c0470b30 EBP: c05d5f24 ESP: c05d5f14
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: ffc77000 CR3: 006d2000 CR4: 00040690
Stack:
02008140 00000000 c0630d00 00000000 c05d5f70 c010f446 000000d0 c05d5f5c
c01e0571 00000010 0000001e 00000000 00000000 00000009 00000010 00000000
c0630d00 00000000 c05d5f70 c010d74d 00000020 c0630d00 c0630d8b c05d5f9c
Call Trace:
[<c010f446>] init_amd+0x4e8/0x662
[<c01e0571>] ? kmem_cache_alloc_trace+0xbe/0xc8
[<c010d74d>] ? get_cpu_cap+0x127/0x12c
[<c010d936>] identify_cpu+0x1e4/0x366
[<c01e044c>] ? kmem_cache_alloc+0x90/0xf7
[<c01c7869>] ? kmem_cache_create+0x118/0x15b
[<c063f1ea>] identify_boot_cpu+0x10/0x99
[<c018fb35>] ? __delayacct_tsk_init+0x15/0x28
[<c063f2a6>] check_bugs+0x9/0x39
[<c0638ae3>] start_kernel+0x3a3/0x3b3
[<c063854d>] ? set_init_arg+0x52/0x52
[<c06382b8>] i386_start_kernel+0x82/0x86
Code: e0 eb 5d c0 89 e5 5d c3 55 89 e5 57 56 89 c6 53 51 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 54 8b 40 2c f6 c4 02 74 4c b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 39 72 05 3b 5d f0 73 32
EIP: [<c010ec47>] cpu_has_amd_erratum+0x23/0xb2 SS:ESP 0068:c05d5f14
---[ end trace 8bfd5e6fa0a4fcb2 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Well, since this bug is apparently fixed and the fix has propagated to -stable,
it shouldn't concern me too much. But maybe someone will rearrange those checks
in the future and assume that only some old AMD CPUs were affected ... so I am
leaving this message.

qemu command line:
qemu-system-x86_64 -M q35 -enable-kvm -cdrom /dev/shm/slax_16_12_2017_test.iso -m 512 -soundhw es1370 -cpu host -device sga -curses

-cpu host is really important here. I blindly used VGA mode 6 (vga=6) to fit as
much output as possible on the screen.


2017-12-17 20:52:13

by Borislav Petkov

Subject: Re: AMD erratum 665 on f15h processor?

On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> Hello!
>
> I was trying to investigate why all my old kernels can't be booted on my
> relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right now -
> but old kernels die early ...
>
> After some digging I found this
> https://patchwork.kernel.org/patch/9311567/
>
> Patch talk about family 12h, but my machine has this CPU:
>
> [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor (family: 0x15,
> model: 0x2, stepping: 0x0)
> [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.

Yes, your machine is not affected by that erratum. So far so good.

The rest of your mail I have a hard time understanding: you're talking
about old kernels not booting on a new machine but then you paste a qemu
32-bit guest kernel boot log and after that I'm lost.

Perhaps you should try again by explaining in detail what exactly you're
trying to do and how exactly you're going about doing that...

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2017-12-18 03:10:02

by Andrew Randrianasulu

Subject: Re: AMD erratum 665 on f15h processor?

In the message of Sunday 17 December 2017 23:52:05 you wrote:
> On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > Hello!
> >
> > I was trying to investigate why all my old kernels can't be booted on my
> > relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right
> > now - but old kernels die early ...
> >
> > After some digging I found this
> > https://patchwork.kernel.org/patch/9311567/
> >
> > Patch talk about family 12h, but my machine has this CPU:
> >
> > [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > (family: 0x15, model: 0x2, stepping: 0x0)
> > [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.
>
> Yes, your machine is not affected by that erratum. So far so good.
>
> The rest of your mail I have hard time understanding: you're talking
> about old kernels not booting on a new machine but then you paste a qemu
> 32-bit guest kernel boot log and after that I'm lost.
>
> Perhaps you should try again by explaining in detail what exactly you're
> trying to do and how exactly you're going about doing that...

Hi, Borislav!

I was trying to boot a few self-made live CDs/DVDs; they use self-compiled
kernels in the 3.2-4.2 range. None of those old disks boot in qemu if I set the
CPU type to 'host'. I have a whole collection of old kernels going back to 2011,
and none of them work anymore! An even older CD with 2.6.23.something simply
rebooted on the physical machine after the kernel and initrd were loaded by
isolinux. But 2.6.27.9 worked, at least in qemu (I don't really want to reboot
the machine because of some stuff in tmpfs). So, because 4.2.0-i486 was my
previous failsafe kernel and it most likely will not work anymore, I guess I
will use 4.12.0-x64. I was just trying to find any change that would explain
this error, and your fix was the closest I was able to find in this time
interval (2015-2017). Maybe it was just some unrelated, purely software bug in
the AMD detection code. I spent some time trying to figure out how to copy/paste
from qemu; finally the -curses interface worked.

I think I missed this misbehavior because I mostly used plain qemu, without -cpu
host (but with -enable-kvm), so it worked without problems.

When I first got this machine in early 2017 I already had 4.9+ as one of the
possible kernels in the lilo menu, so when 4.2 failed I quickly booted the newer
kernel and forgot about it. Recently I compiled 4.12 to use on a friend's
machine with a new AMD video card, but the default in syslinux/isolinux was
still set to 4.2.0, and it worked on another AMD machine. A few days ago I
decided to make a new 'live backup' of my running system, and while playing with
a new qemu I discovered this oddity.

Still, for me it raises an interesting question: as far as I understand, qemu's
BIOS (SeaBIOS) doesn't apply all those CPU-specific workarounds/fixes, but with
qemu -cpu host the guest kernel will see a nearly exact CPU model and will try
to apply some fixups (or not, assuming the BIOS/firmware has already set
everything up correctly?), or at least run some detection code? Of course I can
just compile a new kernel with those checks disabled, but the older kernels are
already compiled ... and disabling those workarounds would lead to crashes later
on, so a runtime switch to disable them is not a good idea?
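
One way to see what the guest was actually given is to look at the feature flags from inside a guest that does boot, for example a newer kernel under the same -cpu setting. Linux lists the OS Visible Workaround feature as "osvw" in /proc/cpuinfo. The helper below is a small sketch; the function name and the sample text are made up for illustration and are not output from this machine.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical excerpt for illustration; on a real guest, read /proc/cpuinfo instead.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr osvw ibs\n"
print("osvw" in cpu_flags(sample))
```

On a live system the input would come from open("/proc/cpuinfo").read(). Comparing the result between -cpu host and a named CPU model shows whether the flag that triggers the detection code is exposed to the guest.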

I'm not sure I will be able to get a real boot log from a boot on the physical
machine; I don't think I compiled those old kernels with any way to store an
early oops/panic ..:/

Thanks for answering, and sorry for a possibly false-positive bug report.

2017-12-18 13:22:30

by Borislav Petkov

Subject: Re: AMD erratum 665 on f15h processor?

+ kvm ML.

On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> In the message of Sunday 17 December 2017 23:52:05 you wrote:
> > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > Hello!
> > >
> > > I was trying to investigate why all my old kernels can't be booted on my
> > > relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right
> > > now - but old kernels die early ...
> > >
> > > After some digging I found this
> > > https://patchwork.kernel.org/patch/9311567/
> > >
> > > Patch talk about family 12h, but my machine has this CPU:
> > >
> > > [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.
> >
> > Yes, your machine is not affected by that erratum. So far so good.
> >
> > The rest of your mail I have hard time understanding: you're talking
> > about old kernels not booting on a new machine but then you paste a qemu
> > 32-bit guest kernel boot log and after that I'm lost.
> >
> > Perhaps you should try again by explaining in detail what exactly you're
> > trying to do and how exactly you're going about doing that...
>
> Hi, Borislav!
>
> I was trying to boot few self-made liveCD/DVDs - they use self-compiled kernels
> in 3.2-4.2 range. None of those old disks boots in qemu if I set it to cpu
> type 'host'. I have whole collection of old kernels since 2011, and none work
> anymore ! Even older CD with 2.6.23.something plainly rebooted after kernel and
> initrd were loaded by isolinux on physical machine! But 2.6.27.9 worked at
> least in qemu (not really want to reboot machine due to some stuff in tmpfs).
> So, because 4.2.0-i486 was my previous failsafe kernel, and it most likely
> will not work anymore - I guess I will use 4.12.0-x64.. I was just trying to
> find any change explaining this error, and your fix was closer I was able to
> find in this time interval (2015-2017). May be it was just some unrelated
> purely software bug in amd detection code.. I spend some time trying to figure
> out how to copy/paste from qemu, finally -curses interface worked.
>
> I think I missed this misbehavior because I mostly used just qemu, without -cpu
> host (but with -enable-kvm), so it worked without problems.

So -cpu host means:

x86 host KVM processor with all supported host features (only available in KVM mode)

which would theoretically mean that those guest kernel configs shouldn't
boot on the baremetal box either, if they fail on the guest.

But who knows what's happening.

You can give me a guest kernel .config of a kernel which fails along
with the exact qemu cmdline to try out here.

(Leaving in the rest for reference.)

> When I first got this machine in early 2017 I already had 4.9+ as one of
> possible kernels in lilo menu, so, when 4.2 failed I quickly booted new kernel,
> and forgot about it. Lately I compiled 4.12 for using it on friend's machine
> with new AMD videocard - but default in syslinux/isolinux was still set to
> 4.2.0, and it worked on another AMD machine. Few days ago i decided to make
> new 'live backup' of my running system, and while playing with new quemu
> discovered this oddity.
>
> Still, for me it raises interesting question: as far as I understand qemu's BIOS
> (SeaBIOS) doesn't set all those cpu-specific workarounds/fixes - but with
> qemu -cpu host guest kernel will see nearly exact cpu model, and will try to
> apply (or not, assuming BIOS/firmware already set everything correctly?) some
> fixups, or at least run some detection code? Of course I can just compile new
> kernel with those checks disabled, but older kernels already compiled ... and
> disabling those workarounds will lead to crashes later on, so having runtime
> disable for them is not good idea ?
>
> Not sure if I will able to get real boot log from physical machine boot - I
> don't think I compiled those old kernels with any way to store early
> oops/panic ..:/
>
> Thanks for answering and sorry for possible false positive bug report.

--
Regards/Gruss,
Boris.


2017-12-18 21:05:52

by Borislav Petkov

Subject: Re: AMD erratum 665 on f15h processor?

When you reply, please hit reply-to-all in your mail client so that the
mailing lists get CCed too.

On Mon, Dec 18, 2017 at 07:54:52PM +0300, Andrew Randrianasulu wrote:
> In the message of Monday 18 December 2017 16:22:15 you wrote:
> > + kvm ML.
> >
> > On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> > > In the message of Sunday 17 December 2017 23:52:05 you wrote:
> > > > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > > > Hello!
> > > > >
> > > > > I was trying to investigate why all my old kernels can't be booted on
> > > > > my relatively new machine. Kernels 4.10+ naturally boot - I use
> > > > > 4.14.3 right now - but old kernels die early ...
> > > > >
> > > > > After some digging I found this
> > > > > https://patchwork.kernel.org/patch/9311567/
> > > > >
> > > > > Patch talk about family 12h, but my machine has this CPU:
> > > > >
> > > > > [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > > > [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU
> > > > > driver.
> > > >
> > > > Yes, your machine is not affected by that erratum. So far so good.
> > > >
> > > > The rest of your mail I have hard time understanding: you're talking
> > > > about old kernels not booting on a new machine but then you paste a
> > > > qemu 32-bit guest kernel boot log and after that I'm lost.
> > > >
> > > > Perhaps you should try again by explaining in detail what exactly
> > > > you're trying to do and how exactly you're going about doing that...
> > >
> > > Hi, Borislav!
> > >
> > > I was trying to boot few self-made liveCD/DVDs - they use self-compiled
> > > kernels in 3.2-4.2 range. None of those old disks boots in qemu if I set
> > > it to cpu type 'host'. I have whole collection of old kernels since 2011,
> > > and none work anymore ! Even older CD with 2.6.23.something plainly
> > > rebooted after kernel and initrd were loaded by isolinux on physical
> > > machine! But 2.6.27.9 worked at least in qemu (not really want to reboot
> > > machine due to some stuff in tmpfs). So, because 4.2.0-i486 was my
> > > previous failsafe kernel, and it most likely will not work anymore - I
> > > guess I will use 4.12.0-x64.. I was just trying to find any change
> > > explaining this error, and your fix was closer I was able to find in this
> > > time interval (2015-2017). May be it was just some unrelated purely
> > > software bug in amd detection code.. I spend some time trying to figure
> > > out how to copy/paste from qemu, finally -curses interface worked.
> > >
> > > I think I missed this misbehavior because I mostly used just qemu,
> > > without -cpu host (but with -enable-kvm), so it worked without problems.
> >
> > So -cpu host means:
> >
> > x86 host KVM processor with all supported host features (only
> > available in KVM mode)
> >
> > which would theoretically mean that those guest kernel configs shouldn't
> > boot on the baremetal box either, if they fail on the guest.
> >
> > But who knows what's happening.
> >
> > You can give me a guest kernel .config of a kernel which fails along
> > with the exact qemu cmdline to try out here.
>
> .config attached.
>
> To reproduce, just launch qemu like this:
>
> qemu-system-i386 -kernel /home/admin/slax-build/boot/vmlinuz -cpu host --enable-kvm
> (just tried).
>
> Of course replace path to kernel image with your own. I can also attach binary
> image, but I think it will be of little use for you.....

Nah, I built it using your .config.

So my guest stops very early in the BIOS with

"Failed to allocate space for phdrs

-- System halted."

Then I looked at this:

https://bugzilla.kernel.org/show_bug.cgi?id=114671

and there's a patch

https://bugzilla.kernel.org/attachment.cgi?id=209601&action=diff&collapsed=&headers=1&format=raw

With it, it booted a bit further. But I still couldn't see any output.

So I booted with my cmdline to see more output and it did say:

general protection fault: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
task: c05b9a80 ti: c05b2000 task.ti: c05b2000
EIP: 0060:[<c010e390>] EFLAGS: 00210293 CPU: 0
EIP is at cpu_has_amd_erratum+0x24/0xb0
EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c044ccf4
ESI: c0616900 EDI: c044ccf8 EBP: c05b3f68 ESP: c05b3f58
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: ffc77000 CR3: 006ae000 CR4: 00040690
Stack:
02008140 00000000 c0616900 00000000 c05b3fa8 c010ec8b f5001d80 0000001e
00000000 00000000 00000009 00000010 00000000 c0616900 00000000 c05b3fa8
c010cf58 c0616900 c0616900 c061695c c05b3fc8 c010d156 c061698b c061695c
Call Trace:
[<c010ec8b>] init_amd+0x5ee/0x631
[<c010cf58>] ? get_cpu_cap+0x121/0x126
[<c010d156>] identify_cpu+0x1f9/0x37d
[<c0624a18>] identify_boot_cpu+0xd/0x80
[<c0624abd>] check_bugs+0x8/0x35
[<c061ea42>] start_kernel+0x32a/0x339
[<c061e2c2>] i386_start_kernel+0x8c/0x90
Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
EIP: [<c010e390>] cpu_has_amd_erratum+0x24/0xb0 SS:ESP 0068:c05b3f58
---[ end trace 7fb9e71b486a229a ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
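
The "Code:" line in the oops above marks the byte at the faulting EIP with angle brackets. The sketch below is an illustrative helper, not part of the kernel tooling. It finds that offset and reads the 32-bit immediate loaded just before it, assuming the preceding instruction is a "mov imm32, %ecx" (opcode b9), as it is in this splat.

```python
def parse_code_line(code_line):
    """Return (code_bytes, fault_offset) from an x86 oops 'Code:' line."""
    tokens = code_line.split()
    if tokens and tokens[0] == "Code:":
        tokens = tokens[1:]
    raw = bytearray()
    fault_offset = None
    for tok in tokens:
        if tok.startswith("<") and tok.endswith(">"):
            # The bracketed byte is the first byte of the faulting instruction.
            fault_offset = len(raw)
            tok = tok[1:-1]
        raw.append(int(tok, 16))
    return bytes(raw), fault_offset


CODE = ("Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 "
        "81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 "
        "<0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34")

code, off = parse_code_line(CODE)
is_rdmsr = code[off:off + 2] == b"\x0f\x32"        # 0f 32 encodes RDMSR
has_mov_ecx = code[off - 5] == 0xB9                # b9 = mov imm32, %ecx
msr = int.from_bytes(code[off - 4:off], "little")  # immediate is little-endian
print(hex(off), is_rdmsr, has_mov_ecx, hex(msr))
```

The offset and MSR number recovered this way agree with "ECX: c0010140" in the register dump.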

Which is exactly like the splat you've posted and that fails:

Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
All code
========
0: cf iret
1: 5b pop %rbx
2: c0 89 e5 5d c3 55 89 rorb $0x89,0x55c35de5(%rcx)
9: e5 57 in $0x57,%eax
b: 56 push %rsi
c: 53 push %rbx
d: 51 push %rcx
e: 89 c6 mov %eax,%esi
10: 8b 1a mov (%rdx),%ebx
12: 8d 7a 04 lea 0x4(%rdx),%edi
15: 81 fb ff ff 00 00 cmp $0xffff,%ebx
1b: 77 57 ja 0x74
1d: 8b 40 2c mov 0x2c(%rax),%eax
20: 0f ba e0 09 bt $0x9,%eax
24: 73 4e jae 0x74
26: b9 40 01 01 c0 mov $0xc0010140,%ecx
2b:* 0f 32 rdmsr <-- trapping instruction
2d: 89 45 f0 mov %eax,-0x10(%rbp)
30: 89 d8 mov %ebx,%eax
32: 89 d1 mov %edx,%ecx
34: 99 cltd
35: 39 ca cmp %ecx,%edx
37: 77 3b ja 0x74
39: 72 05 jb 0x40
3b: 3b 5d f0 cmp -0x10(%rbp),%ebx
3e: 73 34 jae 0x74

because it tries to read from a non-existent MSR - 0xc0010140 - and
maybe it is because of the -cpu host emulation or so but those MSRs do
get virtualized, see

2b036c6b861d ("KVM: SVM: Add support for AMD's OSVW feature in guests")

but I'd refer to the kvm/qemu people to explain what the deal here
exactly is.
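
To make the failure mode concrete, here is a simplified Python model of the check as it can be read from the disassembly above: the OSVW ID is compared against 0xffff, a feature bit is tested, and then MSR 0xc0010140 is read. It is a sketch, not the kernel source. The status MSR at 0xc0010141, the function and parameter names, and the (family, start, end) range tuples are assumptions introduced for illustration. A read of an MSR that the (virtual) CPU does not implement is modelled as an exception standing in for the general protection fault.

```python
MSR_AMD64_OSVW_ID_LENGTH = 0xC0010140  # the MSR read at the trapping instruction
MSR_AMD64_OSVW_STATUS = 0xC0010141     # assumed to follow the length register


class GeneralProtectionFault(Exception):
    """Stands in for the #GP raised by RDMSR on an unimplemented MSR."""


def rdmsr(msrs, index):
    if index not in msrs:
        raise GeneralProtectionFault(hex(index))
    return msrs[index]


def has_erratum(osvw_id, has_osvw_flag, msrs, family, model, stepping, ranges):
    # Preferred path: ask the OSVW MSRs, but only if the CPU advertises OSVW.
    if 0 <= osvw_id < 65536 and has_osvw_flag:
        osvw_len = rdmsr(msrs, MSR_AMD64_OSVW_ID_LENGTH)
        if osvw_id < osvw_len:
            bits = rdmsr(msrs, MSR_AMD64_OSVW_STATUS + (osvw_id >> 6))
            return bool(bits & (1 << (osvw_id & 0x3F)))
    # Fallback: match family and model/stepping against known affected ranges.
    ms = (model << 4) | stepping
    return any(f == family and lo <= ms <= hi for f, lo, hi in ranges)


ranges = [(0x10, 0x020, 0xFFF)]  # hypothetical range, for illustration only

# Guest advertises OSVW but the MSR is not implemented: the read faults.
try:
    has_erratum(1, True, {}, 0x15, 0x2, 0x0, ranges)
except GeneralProtectionFault as exc:
    print("fault reading MSR", exc)

# Same guest with the length MSR present (reporting 0 IDs): fallback path, no fault.
print(has_erratum(1, True, {MSR_AMD64_OSVW_ID_LENGTH: 0}, 0x15, 0x2, 0x0, ranges))
```

In this model the fault only happens when the OSVW feature flag is visible while the MSR read is not serviced, which is the combination the splat suggests.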

What I do is use -cpu Opteron_G5, which is also F15h, and that works.
Oh, and I'd use 64-bit kernels - 32-bit is not really being tested as
extensively.
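
For an A/B comparison of the two CPU models, the command lines can be generated side by side. This is only a sketch built from the options quoted in this thread; the helper name is made up and "/path/to/vmlinuz" is a placeholder, not a real path.

```python
import shlex


def qemu_argv(kernel_path, cpu_model):
    """Build a qemu command line that differs only in the -cpu model."""
    return ["qemu-system-i386", "-kernel", kernel_path,
            "-cpu", cpu_model, "-enable-kvm"]


# Placeholder kernel path (assumption); substitute your own image.
for cpu_model in ("host", "Opteron_G5"):
    print(shlex.join(qemu_argv("/path/to/vmlinuz", cpu_model)))
```

Passing the resulting list to subprocess.run would launch the guest; printing it first makes it easy to confirm that only the CPU model changes between runs.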

HTH.

--
Regards/Gruss,
Boris.


2017-12-19 05:30:58

by Andrew Randrianasulu

Subject: Re: AMD erratum 665 on f15h processor?

In the message of Tuesday 19 December 2017 00:05:40, Borislav Petkov wrote:
> When you git reply, please hit reply-to-all in your mail client so that
> mailing lists get CCed too.

ok.

>
> On Mon, Dec 18, 2017 at 07:54:52PM +0300, Andrew Randrianasulu wrote:
> > In the message of Monday 18 December 2017 16:22:15 you wrote:
> > > + kvm ML.
> > >
> > > On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> > > > In the message of Sunday 17 December 2017 23:52:05 you wrote:
> > > > > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > > > > Hello!
> > > > > >
> > > > > > I was trying to investigate why all my old kernels can't be
> > > > > > booted on my relatively new machine. Kernels 4.10+ naturally boot
> > > > > > - I use 4.14.3 right now - but old kernels die early ...
> > > > > >
> > > > > > After some digging I found this
> > > > > > https://patchwork.kernel.org/patch/9311567/
> > > > > >
> > > > > > Patch talk about family 12h, but my machine has this CPU:
> > > > > >
> > > > > > [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > > > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > > > > [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU
> > > > > > driver.
> > > > >
> > > > > Yes, your machine is not affected by that erratum. So far so good.
> > > > >
> > > > > The rest of your mail I have hard time understanding: you're
> > > > > talking about old kernels not booting on a new machine but then you
> > > > > paste a qemu 32-bit guest kernel boot log and after that I'm lost.
> > > > >
> > > > > Perhaps you should try again by explaining in detail what exactly
> > > > > you're trying to do and how exactly you're going about doing
> > > > > that...
> > > >
> > > > Hi, Borislav!
> > > >
> > > > I was trying to boot few self-made liveCD/DVDs - they use
> > > > self-compiled kernels in 3.2-4.2 range. None of those old disks boots
> > > > in qemu if I set it to cpu type 'host'. I have whole collection of
> > > > old kernels since 2011, and none work anymore ! Even older CD with
> > > > 2.6.23.something plainly rebooted after kernel and initrd were loaded
> > > > by isolinux on physical machine! But 2.6.27.9 worked at least in qemu
> > > > (not really want to reboot machine due to some stuff in tmpfs). So,
> > > > because 4.2.0-i486 was my previous failsafe kernel, and it most
> > > > likely will not work anymore - I guess I will use 4.12.0-x64.. I was
> > > > just trying to find any change explaining this error, and your fix
> > > > was closer I was able to find in this time interval (2015-2017). May
> > > > be it was just some unrelated purely software bug in amd detection
> > > > code.. I spend some time trying to figure out how to copy/paste from
> > > > qemu, finally -curses interface worked.
> > > >
> > > > I think I missed this misbehavior because I mostly used just qemu,
> > > > without -cpu host (but with -enable-kvm), so it worked without
> > > > problems.
> > >
> > > So -cpu host means:
> > >
> > > x86 host KVM processor with all supported host features
> > > (only available in KVM mode)
> > >
> > > which would theoretically mean that those guest kernel configs
> > > shouldn't boot on the baremetal box either, if they fail on the guest.
> > >
> > > But who knows what's happening.
> > >
> > > You can give me a guest kernel .config of a kernel which fails along
> > > with the exact qemu cmdline to try out here.
> >
> > .config attached.
> >
> > for reproducting just launch qemu like this:
> >
> > qemu-system-i386 -kernel /home/admin/slax-build/boot/vmlinuz -cpu
> > host --enable-kvm (just tried).
> >
> > Of course replace path to kernel image with your own. I can also attach
> > binary image, but I think it will be of little use for you.....
>
> Nah, I built it using your .config.
>
> So my guest stops very early in the BIOS with
>
> "Failed to allocate space for phdrs
>
> -- System halted."
>
> Then I looked at this:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=114671
>
> and there's a patch
>
> https://bugzilla.kernel.org/attachment.cgi?id=209601&action=diff&collapsed=&headers=1&format=raw


Thanks, looks like I will have more fun building a 32-bit kernel, because I
have already updated binutils.

>
> With it, it booted a bit further. But I still couldn't see any output.
>
> So I booted with my cmdline to see more output and it did say:
>
> general protection fault: 0000 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486+ #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> task: c05b9a80 ti: c05b2000 task.ti: c05b2000
> EIP: 0060:[<c010e390>] EFLAGS: 00210293 CPU: 0
> EIP is at cpu_has_amd_erratum+0x24/0xb0
> EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c044ccf4
> ESI: c0616900 EDI: c044ccf8 EBP: c05b3f68 ESP: c05b3f58
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: ffc77000 CR3: 006ae000 CR4: 00040690
> Stack:
> 02008140 00000000 c0616900 00000000 c05b3fa8 c010ec8b f5001d80 0000001e
> 00000000 00000000 00000009 00000010 00000000 c0616900 00000000 c05b3fa8
> c010cf58 c0616900 c0616900 c061695c c05b3fc8 c010d156 c061698b c061695c
> Call Trace:
> [<c010ec8b>] init_amd+0x5ee/0x631
> [<c010cf58>] ? get_cpu_cap+0x121/0x126
> [<c010d156>] identify_cpu+0x1f9/0x37d
> [<c0624a18>] identify_boot_cpu+0xd/0x80
> [<c0624abd>] check_bugs+0x8/0x35
> [<c061ea42>] start_kernel+0x32a/0x339
> [<c061e2c2>] i386_start_kernel+0x8c/0x90
> Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
> EIP: [<c010e390>] cpu_has_amd_erratum+0x24/0xb0 SS:ESP 0068:c05b3f58
> ---[ end trace 7fb9e71b486a229a ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
>
> Which is exactly like the splat you've posted and that fails:
>
> Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
> All code
> ========
> 0: cf iret
> 1: 5b pop %rbx
> 2: c0 89 e5 5d c3 55 89 rorb $0x89,0x55c35de5(%rcx)
> 9: e5 57 in $0x57,%eax
> b: 56 push %rsi
> c: 53 push %rbx
> d: 51 push %rcx
> e: 89 c6 mov %eax,%esi
> 10: 8b 1a mov (%rdx),%ebx
> 12: 8d 7a 04 lea 0x4(%rdx),%edi
> 15: 81 fb ff ff 00 00 cmp $0xffff,%ebx
> 1b: 77 57 ja 0x74
> 1d: 8b 40 2c mov 0x2c(%rax),%eax
> 20: 0f ba e0 09 bt $0x9,%eax
> 24: 73 4e jae 0x74
> 26: b9 40 01 01 c0 mov $0xc0010140,%ecx
> 2b:* 0f 32 rdmsr <-- trapping instruction
> 2d: 89 45 f0 mov %eax,-0x10(%rbp)
> 30: 89 d8 mov %ebx,%eax
> 32: 89 d1 mov %edx,%ecx
> 34: 99 cltd
> 35: 39 ca cmp %ecx,%edx
> 37: 77 3b ja 0x74
> 39: 72 05 jb 0x40
> 3b: 3b 5d f0 cmp -0x10(%rbp),%ebx
> 3e: 73 34 jae 0x74
>
> because it tries to read from a non-existent MSR - 0xc0010140 - and
> maybe it is because of the -cpu host emulation or so but those MSRs do
> get virtualized, see
>
> 2b036c6b861d ("KVM: SVM: Add support for AMD's OSVW feature in guests")

Thanks again, the patch "Add support for AMD's OSVW feature in guests" answered
my question about virtualizing somewhat buggy CPUs.

>
> but I'd refer to the kvm/qemu people to explain what the deal here
> exactly is.
>
> What I do, is use -cpu Opteron_G5 which is also F15h and that works.
> Oh, and I'd use 64-bit kernels - 32-bit is not really being tested as
> extensively.

-cpu Opteron_G5 works here, too.


>
> HTH.