Date: Mon, 18 Dec 2017 22:05:40 +0100
From: Borislav Petkov
To: Andrew Randrianasulu
Cc: kvm ML, lkml
Subject: Re: AMD erratum 665 on f15h processor?
Message-ID: <20171218210540.akkwacm7ngabdbvt@pd.tnic>
References: <201712171204.29349.randrianasulu@gmail.com> <201712180601.21662.randrianasulu@gmail.com> <20171218132215.qahpo6vn2vdq76gj@pd.tnic> <201712181954.52740.randrianasulu@gmail.com>
In-Reply-To: <201712181954.52740.randrianasulu@gmail.com>

When you reply, please hit reply-to-all in your mail client so that the
mailing lists get CCed too.

On Mon, Dec 18, 2017 at 07:54:52PM +0300, Andrew Randrianasulu wrote:
> In a message from Monday 18 December 2017 16:22:15 you wrote:
> > + kvm ML.
> >
> > On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> > > In a message from Sunday 17 December 2017 23:52:05 you wrote:
> > > > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > > > Hello!
> > > > >
> > > > > I was trying to investigate why all my old kernels can't be booted on
> > > > > my relatively new machine. Kernels 4.10+ naturally boot - I use
> > > > > 4.14.3 right now - but old kernels die early ...
> > > > >
> > > > > After some digging I found this
> > > > > https://patchwork.kernel.org/patch/9311567/
> > > > >
> > > > > Patch talk about family 12h, but my machine has this CPU:
> > > > >
> > > > > [ 0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > > > [ 0.056000] Performance Events: Fam15h core perfctr, AMD PMU
> > > > > driver.
> > > >
> > > > Yes, your machine is not affected by that erratum. So far so good.
> > > >
> > > > The rest of your mail I have hard time understanding: you're talking
> > > > about old kernels not booting on a new machine but then you paste a
> > > > qemu 32-bit guest kernel boot log and after that I'm lost.
> > > >
> > > > Perhaps you should try again by explaining in detail what exactly
> > > > you're trying to do and how exactly you're going about doing that...
> > >
> > > Hi, Borislav!
> > >
> > > I was trying to boot few self-made liveCD/DVDs - they use self-compiled
> > > kernels in 3.2-4.2 range. None of those old disks boots in qemu if I set
> > > it to cpu type 'host'. I have whole collection of old kernels since 2011,
> > > and none work anymore ! Even older CD with 2.6.23.something plainly
> > > rebooted after kernel and initrd were loaded by isolinux on physical
> > > machine! But 2.6.27.9 worked at least in qemu (not really want to reboot
> > > machine due to some stuff in tmpfs). So, because 4.2.0-i486 was my
> > > previous failsafe kernel, and it most likely will not work anymore - I
> > > guess I will use 4.12.0-x64.. I was just trying to find any change
> > > explaining this error, and your fix was closer I was able to find in this
> > > time interval (2015-2017). May be it was just some unrelated purely
> > > software bug in amd detection code.. I spend some time trying to figure
> > > out how to copy/paste from qemu, finally -curses interface worked.
> > >
> > > I think I missed this misbehavior because I mostly used just qemu,
> > > without -cpu host (but with -enable-kvm), so it worked without problems.
> >
> > So -cpu host means:
> >
> >   x86 host KVM processor with all supported host features (only
> >   available in KVM mode)
> >
> > which would theoretically mean that those guest kernel configs shouldn't
> > boot on the baremetal box either, if they fail on the guest.
> >
> > But who knows what's happening.
> >
> > You can give me a guest kernel .config of a kernel which fails along
> > with the exact qemu cmdline to try out here.
>
> .config attached.
>
> for reproducting just launch qemu like this:
>
> qemu-system-i386 -kernel /home/admin/slax-build/boot/vmlinuz -cpu
> host --enable-kvm (just tried).
>
> Of course replace path to kernel image with your own. I can also attach binary
> image, but I think it will be of little use for you.....

Nah, I built it using your .config.

So my guest stops very early in the BIOS with

"Failed to allocate space for phdrs -- System halted."

Then I looked at this:

https://bugzilla.kernel.org/show_bug.cgi?id=114671

and there's a patch

https://bugzilla.kernel.org/attachment.cgi?id=209601&action=diff&collapsed=&headers=1&format=raw

With it, it booted a bit further. But I still couldn't see any output.
So I booted with my cmdline to see more output and it did say:

general protection fault: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
task: c05b9a80 ti: c05b2000 task.ti: c05b2000
EIP: 0060:[] EFLAGS: 00210293 CPU: 0
EIP is at cpu_has_amd_erratum+0x24/0xb0
EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c044ccf4
ESI: c0616900 EDI: c044ccf8 EBP: c05b3f68 ESP: c05b3f58
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: ffc77000 CR3: 006ae000 CR4: 00040690
Stack:
 02008140 00000000 c0616900 00000000 c05b3fa8 c010ec8b f5001d80 0000001e
 00000000 00000000 00000009 00000010 00000000 c0616900 00000000 c05b3fa8
 c010cf58 c0616900 c0616900 c061695c c05b3fc8 c010d156 c061698b c061695c
Call Trace:
 [] init_amd+0x5ee/0x631
 [] ? get_cpu_cap+0x121/0x126
 [] identify_cpu+0x1f9/0x37d
 [] identify_boot_cpu+0xd/0x80
 [] check_bugs+0x8/0x35
 [] start_kernel+0x32a/0x339
 [] i386_start_kernel+0x8c/0x90
Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
EIP: [] cpu_has_amd_erratum+0x24/0xb0 SS:ESP 0068:c05b3f58
---[ end trace 7fb9e71b486a229a ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Which is exactly like the splat you've posted and that fails:

Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
All code
========
   0:   cf                      iret
   1:   5b                      pop    %rbx
   2:   c0 89 e5 5d c3 55 89    rorb   $0x89,0x55c35de5(%rcx)
   9:   e5 57                   in     $0x57,%eax
   b:   56                      push   %rsi
   c:   53                      push   %rbx
   d:   51                      push   %rcx
   e:   89 c6                   mov    %eax,%esi
  10:   8b 1a                   mov    (%rdx),%ebx
  12:   8d 7a 04                lea    0x4(%rdx),%edi
  15:   81 fb ff ff 00 00       cmp    $0xffff,%ebx
  1b:   77 57                   ja     0x74
  1d:   8b 40 2c                mov    0x2c(%rax),%eax
  20:   0f ba e0 09             bt     $0x9,%eax
  24:   73 4e                   jae    0x74
  26:   b9 40 01 01 c0          mov    $0xc0010140,%ecx
  2b:*  0f 32                   rdmsr           <-- trapping instruction
  2d:   89 45 f0                mov    %eax,-0x10(%rbp)
  30:   89 d8                   mov    %ebx,%eax
  32:   89 d1                   mov    %edx,%ecx
  34:   99                      cltd
  35:   39 ca                   cmp    %ecx,%edx
  37:   77 3b                   ja     0x74
  39:   72 05                   jb     0x40
  3b:   3b 5d f0                cmp    -0x10(%rbp),%ebx
  3e:   73 34                   jae    0x74

because it tries to read from a non-existent MSR - 0xc0010140. Maybe that
is because of the -cpu host emulation or so, but those MSRs do get
virtualized, see

  2b036c6b861d ("KVM: SVM: Add support for AMD's OSVW feature in guests")

so I'd refer to the kvm/qemu people to explain what exactly the deal here
is.

What I do is use -cpu Opteron_G5, which is also F15h, and that works.

Oh, and I'd use 64-bit kernels - 32-bit is not really being tested as
extensively.

HTH.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
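
For reference, the check that traps above can be modeled in user space.
What follows is only a minimal sketch in C, not the kernel's code. It
assumes the function looks up an OSVW (OS Visible Workaround) erratum ID
by reading a length MSR (0xc0010140, the value in ECX in the splat) and
then a status MSR, which is what the cmp $0xffff / bt $0x9 / rdmsr
sequence in the disassembly appears to correspond to. The names
osvw_check, msr_read_fn, msr_fake and msr_missing are hypothetical and
exist only for this illustration. The accessor reports failure where real
hardware would raise #GP, so a missing MSR yields "unknown" instead of a
panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* OSVW MSR numbers; 0xc0010140 is the one loaded into ECX in the splat. */
#define MSR_OSVW_ID_LENGTH 0xc0010140u
#define MSR_OSVW_STATUS    0xc0010141u

/* Hypothetical accessor: returns false where a real rdmsr would fault. */
typedef bool (*msr_read_fn)(uint32_t msr, uint64_t *val);

enum osvw_result { OSVW_ABSENT, OSVW_PRESENT, OSVW_UNKNOWN };

/* Model of the OSVW lookup; UNKNOWN means "fall back to another method". */
static enum osvw_result osvw_check(int osvw_id, bool cpuid_has_osvw,
                                   msr_read_fn rd)
{
    uint64_t len, bits;

    /* Only IDs 0..65535 are valid, and CPUID must advertise OSVW. */
    if (osvw_id < 0 || osvw_id >= 65536 || !cpuid_has_osvw)
        return OSVW_UNKNOWN;

    /* This is the read that trapped in the guest; here it may just fail. */
    if (!rd(MSR_OSVW_ID_LENGTH, &len))
        return OSVW_UNKNOWN;
    if ((uint64_t)osvw_id >= len)
        return OSVW_UNKNOWN;

    /* One status MSR covers 64 IDs; pick the MSR, then the bit within it. */
    if (!rd(MSR_OSVW_STATUS + ((uint32_t)osvw_id >> 6), &bits))
        return OSVW_UNKNOWN;
    return ((bits >> (osvw_id & 0x3f)) & 1) ? OSVW_PRESENT : OSVW_ABSENT;
}

/* Fake MSR file for the example: 4 IDs defined, IDs 0 and 2 flagged. */
static bool msr_fake(uint32_t msr, uint64_t *val)
{
    switch (msr) {
    case MSR_OSVW_ID_LENGTH: *val = 4;   return true;
    case MSR_OSVW_STATUS:    *val = 0x5; return true;
    default:                 return false;
    }
}

/* Models a guest where the OSVW MSRs are not readable at all. */
static bool msr_missing(uint32_t msr, uint64_t *val)
{
    (void)msr;
    (void)val;
    return false;
}
```

With msr_missing the lookup simply reports OSVW_UNKNOWN, which is the
outcome one would want in a guest that advertises the OSVW CPUID bit but
does not back the MSRs. Whether that is what actually happens with -cpu
host here is still a question for the kvm/qemu people.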