Date: Fri, 31 Jul 2015 12:25:42 +0200
From: Borislav Petkov
To: Paolo Bonzini
Cc: Andy Lutomirski, Peter Zijlstra, Linus Torvalds, Willy Tarreau,
    Steven Rostedt, X86 ML, "linux-kernel@vger.kernel.org",
    Thomas Gleixner, Brian Gerst
Subject: Re: Dealing with the NMI mess
Message-ID: <20150731102542.GA6218@nazgul.tnic>
References: <20150724195509.GM2859@worktop.programming.kicks-ass.net>
    <20150724205119.GM19282@twins.programming.kicks-ass.net>
    <55BA45A2.8050909@redhat.com>
    <20150731042205.GB32117@nazgul.tnic>
    <20150731080303.GA2128@nazgul.tnic>
    <55BB3F71.1060307@redhat.com>
In-Reply-To: <55BB3F71.1060307@redhat.com>

On Fri, Jul 31, 2015 at 11:27:13AM +0200, Paolo Bonzini wrote:
> Is the strace different between KVM and baremetal?
Yes, the signal part is missing from kvm:

$ strace ./icebp
execve("./icebp", ["./icebp"], [/* 20 vars */]) = 0
brk(0) = 0x601000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7ff6000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=95207, ...}) = 0
mmap(NULL, 95207, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ffff7fde000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\357\1\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1595408, ...}) = 0
mmap(NULL, 3709016, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ffff7a53000
mprotect(0x7ffff7bd3000, 2097152, PROT_NONE) = 0
mmap(0x7ffff7dd3000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x180000) = 0x7ffff7dd3000
mmap(0x7ffff7dd8000, 18520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ffff7dd8000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fdd000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fdc000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fdb000
arch_prctl(ARCH_SET_FS, 0x7ffff7fdc700) = 0
mprotect(0x7ffff7dd3000, 16384, PROT_READ) = 0
mprotect(0x7ffff7ffc000, 4096, PROT_READ) = 0
munmap(0x7ffff7fde000, 95207) = 0
exit_group(0) = ?

> No, it sends #GP.

True story:

[ 697.707990] traps: icebp[3537] general protection ip:4004b0 sp:7fffffffe610 error:a in icebp[400000+1000]

but why? I guess our IDT entry at 1 is funny... Too lazy to check.

> The reason why it isn't documented is probably hidden within Intel.
> Besides ICEBP, which is a bit fringe, there's no reason not to document
> SALC which Thomas mentioned. SALC all has been there since the 8086,
> and has been undocumented for thirty-odd years.

That one is invalid (on an IVB):

[ 1306.231408] traps: icebp[3783] trap invalid opcode ip:4004b0 sp:7fffffffe610 error:0 in icebp[400000+1000]

AMD APM documents it as invalid too.

> The AAM/AAD variants with immediates other than 10 also have been
> undocumented for fifteen years or so (an instruction doing a division
> by 10 where the second byte of the opcode is 10? oh, certainly no one
> is going to try changing the second byte...)

There's this in the AMD APM:

"In most modern assemblers, the AAM instruction adjusts to base-10
values. However, by coding the instruction directly in binary, it can
adjust to any base specified by the immediate byte value (ib) suffixed
onto the D4h opcode. For example, code D408h for octal, D40Ah for
decimal, and D40Ch for duodecimal (base 12)."

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
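For reference, a minimal sketch of the AAM behaviour that the APM paragraph
quoted above describes: AH gets AL divided by the immediate, AL gets the
remainder. It is plain C rather than inline asm because AAM is invalid in
64-bit mode. The names aam_emulate(), struct aam_result and aam_selfcheck()
are made up for illustration, and the fault flag merely stands in for the #DE
that an immediate of 0 raises on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the AH:AL pair left behind by AAM. */
struct aam_result {
	uint8_t ah;	/* quotient:  old AL / base */
	uint8_t al;	/* remainder: old AL % base */
	bool fault;	/* true where hardware would raise #DE (base == 0) */
};

/*
 * Emulate "D4 ib": split AL into two digits of the given base.
 * base is the immediate byte: 0x08 octal, 0x0A decimal, 0x0C duodecimal.
 */
static struct aam_result aam_emulate(uint8_t al, uint8_t base)
{
	struct aam_result r = { 0, al, false };

	if (base == 0) {
		/* Division by zero: report it instead of trapping. */
		r.fault = true;
		return r;
	}

	r.ah = al / base;
	r.al = al % base;
	return r;
}

/* Check the three encodings the APM text mentions against AL = 59. */
static void aam_selfcheck(void)
{
	struct aam_result r;

	r = aam_emulate(59, 0x0A);	/* D4 0A: 59 = 5 * 10 + 9 */
	assert(r.ah == 5 && r.al == 9 && !r.fault);

	r = aam_emulate(59, 0x08);	/* D4 08: 59 = 7 * 8 + 3 */
	assert(r.ah == 7 && r.al == 3 && !r.fault);

	r = aam_emulate(59, 0x0C);	/* D4 0C: 59 = 4 * 12 + 11 */
	assert(r.ah == 4 && r.al == 11 && !r.fault);
}
```

Calling aam_selfcheck() from a test harness exercises the octal, decimal and
duodecimal cases; flag updates (SF, ZF, PF) are deliberately left out of this
sketch.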