Date: Fri, 31 Jul 2015 10:03:03 +0200
From: Borislav Petkov
To: Andy Lutomirski
Cc: Paolo Bonzini, Peter Zijlstra, Linus Torvalds, Willy Tarreau,
	Steven Rostedt, X86 ML, "linux-kernel@vger.kernel.org",
	Thomas Gleixner, Brian Gerst
Subject: Re: Dealing with the NMI mess
Message-ID: <20150731080303.GA2128@nazgul.tnic>
References: <20150724153054.GK19282@twins.programming.kicks-ass.net>
	<20150724195509.GM2859@worktop.programming.kicks-ass.net>
	<20150724205119.GM19282@twins.programming.kicks-ass.net>
	<55BA45A2.8050909@redhat.com>
	<20150731042205.GB32117@nazgul.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 30, 2015 at 10:11:40PM -0700, Andy Lutomirski wrote:
> This instruction is awesome.  Binutils can disassemble it (it's called
> "icebp") but it can't assemble it.  KVM has special handling for it on
> VMX and actually reports it to QEMU on SVM (complete with a defined
> ABI).

Fun.

> We have an asm macro so we can assemble it for 32-bit but not
> 64-bit, despite the fact that it works on 64-bit.
>
> The kernel instruction decoder can't decode it.

Yeah, the kernel insn decoder needs to be fixed. Even my decoder can
decode it:

$ echo "0xf1" | ./x86d -
0: f1 icebp

Big deal. :-)

Let's do some fun and games:

$ cat icebp.c
int main()
{
	asm volatile(".byte 0xf1");
	return 0;
}

$ gcc -Wall -o icebp{,.c}
$ objdump -d icebp
...
00000000004004ac <main>:
  4004ac:	55                   	push   %rbp
  4004ad:	48 89 e5             	mov    %rsp,%rbp
  4004b0:	f1                   	icebp
  4004b1:	b8 00 00 00 00       	mov    $0x0,%eax
  4004b6:	5d                   	pop    %rbp
  4004b7:	c3                   	retq
  4004b8:	90                   	nop
...

$ ./icebp
Trace/breakpoint trap

^ this in qemu.

On baremetal it gets a SIGTRAP with TRAP_BRKPT. Looks like signal
handling knows about it...

$ strace /tmp/icebp
execve("/tmp/icebp", ["/tmp/icebp"], [/* 27 vars */]) = 0
brk(0)                                  = 0x1680000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f71e243d000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=127070, ...}) = 0
mmap(NULL, 127070, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f71e241d000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1729984, ...}) = 0
mmap(NULL, 3836448, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f71e1e76000
mprotect(0x7f71e2015000, 2097152, PROT_NONE) = 0
mmap(0x7f71e2215000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19f000) = 0x7f71e2215000
mmap(0x7f71e221b000, 14880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f71e221b000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f71e241c000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f71e241b000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f71e241a000
arch_prctl(ARCH_SET_FS, 0x7f71e241b700) = 0
mprotect(0x7f71e2215000, 16384, PROT_READ) = 0
mprotect(0x7f71e243f000, 4096, PROT_READ) = 0
munmap(0x7f71e241d000, 127070)          = 0
--- SIGTRAP {si_signo=SIGTRAP, si_code=TRAP_BRKPT, si_pid=4195505, si_uid=0} ---
+++ killed by SIGTRAP +++
Trace/breakpoint trap

> Fortunately, it looks like the vm86 case is correct (or as correct as
> any of the vm86 junk can be), although I haven't tested it.  I bet
> that icebp is like int3 in that it punches through vm86 mode instead
> of sending #GP.

Yeah, INT 1. I wonder whether INT 1, i.e. CD imm8 does the same thing.

But why do you say it is special - it simply raises #DB, i.e. vector 1.
Web page seems to say so when interrupt redirection is disabled.

It sounds like a nice and quick way to generate a breakpoint. You can do
that with INT 01, i.e., the CD opcode, too.

If I'd had to guess, it isn't documented because of the proprietary ICE
aspect. And no one uses ICEs anymore so it is going to be forgotten with
people popping off and on and asking about the undocumented opcode.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--