From: Andy Lutomirski
Date: Fri, 10 Jul 2015 10:13:28 -0700
Subject: Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
To: Linus Torvalds
Cc: "Eric W. Biederman", Arjan van de Ven, Andy Lutomirski,
    "the arch/x86 maintainers", Linux Kernel Mailing List,
    Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov
References: <23d4709cee2fe92c32d41b99c7a3c1823725925a.1436312944.git.luto@kernel.org>
    <559C8BFE.6050604@linux.intel.com> <87twtc14po.fsf@x220.int.ebiederm.org>

On Fri, Jul 10, 2015 at 10:04 AM, Linus Torvalds wrote:
> On Fri, Jul 10, 2015 at 9:44 AM, Andy Lutomirski wrote:
>>
>> That's not what I mean. I'm referring to the vm86 syscall itself. If
>> you have a ti flag that causes the slow exit path to be used, then you
>> call vm86. vm86 sets up the ludicrous double stack frame that it uses
>> and jumps back to the exit asm. The exit asm then branches off to the
>> slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears
>> down its double stack frame, and keeps meandering back through the
>> exit asm. We finally IRET right back to protected mode, and the code
>> that userspace was trying to execute in v8086 mode never actually
>> runs.
>
> So?
>
> So yes, if the thread work flags are set, we never enter vm86 mode.
> BUT THAT'S EXACTLY WHAT SHOULD HAPPEN.
>
> It worries me that you think these kinds of fundamental issues are
> completely broken.
>

The problem is that it's *every* event. That includes things that
happen literally every time, like strace. (NOHZ_FULL would count, too,
if it worked at all on 32-bit kernels.)

Try it: vm86 will make zero progress if you run it under strace. It
will also execute the trace hooks the wrong number of times, so strace
gets very confused. If someone does something daft like using a
systrace-style sandbox, it probably breaks the sandbox.

>
> And yes, if you enable system call auditing, and you actually audit
> the vm86 mode system call, that probably causes an exit condition,
> which means that you can't actually run vm86 mode and make progress if
> you audit that system call. Big f*cking deal. People who enable system
> call auditing break many more important things (eg basic performance)
> that that isn't even an argument. Do you really think that people who
> wanted to run DOS games at hardware speeds wanted to _audit_ those
> games?

No. Not at all.

It does, however, mean that Fedora/RHEL users (who use auditing by
default in most cases, sigh) have a decent chance of having had a
non-working vm86 syscall for a long time. This makes me think that
there really aren't many vm86 users out there, since we'd have heard
about the breakage.

Note that audit is very special, though, since it has its own asm
path. It might actually work, but I haven't tested it.

In any event, we're quibbling about the wording of the kconfig text
here. Both Brian and I have patches that fix the ptrace problem, so
it's likely to be a nonissue in 4.3 regardless.

--Andy