MIME-Version: 1.0
In-Reply-To: <CA+55aFzGC9fnZGmUMRbV3L+w_F+-H+p1go+_s0O5ptOXLJquVg@mail.gmail.com>
References: <23d4709cee2fe92c32d41b99c7a3c1823725925a.1436312944.git.luto@kernel.org>
 <559C8BFE.6050604@linux.intel.com> <CA+55aFyrGkCLcizi-Wgk5D-a3QGqZSh-4ahEFuhJZS_obsiNHQ@mail.gmail.com>
 <CALCETrWfuW6Lt42PoAcOfxgO4dzPm8y0DOhhtpQ=w+cS3AJw2A@mail.gmail.com>
 <87twtc14po.fsf@x220.int.ebiederm.org> <CALCETrXr1S=5y734BVzHGTk74+qHAz-D692nd57HwqVN2sOU1g@mail.gmail.com>
 <CA+55aFxWfNb4Bj1tsxkCgpbaSseioEV=AenGX7=sgz2mzJUWYA@mail.gmail.com>
 <CALCETrUN6TFRmLfgys27SJs3crSfiCFbS+ZCR42tHRWMZg6b=Q@mail.gmail.com> <CA+55aFzGC9fnZGmUMRbV3L+w_F+-H+p1go+_s0O5ptOXLJquVg@mail.gmail.com>
From: Andy Lutomirski <luto@amacapital.net>
Date: Fri, 10 Jul 2015 10:13:28 -0700
Message-ID: <CALCETrVwS4DPjOja5HcdpP7KDOBsSgBTVtzehAzrYq+WCk1veA@mail.gmail.com>
Subject: Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
        Arjan van de Ven <arjan@linux.intel.com>,
        Andy Lutomirski <luto@kernel.org>,
        "the arch/x86 maintainers" <x86@kernel.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Oleg Nesterov <oleg@redhat.com>, Kees Cook <keescook@chromium.org>,
        Peter Zijlstra <peterz@infradead.org>, Borislav Petkov <bp@alien8.de>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2796
Lines: 64

On Fri, Jul 10, 2015 at 10:04 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Jul 10, 2015 at 9:44 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> That's not what I mean.  I'm referring to the vm86 syscall itself.  If
>> you have a ti flag that causes the slow exit path to be used, then you
>> call vm86.  vm86 sets up the ludicrous double stack frame that it uses
>> and jumps back to the exit asm.  The exit asm then branches off to the
>> slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears
>> down its double stack frame, and keeps meandering back through the
>> exit asm.  We finally IRET right back to protected mode, and the code
>> that userspace was trying to execute in v8086 mode never actually
>> runs.
>
> So?

>
> So yes, if the thread work flags are set, we never enter vm86 mode.
> BUT THAT'S EXACTLY WHAT SHOULD HAPPEN.
>
> It worries me that you think these kinds of fundamental issues are
> completely broken.
>

The problem is that it's *every* event.  That includes things that
happen literally every time, like strace.  (NOHZ_FULL would count, too,
if it worked at all on 32-bit kernels.)

Try it: vm86 will make zero progress if you run it under strace.  It
will also execute the trace hooks the wrong number of times, so strace
gets very confused.  If someone does something daft like using a
systrace-style sandbox, it probably breaks the sandbox.
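
To make the distinction concrete, here is a toy userspace model of the
behavior described above.  It is only a sketch, not kernel code: the
flag names, struct task, model_vm86() and run_monitor() are all
hypothetical stand-ins invented for illustration.  The one assumption
it encodes is that any pending work flag sends the vm86 call down the
slow exit path and back to protected mode, and that a trace-style flag
stays set for as long as the tracer is attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for thread-info work flags (not the kernel's names or values). */
enum {
    WORK_SIGPENDING    = 1u << 0,  /* one-shot: cleared once it has been handled */
    WORK_SYSCALL_TRACE = 1u << 1,  /* persistent: stays set while a tracer is attached */
};

struct task {
    unsigned work;         /* pending work flags */
    unsigned guest_insns;  /* units of v8086 code that actually ran */
};

/* Toy model of a single vm86() call.  Returns true if v8086 code ran. */
static bool model_vm86(struct task *t)
{
    if (t->work) {
        /* Slow exit path: tear everything down and return to protected mode. */
        t->work &= ~WORK_SIGPENDING;  /* one-shot work is consumed here */
        return false;                 /* the v8086 code never ran */
    }
    t->guest_insns++;
    return true;
}

/* Retry the call the way a monitor loop would; returns how many calls made progress. */
static unsigned run_monitor(struct task *t, unsigned attempts)
{
    unsigned progressed = 0;

    assert(t != NULL);
    for (unsigned i = 0; i < attempts; i++)
        if (model_vm86(t))
            progressed++;
    return progressed;
}
```

In this model a one-shot event costs a single wasted call and then the
retries make progress, whereas a flag that is set on every syscall
means no retry ever reaches v8086 mode.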

>
> And yes, if you enable system call auditing, and you actually audit
> the vm86 mode system call, that probably causes an exit condition,
> which means that you can't actually run vm86 mode and make progress if
> you audit that system call. Big f*cking deal. People who enable system
> call auditing break many more important things (eg basic performance)
> that that isn't even an argument. Do you really think that people who
> wanted to run DOS games at hardware speeds wanted to _audit_ those
> games? No.

Not at all.

It does, however, mean that Fedora/RHEL users (who use auditing by
default in most cases, sigh) have a decent chance of having had a
non-working vm86 syscall for a long time.  This makes me think that
there really aren't many vm86 users out there, since we'd have heard
about the breakage.

Note that audit is very special, though, since it has its own asm
path.  It might actually work, but I haven't tested it.

In any event, we're quibbling about the wording of the kconfig text
here.  Both Brian and I have patches that fix the ptrace problem, so
it's likely to be a nonissue in 4.3 regardless.

--Andy