From: Andy Lutomirski
Date: Wed, 8 Jul 2015 12:14:56 -0700
Subject: Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
To: Brian Gerst
Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, "the arch/x86 maintainers", Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov

On Wed, Jul 8, 2015 at 12:05 PM, Brian Gerst wrote:
> On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski wrote:
>> On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds wrote:
>>> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven wrote:
>>>>
>>>> if this patch would not be acceptable, at minimum we need some sort of "off
>>>> by default
>>>> unless the sysadmin flips a sysfs thing", which is really just a huge hack.
>>>
>>> The only thing that matters is whether people use this or not.
>>>
>>
>> I think that the world contains precisely two programs that use the
>> vm86 syscalls. One is dosemu, and one is a test case I wrote. (There
>> are probably some exploits written by other people that I don't know
>> about. Certainly Spender has been patching vm86 for long enough that
>> he must have an exploit or two up his sleeve.)
>>
>> As far as I can tell (and I'll try to test this better for real later
>> this week), dosemu already knows how to emulate real mode if vm86 is
>> unavailable.
>> So it's unclear that turning off the vm86 syscalls
>> actually breaks anything whatsoever.
>>
>> On the other hand, sys_vm86 fails if the syscall slow path is in use.
>> That means that quite a few Fedora versions (auditing), anything with
>> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking
>> is probably actually *improved* by turning off the vm86 syscalls even
>> for dosemu users.
>>
>> And apparently Ubuntu has had CONFIG_VM86 disabled forever.
>>
>> IOW, vm86 really is broken.
>>
>>> If people use vm86 mode, we can't just disable it. It's that simple.
>>> "It's poorly maintained" isn't an argument for removal. Only "nobody
>>> cares" works as an argument for that.
>>>
>>> My suspicion is that people still do use vm86 mode, but who knows..
>>> Quite frankly, rather than disable it, I'd much rather see people who
>>> modify low-level x86 code (yes, that means you, Luto) *test* it. If
>>> you aren't willing to test the modifications you make, I don't think
>>> those modifications should be merged, regardless of how nice a cleanup
>>> they are.
>>
>> I tried to test it. As far as I know, my changes in -tip have no
>> effect on vm86, and the changes I'm planning on sending this week will
>> make it work better. I still think that Linux users should have it
>> configured out or deleted altogether. Especially people who care at
>> all about security.
>>
>> It's easy to try the easy case (run from tools/testing/selftests/x86)
>> -- this is v4.2-rc1, but most recent versions should be identical:
>>
>> $ ./entry_from_vm86_32
>> [RUN] #BR from vm86 mode
>> [OK] Exited vm86 mode due to #BR
>> [RUN] SYSENTER from vm86 mode
>> [OK] Exited vm86 mode due to unhandled GP fault
>>
>> $ strace -e vm86 ./entry_from_vm86_32
>> [RUN] #BR from vm86 mode
>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
>> [OK] Exited vm86 mode due to type 0, arg 0
>> [RUN] SYSENTER from vm86 mode
>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
>> [OK] Exited vm86 mode due to type 0, arg 0
>>
>> It only says "[OK]" because my test case isn't careful enough. That's
>> a failure. I suspect it was a much worse failure a couple versions
>> ago before my ENOSYS-reworking patch went in.
>>
>> Replace "-e vm86" with "-e write" and be puzzled. The failure mode is
>> really pretty bad.
>>
>> This only tests easy stuff. The integration between vm86 and fault
>> handling is truly awful and I don't even know how to approach testing
>> it. I'd probably have to run twenty or thirty old real-mode games to
>> even exercise those code paths.
>>
>> I'll try to confirm later this week that dosemu can really handle real
>> mode without sys_vm86.
>
> None of these issues are unfixable. As I said before, many of them
> can be resolved if vm86 is changed to use the normal syscall/exception
> exit paths. Give me a few days to finish off that patch set.
>

I look forward to it.

However: I imagine that, if you do this, you may need to be quite
careful about an x86_32-ism. Currently, if you have a pt_regs pointer
for the current entry and user_mode(regs) returns true, then
regs == current_pt_regs(). If you let user mode run with EFLAGS.VM set
with the normal tss.sp0, then this will no longer be true, as the
extra-long entry-from-v8086 frame will shift pt_regs by a few bytes.
I don't know whether this matters, but I can imagine it causing
do_signal to explode.

*shudder*

Anyway, I'll send out my 32-bit cleanups for review soon. If it
conflicts with your changes, it'll be easy to fix up.

--Andy