Date: Thu, 23 Apr 2015 12:06:43 -0400
Subject: Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary
From: Brian Gerst
To: Linus Torvalds
Cc: Denys Vlasenko, Ingo Molnar, Steven Rostedt, Borislav Petkov,
    "H. Peter Anvin", Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
    Alexei Starovoitov, Will Drewry, Kees Cook,
    "the arch/x86 maintainers", Linux Kernel Mailing List
References: <1429792491-5978-1-git-send-email-dvlasenk@redhat.com>

On Thu, Apr 23, 2015 at 11:22 AM, Linus Torvalds wrote:
> On Thu, Apr 23, 2015 at 5:34 AM, Denys Vlasenko wrote:
>>
>> It was observed to cause Wine crashes. Conjectured sequence of events
>> causing it is as follows:
>>
>> 1. Wine process enters kernel via syscall insn.
>> 2. Context switch to any other task.
>> 3. Interrupt or exception happens, CPU loads %ss with 0.
>>    (This happens according to both Intel and AMD docs.)
>>    %ss cached descriptor is set to "invalid" state.
>> 4. Context switch back to Wine.
>> 5. sysret to 32-bit userspace. %ss selector has correct value but its
>>    cached descriptor is still invalid.
>
> I really don't like the patch, as it just feels very hacky to me.
>
> It is a bit scary to me that apparently we leak %ss values between
> processes, so that while we run in the kernel we can randomly have the
> ss descriptor either be 0 or __KERNEL_DS. That sounds like an
> information leak to me, even in 64-bit mode.
> The value of %ss may not
> *matter* in 64-bit mode, but leaking that difference between processes
> sounds nasty. I can't offhand think of any way to actually read the
> present bit in the cached descriptor (I was thinking something like
> the "LSL" instruction, but that takes a new segment selector, not the
> segment itself), but it just smells odd to me.

So you are saying we should save and conditionally restore the
kernel's %ss during context switch? That shouldn't be too bad. Half
of the time you would be loading the null selector, which is fast (no
GDT access, no validation).

> Also, why does this only happen with Wine? In regular 32-bit mode the
> segment valid bit in the cached descriptor should also matter. So how
> come this doesn't trigger for any 32-bit user land on a 64-bit kernel?

Probably just lack of exposure so far. It only affects AMD CPUs, and
it was just merged. Wine is probably the most common 32-bit app
people will run on a 64-bit kernel. I'll test something other than
Wine that is 32-bit when I get home tonight.

--
Brian Gerst
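
To make the conjectured sequence and the "save and conditionally restore
%ss at context switch" idea concrete, here is a rough user-space toy model
in C. It is only a sketch, not kernel code: every name (toy_cpu, toy_task,
toy_switch_to, ...) and the selector values are hypothetical, and the
sysret behaviour is simply the conjecture from the quoted patch description
(selector rewritten, cached descriptor left alone).

```c
#include <stdbool.h>

/* Toy model only: selector values and all names here are hypothetical. */
enum { TOY_NULL_SEL = 0x00, TOY_KERNEL_DS = 0x18, TOY_USER32_DS = 0x2b };

struct toy_cpu {
	unsigned short ss_sel;    /* visible %ss selector */
	bool ss_desc_valid;       /* state of the hidden cached descriptor */
};

struct toy_task {
	unsigned short kernel_ss; /* %ss this task last had in kernel mode */
};

/* Explicit %ss load: the null selector leaves the cached descriptor invalid. */
static void toy_load_ss(struct toy_cpu *cpu, unsigned short sel)
{
	cpu->ss_sel = sel;
	cpu->ss_desc_valid = (sel != TOY_NULL_SEL);
}

/* Step 1: entry via the syscall insn gives the kernel a valid %ss. */
static void toy_syscall_entry(struct toy_cpu *cpu)
{
	toy_load_ss(cpu, TOY_KERNEL_DS);
}

/* Step 3: interrupt or exception entry loads %ss with 0. */
static void toy_irq_entry(struct toy_cpu *cpu)
{
	toy_load_ss(cpu, TOY_NULL_SEL);
}

/* Step 5 (conjectured): selector is set, cached descriptor is not touched. */
static void toy_sysretl(struct toy_cpu *cpu)
{
	cpu->ss_sel = TOY_USER32_DS;
}

/*
 * Steps 2 and 4: context switch. With restore_ss set, save the outgoing
 * task's kernel %ss and reload the incoming task's one only if it differs.
 */
static void toy_switch_to(struct toy_cpu *cpu, struct toy_task *prev,
			  struct toy_task *next, bool restore_ss)
{
	prev->kernel_ss = cpu->ss_sel;
	if (restore_ss && next->kernel_ss != cpu->ss_sel)
		toy_load_ss(cpu, next->kernel_ss);
}

/* Runs steps 1-5; returns whether %ss is usable once back in userspace. */
bool toy_user_ss_usable_after_sequence(bool restore_ss)
{
	struct toy_cpu cpu = { TOY_USER32_DS, true };
	struct toy_task wine = { TOY_KERNEL_DS }, other = { TOY_NULL_SEL };

	toy_syscall_entry(&cpu);                        /* 1 */
	toy_switch_to(&cpu, &wine, &other, restore_ss); /* 2 */
	toy_irq_entry(&cpu);                            /* 3 */
	toy_switch_to(&cpu, &other, &wine, restore_ss); /* 4 */
	toy_sysretl(&cpu);                              /* 5 */
	return cpu.ss_desc_valid;
}
```

In this model, toy_user_ss_usable_after_sequence(false) ends with the
cached descriptor still invalid, which is the crash scenario, while
passing true reloads the saved selector in step 4 and leaves it valid.
The reload in toy_switch_to() is skipped whenever both tasks already
agree on %ss, and about half of the remaining loads are of the null
selector.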