From: Andy Lutomirski
Date: Mon, 7 Aug 2017 12:07:45 -0700
Subject: Re: FSGSBASE ABI considerations
To: Linus Torvalds
Cc: Andy Lutomirski, Stas Sergeev, "Bae, Chang Seok", X86 ML, "linux-kernel@vger.kernel.org", Borislav Petkov, Brian Gerst, Bart Oldeman

On Mon, Aug 7, 2017 at 10:35 AM, Linus Torvalds wrote:
> On Mon, Aug 7, 2017 at 9:20 AM, Andy Lutomirski wrote:
>>
>> Windows does something sort of like this (I think), but I don't like
>> this solution.  I fully expect that someone will write a program that
>> does:
>>
>> old = rdgsbase();
>> wrgsbase(new);
>> call_very_fast_function();
>> wrgsbase(old);
>>
>> This will work if GS == 0, which is fine.  The problem is that it will
>> *also* work if GS != 0 with very high probability, especially if this
>> code sequence is right after some operation that sleeps.  And then
>> we'll get random crashes with very low probability, depending on where
>> the scheduler hits.
>
> It will work reliably if you just make the scheduler save/restore the
> base rather than the selector.
>
> I really think you need to walk away from the "selector is meaningful"
> model. Yes, yes, it's the legacy model, but it's the *insane* model.
>
> So screw the selector. It doesn't matter. We'll need to save/restore
> the value, but that's it. What we *really* save and restore is just
> the base pointer.
>
> Why do you care so much about the selector? If people *don't* use the
> fsgsbase, then the selector and the base of the segment will always
> match anyway (modulo the system calls that actually change the
> gdt/ldt, and we can just say that *then* selectors matter).
>
> And if people *do* use fsgsbase, then the selector is by definition
> not important.
>
> So just make the scheduler save the base first, and restore it last.
> End of problem. Your user-space code above just works. There is no
> race, it doesn't matter one whit whether GS is 0 or not, there simply
> is no problem.

I agree completely.  The scheduler should do exactly this and, with my
patches applied, it does.

>
> So just what is the problem you're trying to solve?
>

I'm trying to avoid a situation where we implement that policy and the
interaction with modify_ldt() becomes very strange.  Linux has a long
history of having ill-defined semantics on x86_64, and I don't want to
make it worse.

If we *just* change the way the scheduler works, then we end up with
modify_ldt() behaving deterministically on IVB+ and behaving
deterministically on 32-bit kernels, but having that deterministic
behavior be *different*.  This makes me rather unhappy about the whole
situation.

Also, I don't want to break gdb, and even telling whether a change
breaks gdb is an incredible PITA.  When GDB saves and restores a
context, it currently restores the base first and the selector second,
and I have no idea whether gdb expects restoring the selector to
update the base.