Date: Thu, 11 Jan 2018 22:59:54 +0100
From: Willy Tarreau
To: Linus Torvalds
Cc: Andy Lutomirski, Dave Hansen, Peter Zijlstra, LKML, X86 ML,
    Borislav Petkov, Brian Gerst, Ingo Molnar, Thomas Gleixner,
    Josh Poimboeuf, "H. Peter Anvin", Greg Kroah-Hartman, Kees Cook
Subject: Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set
Message-ID: <20180111215954.GC15528@1wt.eu>
References: <20180110082207.GX29822@worktop.programming.kicks-ass.net>
    <20180110091102.GH14066@1wt.eu> <20180111064259.GC14920@1wt.eu>
    <0f08d89e-61e1-20e3-5c59-0b2f7b32bf0c@linux.intel.com>
    <20180111154412.GA15296@1wt.eu> <20180111174025.GB15344@1wt.eu>

On Thu, Jan 11, 2018 at 10:25:29AM -0800, Linus Torvalds wrote:
> Just to clarify: I definitely want the part where it is only
> switchable in single-threaded mode, and I actually do want it
> "inherited" by threads when they do get created.

OK, this is what series v3 currently does, because the TIF_* flags are
copied as-is to threads (unless I missed something). Even re-enabling
is currently refused if mm_users > 1.

> It's just that my mental model for threads is not that they "inherit"
> the PTI flag, it's that they simply share it. But that's more of a
> difference in "views" than in actual behavior.

I see, thanks for explaining this point, I understand your concern
better now.
Well, if we document that the current process' flag is replicated as-is
to any threads so that it is consistent across all threads, and that it
may only be modified on all threads atomically, which currently we can
only achieve by doing it when there's a single thread on an mm, I
suspect it could match your mental model.

> If you do the PTI on/off operation *before* the vfork(), nothing is
> different. The vfork() by definition ends up having the same PTI
> state, since it has the same VM. But that's actually 100% expected,
> and it matches the fork() behavior too: the PTI state should be just
> copied by a fork(), since fork isn't any protection domain.
>
> And *after* you've done a vfork(), you can't do a PTI on/off, because
> now the VM has multiple users, which is 100% equivalent to the thread
> case that we already all agreed should be disallowed. So no, you can't
> do "vfork -> setnopti -> exec', but that is in no way different from
> any of the *other* things you cannot do in between vfork and execve.

That's where I like the principle of the NEXT ctl, which can be
per-thread. The thread about to do an execve() cannot change its own
flag because it's entangled with the other ones sharing the same mm,
but it can change its own NEXT flag so that execve() starts with the
specified mode (typically PTI on, in the example of log rotation for a
server).

Quite honestly, for the NOW vs NEXT question, I find NOW convenient to
avoid a wrapper, but a program could also self-exec after setting the
flag (I've already done this to change thread stack sizes on certain
processes a long time ago, and that's no real hassle). And given that
NOW cannot really re-adjust the PGD that was already assigned, maybe in
the end we should stick to this NEXT thing and wait for the next
execve() to apply the operation.

Willy
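
For readers following the thread, the NOW vs NEXT semantics discussed
above can be sketched as a small userspace model in C. This is only an
illustration, not code from the patch series: every name below
(model_mm, model_task, model_set_now, model_set_next, model_share_mm,
model_execve) is hypothetical, and so is the -EBUSY error code. The
model also assumes that the NEXT request is copied along with the NOW
flag when a thread or vfork() child is created, and that execve()
simply applies whatever NEXT holds; the thread does not settle either
point.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_mm {
	int users;		/* stands in for mm_users */
};

struct model_task {
	struct model_mm *mm;
	bool nopti_now;		/* stands in for the per-task TIF_* flag */
	bool nopti_next;	/* per-thread request, applied at execve() */
};

/* NOW: refused as soon as the mm has more than one user. */
static int model_set_now(struct model_task *t, bool nopti)
{
	assert(t && t->mm);
	if (t->mm->users > 1)
		return -EBUSY;	/* assumed error code, for illustration */
	t->nopti_now = nopti;
	return 0;
}

/* NEXT: per-thread, so it never depends on who else shares the mm. */
static int model_set_next(struct model_task *t, bool nopti)
{
	assert(t);
	t->nopti_next = nopti;
	return 0;
}

/* Thread creation or vfork(): same mm, flags copied as-is. */
static struct model_task model_share_mm(struct model_task *parent)
{
	struct model_task child;

	assert(parent && parent->mm);
	child = *parent;	/* copying NEXT too is an assumption */
	parent->mm->users++;
	return child;
}

/* execve(): the task drops the shared mm and NEXT becomes NOW. */
static void model_execve(struct model_task *t, struct model_mm *fresh)
{
	assert(t && t->mm && fresh);
	t->mm->users--;
	fresh->users = 1;
	t->mm = fresh;
	t->nopti_now = t->nopti_next;
}
```

In this model a vfork() child cannot call model_set_now() because the
mm has two users, but it can call model_set_next() and then
model_execve(), which is the "vfork -> set NEXT -> exec" path described
above. Once the child has left the shared mm, the parent is back to a
single user and may use model_set_now() again.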