From: Linus Torvalds
Date: Thu, 11 Jan 2018 10:25:29 -0800
Subject: Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set
To: Willy Tarreau
Cc: Andy Lutomirski, Dave Hansen, Peter Zijlstra, LKML, X86 ML,
    Borislav Petkov, Brian Gerst, Ingo Molnar, Thomas Gleixner,
    Josh Poimboeuf, "H. Peter Anvin", Greg Kroah-Hartman, Kees Cook
In-Reply-To: <20180111174025.GB15344@1wt.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 11, 2018 at 9:40 AM, Willy Tarreau wrote:
>> As for per-mm vs per-thread, let's make it only switchable in
>> single-threaded processes for now and inherited when threads are
>> created.
>
> That's exactly what it does for now, but Linus doesn't like it at all.

Just to clarify: I definitely want the part where it is only
switchable in single-threaded mode, and I actually do want it
"inherited" by threads when they do get created.

It's just that my mental model for threads is not that they "inherit"
the PTI flag, it's that they simply share it. But that's more of a
difference in "views" than in actual behavior.

>> (Another reason for per-thread instead of per-mm: as a per-mm thing,
>> you can't set it up for your descendents using vfork(); prctl();
>> exec(), and the latter is how your average language runtime that
>> spawns subprocesses would want to do it.
>
> That's indeed the benefit it provides for now since I actually had
> to *add* code to execve() to disable it then.

So the "vfork()" case is indeed interesting, but I don't think it's
all that relevant.

Why?

If you do the PTI on/off operation *before* the vfork(), nothing is
different. The vfork() by definition ends up having the same PTI
state, since it has the same VM. But that's actually 100% expected,
and it matches the fork() behavior too: the PTI state should be just
copied by a fork(), since fork isn't any protection domain.

And *after* you've done a vfork(), you can't do a PTI on/off, because
now the VM has multiple users, which is 100% equivalent to the thread
case that we already all agreed should be disallowed.

So no, you can't do "vfork -> setnopti -> exec", but that is in no way
different from any of the *other* things you cannot do in between
vfork and execve.

And in a wrapper that sets nopti, you wouldn't want to use vfork
anyway. You wouldn't even want to use *fork*. You'd just do "set
nopti" and then execve(). That's the whole point of the wrapper.

So vfork() is worth _mentioning_, but I don't think there is any
actual issue there. Quite the reverse - it acts exactly as expected.

The main thing that should be special for PTI on/off is "execve()".
That's the one that may force PTI on again, because of a security
boundary.

The other case may be the CLONE_NEW* operations. I *think* they are
noops as far as PTI settings would be, but I think people should think
about them.

             Linus
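
The rules discussed in this message (toggle only while the address space
has a single user, shared by threads and vfork() children, copied by
fork(), possibly reset by execve()) can be sketched as a small userspace
model. This is not the kernel patch under review. Every name below
(toy_mm, toy_task, toy_set_nopti and so on) is hypothetical, the -EBUSY
error code is an arbitrary choice, and the keep_allowed parameter stands
in for whatever security check execve() would apply, which the thread
does not specify.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy userspace model of the proposed semantics. All names are
 * hypothetical; none of this is real kernel API. */
struct toy_mm {
    int  users;   /* number of tasks sharing this address space */
    bool nopti;   /* true when PTI is switched off for this mm */
};

struct toy_task {
    struct toy_mm *mm;
};

/* The flag may only be changed while the mm has a single user.
 * The error code is an assumption made for this sketch. */
int toy_set_nopti(struct toy_task *t, bool nopti)
{
    if (t->mm->users > 1)
        return -EBUSY;
    t->mm->nopti = nopti;
    return 0;
}

/* Thread creation and vfork(): the child uses the parent's mm, so
 * the flag is shared rather than copied. */
void toy_share_mm(const struct toy_task *parent, struct toy_task *child)
{
    child->mm = parent->mm;
    child->mm->users++;
}

/* fork(): the child gets its own mm and the flag is simply copied,
 * since fork is not a protection boundary. */
void toy_fork(const struct toy_task *parent, struct toy_task *child,
              struct toy_mm *new_mm)
{
    new_mm->users = 1;
    new_mm->nopti = parent->mm->nopti;
    child->mm = new_mm;
}

/* execve(): the task moves to a fresh mm. The flag survives only if
 * the (unspecified, hypothetical) security check allows it;
 * otherwise PTI is forced back on. */
void toy_execve(struct toy_task *t, struct toy_mm *new_mm, bool keep_allowed)
{
    bool old = t->mm->nopti;

    t->mm->users--;
    new_mm->users = 1;
    new_mm->nopti = keep_allowed && old;
    t->mm = new_mm;
}
```

In this model the wrapper case is just toy_set_nopti() followed by
toy_execve(), with no fork or vfork in between. The "vfork -> setnopti
-> exec" sequence fails at the second step because toy_share_mm() has
already raised the user count to two.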