Date: Wed, 10 Jan 2018 10:11:02 +0100
From: Willy Tarreau
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Andy Lutomirski,
    Borislav Petkov, Brian Gerst, Dave Hansen, Ingo Molnar,
    Linus Torvalds, Thomas Gleixner, Josh Poimboeuf, "H. Peter Anvin",
    Greg Kroah-Hartman, Kees Cook
Subject: Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set
Message-ID: <20180110091102.GH14066@1wt.eu>

On Wed, Jan 10, 2018 at 09:22:07AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 09, 2018 at 01:56:20PM +0100, Willy Tarreau wrote:
> > - use pti_disable instead of task flag
> > ---
> >  arch/x86/entry/calling.h | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> > index 2c0d3b5..5361a10 100644
> > --- a/arch/x86/entry/calling.h
> > +++ b/arch/x86/entry/calling.h
> > @@ -229,6 +229,11 @@
> >
> >  .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
> >      ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> > +
> > +    /* The "pti_disable" mm attribute is mirrored into this per-cpu var */
> > +    cmpb $0, PER_CPU_VAR(pti_disable)
> > +    jne .Lend_\@
> > +
> >      mov %cr3, \scratch_reg
>
> So could you switch back to a task flag for this? That word is already
> cache-hot on the exit path while your new variable is not.

That's a good point. There have already been some requests for a
per-thread setting. What I can propose then is to partially revert the
changes so that we have this:

  - arch_prctl() adjusts the task flag and no longer a per-mm variable
    (Linus, are you OK with this?)

  - arch_prctl() only performs the action if mm->mm_users == 1, so that
    we don't change the setting after threads have been created; this
    way the task flag is replicated to all future threads;

  - later we may decide to permit re-enabling PTI per thread if it was
    disabled.

If we agree on this, I'd like to propose having two flags:

  - TIF_DISABLE_PTI_NOW  : disable PTI for the current task, reset by
    execve()
  - TIF_DISABLE_PTI_NEXT : disable PTI after execve(), reset by execve()

execve() would then simply do:

    TIF_DISABLE_PTI_NOW  = TIF_DISABLE_PTI_NEXT;
    TIF_DISABLE_PTI_NEXT = 0;

The former would be used by applications using their own configuration.
The latter would be used by wrappers. This way we seem to cover the
various use cases.

And we make this depend on a sysctl which is disabled by default and
which allows the admin to globally and permanently disable the feature.

Any objection?

Regards,
Willy
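
For illustration only, the flag handoff proposed above can be modelled in plain user-space C. This is a sketch, not kernel code: `struct task_model`, `model_set_flag()` and `model_execve()` are hypothetical names, the two booleans merely stand in for the proposed TIF_DISABLE_PTI_NOW and TIF_DISABLE_PTI_NEXT thread flags, and the `mm_users` argument stands in for the mm->mm_users == 1 check. The sysctl gate is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a task's two proposed flags. */
struct task_model {
    bool disable_pti_now;   /* stands in for TIF_DISABLE_PTI_NOW  */
    bool disable_pti_next;  /* stands in for TIF_DISABLE_PTI_NEXT */
};

/*
 * Models the arch_prctl() side: refuse the change once the address
 * space is shared (mm_users != 1), otherwise set the requested flag.
 * Returns 0 on success, -1 on refusal (the real error code is not
 * specified in the proposal).
 */
static int model_set_flag(struct task_model *t, int mm_users,
                          bool next, bool value)
{
    assert(t != NULL);
    if (mm_users != 1)
        return -1;
    if (next)
        t->disable_pti_next = value;
    else
        t->disable_pti_now = value;
    return 0;
}

/* Models the execve() side: NEXT is promoted to NOW, then cleared. */
static void model_execve(struct task_model *t)
{
    assert(t != NULL);
    t->disable_pti_now = t->disable_pti_next;
    t->disable_pti_next = false;
}
```

With this model, a wrapper that sets only the NEXT flag leaves its own PTI state untouched; the program it executes starts with NOW set, and a second execve() clears both flags again.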