Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965766AbeAKTiX (ORCPT + 1 other);
        Thu, 11 Jan 2018 14:38:23 -0500
Received: from mail-pg0-f66.google.com ([74.125.83.66]:36920 "EHLO
        mail-pg0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S965724AbeAKTiU (ORCPT);
        Thu, 11 Jan 2018 14:38:20 -0500
X-Google-Smtp-Source: ACJfBouggbGX0j2h1gTDc8Fx8j1QIw0zMNtS/OSpGrU4yv0MLPGMv8B8hvSsY/KAvGEKkI2a7C+n+A==
Date: Thu, 11 Jan 2018 11:38:16 -0800
From: Alexei Starovoitov
To: Dave Hansen
Cc: Linus Torvalds, Josh Poimboeuf, Andy Lutomirski, Willy Tarreau,
        Peter Zijlstra, LKML, X86 ML, Borislav Petkov, Brian Gerst,
        Ingo Molnar, Thomas Gleixner, "H. Peter Anvin",
        Greg Kroah-Hartman, Kees Cook
Subject: Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when
        pti_disable is set
Message-ID: <20180111193814.emukyfnq5zhq6bww@ast-mbp.dhcp.thefacebook.com>
References: <20180111064259.GC14920@1wt.eu>
        <0f08d89e-61e1-20e3-5c59-0b2f7b32bf0c@linux.intel.com>
        <20180111154412.GA15296@1wt.eu>
        <20180111182147.masunghp5km6igjq@ast-mbp.dhcp.thefacebook.com>
        <20180111183207.dah7imbuvuhvrrk6@treble>
        <4e32af93-f632-ae17-eed4-c7023c1b9cc5@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4e32af93-f632-ae17-eed4-c7023c1b9cc5@linux.intel.com>
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 11, 2018 at 10:57:51AM -0800, Dave Hansen wrote:
> On 01/11/2018 10:51 AM, Linus Torvalds wrote:
> > On Thu, Jan 11, 2018 at 10:38 AM, Dave Hansen wrote:
> >> On 01/11/2018 10:32 AM, Josh Poimboeuf wrote:
> >>>> hmm. Exposing cr3 to user space will make it trivial for user process
> >>>> to know whether kpti is active. Not sure how exploitable such
> >>>> information leak.
> >>> It's already trivial to detect PTI from user space.
> >> Do tell.
> >
> > One way to do it is to just run the attack, and see if you get something.
>
> Not judging how trivial (or not) the attack is, I was hoping for
> something that was *not* the attack itself. :)

The attack can also be inconclusive.

I'm playing with an idea to conditionally switch to the user cr3 when
returning from a syscall: a per-task or per-cpu counter, or a random
choice, tells the syscall return path to keep the kernel cr3, and the
next interrupt brings the task back to the user cr3.
Public exploits use syscalls to keep kernel memory in L1, and with such
a hack they see only partial kernel reads and cannot really tell which
bytes in the mix are kernel bytes.

If there is a fast way for an attacker to know that kpti is off for
this task after a syscall, then such conditional kpti on/off is
completely pointless.

It's not 100% secure, obviously. It's a sort of best effort to bring
back most of the syscall performance without having to think about
which tasks should or should not be allowed to toggle the kpti flag.
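
To make the counter variant concrete, here is a rough user-space sketch
of the decision logic only. It is not kernel code and not part of any
posted patch: struct task_sketch, keep_period and both function names
are hypothetical, and the "switch on every Nth return" policy is just
one assumed way to drive the choice (a random draw would slot into the
same place).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state; none of these names exist in the kernel. */
struct task_sketch {
	uint32_t syscall_count;	/* syscall returns seen so far */
	uint32_t keep_period;	/* assumed policy knob; 0 means always switch */
	bool on_kernel_cr3;	/* page tables the task is currently running on */
};

/*
 * Syscall return path: decide whether to skip the switch to the user cr3.
 * Assumed policy: switch on every keep_period-th return and keep the
 * kernel cr3 on all the others.
 */
static bool syscall_return_keeps_kernel_cr3(struct task_sketch *t)
{
	bool keep;

	t->syscall_count++;
	keep = t->keep_period != 0 &&
	       (t->syscall_count % t->keep_period) != 0;
	t->on_kernel_cr3 = keep;
	return keep;
}

/* Interrupt return path: always bring the task back to the user cr3. */
static void interrupt_return(struct task_sketch *t)
{
	t->on_kernel_cr3 = false;
}
```

With keep_period set to 4, three out of four syscall returns stay on
the kernel cr3 until the next interrupt. Whether that mix is enough to
make the reads inconclusive for an attacker is exactly the open
question above.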