Date: Fri, 8 Dec 2017 08:34:54 +0100
From: Ingo Molnar
To: Andy Lutomirski
Cc: Thomas Gleixner, Andy Lutomirski, Borislav Petkov, X86 ML,
	"linux-kernel@vger.kernel.org", Brian Gerst, David Laight,
	Kees Cook, Peter Zijlstra
Subject: Re: [PATCH] LDT improvements
Message-ID: <20171208073454.dicyefwncsihq7sm@gmail.com>
References: <48fe5cf1382d6a95c7b1837415882edcc81a9781.1512631324.git.luto@kernel.org>
	<20171207124347.p7kdj7q4qqs3ivri@pd.tnic>
	<665F1CA8-D012-4465-B5F7-E81E088847DB@amacapital.net>
In-Reply-To: <665F1CA8-D012-4465-B5F7-E81E088847DB@amacapital.net>


* Andy Lutomirski wrote:

> > On Dec 7, 2017, at 9:23 AM, Thomas Gleixner wrote:
> >
> >> On Thu, 7 Dec 2017, Andy Lutomirski wrote:
> >>
> >>> On Thu, Dec 7, 2017 at 4:43 AM, Borislav Petkov wrote:
> >>>> On Wed, Dec 06, 2017 at 11:22:21PM -0800, Andy Lutomirski wrote:
> >>>> I think I like this approach. I also think it might be nice to move the
> >>>> whole cpu_entry_area into this new pgd range so that we can stop mucking
> >>>> around with the fixmap.
> >>>
> >>> Yeah, and also, I don't like the idea of sacrificing a whole PGD
> >>> only for the LDT crap which is optional, even. Frankly - and this
> >>> is just me - I'd make CONFIG_KERNEL_PAGE_TABLE_ISOLATION xor
> >>> CONFIG_MODIFY_LDT_SYSCALL and don't give a rat's *ss about the LDT.
> >>
> >> The PGD sacrifice doesn't bother me. Putting a writable LDT map at a
> >> constant address does bother me. We could probably get away with RO
> >> if we trapped and handled the nasty faults, but that could be very
> >> problematic.
> >
> > Where is the problem? You can map it RO into user space with the USER bit
> > cleared. The kernel knows how to access the real stuff.
>
> Blows up when the CPU tries to set the accessed bit.

BTW., could we force the accessed bit to be always set, without breaking the ABI?

> > The approach I've taken is to create a VMA and map it into user space with
> > the USER bit cleared. A little bit more effort code wise, but that avoids
> > all the page table muck and keeps it straight attached to the process.
> >
> > Will post once in a bit.
>
> I don't love mucking with user address space. I'm also quite nervous about
> putting it in our near anything that could pass an access_ok check, since we're
> totally screwed if the bad guys can figure out how to write to it.

Hm, robustness of the LDT address wrt. access_ok() is a valid concern.

Can we have vmas with high addresses, in the vmalloc space for example?
IIRC the GPU code has precedents in that area.

Since this is x86-64, limitation of the vmalloc() space is not an issue.

I like Thomas's solution:

 - have the LDT in a regular mmap space vma (hence per process ASLR randomized),
   but with the system bit set.

 - That would be an advantage even for non-PTI kernels, because mmap() is probably
   more randomized than kmalloc().

 - It would also be a cleaner approach all around, and would avoid the fixmap
   complications and the scheduler muckery.

Thanks,

	Ingo
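
For reference, a minimal user-space sketch of what "force the accessed bit to be always set" could look like at the descriptor level. This is not kernel code and not a patch from this thread: pack_descriptor() and force_accessed() are hypothetical helper names invented for illustration. It only assumes the architectural layout of an 8-byte x86 code/data segment descriptor, in which the access byte occupies bits 40-47 and the accessed bit is bit 40. The idea is that a descriptor installed with the bit already set gives the CPU no reason to write to the table when the segment is loaded, which is what would make a read-only mapping workable.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 40 of the descriptor: "accessed" bit of a code/data segment. */
#define DESC_ACCESSED (UINT64_C(1) << 40)
/* Bit 44 of the descriptor: S flag, 1 = code/data, 0 = system segment. */
#define DESC_S        (UINT64_C(1) << 44)

/* Hypothetical helper: pack base, 20-bit limit, access byte and 4 flag
 * bits into the 8-byte legacy segment descriptor layout. */
static uint64_t pack_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFF);                      /* limit is 20 bits wide */

    d |= (uint64_t)(limit & 0xFFFF);               /* limit 15:0   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base 23:0    */
    d |= (uint64_t)access << 40;                   /* access byte  */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;    /* limit 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;            /* AVL, L, D/B, G */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;    /* base 31:24   */
    return d;
}

/* Hypothetical helper: pre-set the accessed bit on code/data descriptors.
 * System descriptors (S = 0) use the type field differently and are left
 * untouched. */
static uint64_t force_accessed(uint64_t desc)
{
    if (desc & DESC_S)
        desc |= DESC_ACCESSED;
    return desc;
}
```

For a flat user data segment (base 0, limit 0xFFFFF, access byte 0xF2, flags 0xC), pack_descriptor() gives 0x00CFF2000000FFFF and force_accessed() turns that into 0x00CFF3000000FFFF. Whether user space could observe the difference (for example through LAR) is exactly the ABI question raised above, and the sketch does not answer it.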