In-Reply-To: <20151106065549.GA2031@gmail.com>
References: <20151103111649.GA3477@gmail.com> <20151104233907.GA25925@codemonkey.org.uk> <20151105021710.GA22941@codemonkey.org.uk> <20151106065549.GA2031@gmail.com>
From: Andy Lutomirski
Date: Thu, 5 Nov 2015 23:05:35 -0800
Subject: Re: [GIT PULL] x86/mm changes for v4.4
To: Ingo Molnar
Cc: Linus Torvalds, Stephen Smalley, Matt Fleming, Dave Jones, Linux Kernel Mailing List, Thomas Gleixner, "H. Peter Anvin", Borislav Petkov, Andrew Morton, Andy Lutomirski, Denys Vlasenko, Kees Cook

On Thu, Nov 5, 2015 at 10:55 PM, Ingo Molnar wrote:
>
> * Linus Torvalds wrote:
>
>> On Wed, Nov 4, 2015 at 6:17 PM, Dave Jones wrote:
>> > On Wed, Nov 04, 2015 at 05:31:59PM -0800, Linus Torvalds wrote:
>> > >
>> > > I don't have that later debug output at all. Presumably some config difference.
>> >
>> > CONFIG_X86_PTDUMP_CORE iirc.
>>
>> No, I have that. I suspect CONFIG_EFI_PGT_DUMP instead.
>>
>> Anyway, as it stands now, I think the CONFIG_DEBUG_WX option should
>> not default to 'y' unless it is made more useful if it actually
>> triggers. Ingo?
>
> Yeah, agreed absolutely.
>
> So this is a bit sad because RWX pages are a real problem in practice, especially
> since the EFI addresses are well predictable, but generating a warning without
> being able to fix it quickly is counterproductive as well, as it only annoys
> people and makes them turn off the option.
> (Which we could do as well to begin
> with, without the annoyance factor...)
>
> So the plan would be:
>
>  1) Make it default-n.
>
>  2) We should try to further improve the messages to make it easier to determine
>     what's wrong. We _do_ try to output symbolic information in the warning, to
>     make it easier to find buggy mappings, but these are not standard kernel
>     mappings. So I think we need an e820 mappings based semi-symbolic printout of
>     bad addresses - maybe even correlate it with the MMIO resource tree.
>
>  3) We should fix the EFI permission problem without relying on the firmware: it
>     appears we could just mark everything R-X optimistically, and if a write fault
>     happens (it's pretty rare in fact, only triggers when we write to an EFI
>     variable and so), we can mark the faulting page RW- on the fly, because it
>     appears that writable EFI sections, while not enumerated very well in 'old'
>     firmware, are still supposed to be page granular. (Even 'new' firmware I
>     wouldn't automatically trust to get the enumeration right...)

I think it was Borislav who pointed out that this idea, which might
have been mine, is a bit silly. Why not just skip mapping the EFI
stuff in the init_pgd entirely and only map it in the EFI pgd? We'll
have RWX stuff in the EFI pgd, but so what? If we're exposing
anything that runs with the EFI pgd loaded to untrusted input, I think
we've already lost.

Admittedly, we might need to use a certain amount of care to avoid
interesting conflicts with the vmap mechanism. We might need to vmap
all of the EFI stuff, and possibly even all the top-level entries that
contain EFI stuff (i.e. exactly one of them unless EFI ends up *huge*)
as a blank not-present region to avoid overlaps, but that's not a big
deal.

>
>     If that 'supposed to be' turns out to be 'not true' (not unheard of in
>     firmware land), then plan B would be to mark pages that generate write faults
>     RWX as well, to not break functionality.
>     (This 'mark it RWX' is not something
>     that exploits would have easy access to, and we could also generate a warning
>     [after the EFI call has finished] if it ever triggers.)
>
>     Admittedly this approach might not be without its own complications, but it
>     looks reasonably simple (I don't think we need per EFI call page tables,
>     etc.), and does not assume much about the firmware being able to enumerate its
>     permissions properly. Were we to merge EFI support today I'd have insisted on
>     trying such an approach from day 1 on.

I think we have separate EFI page tables already for other reasons. I
could be wrong -- I've never really understood the EFI mapping layout
very well.

--Andy
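
As a rough illustration of the fault-driven scheme quoted in point 3 above,
and of its "plan B", here is a toy user-space model in Python. It is not
kernel code and does not come from the thread: the class EfiRegionModel, its
methods, and the permission constants are hypothetical names invented for
this sketch. Real page-table handling would also involve TLB flushes,
locking, and the actual EFI memory map, none of which is modelled here.

```python
R, W, X = 1, 2, 4  # permission bits used only by this toy model


class EfiRegionModel:
    """Toy model of per-page permissions for an EFI runtime region."""

    def __init__(self, num_pages, strict=True):
        # strict=True models "flip to RW-"; strict=False models plan B (RWX).
        self.strict = strict
        # Optimistically map every page read+execute, never writable.
        self.perms = [R | X] * num_pages
        # Pages that had to be made RWX; a warning could be emitted for
        # these after the (simulated) EFI call has finished.
        self.warnings = []

    def write(self, page):
        """Simulate a write; handle the 'fault' if the page is not writable."""
        if self.perms[page] & W:
            return  # already writable, no fault taken
        if self.strict:
            self.perms[page] = R | W  # drop X so the page is never W+X
        else:
            self.perms[page] = R | W | X  # plan B: keep functionality
            self.warnings.append(page)

    def wx_pages(self):
        """Return indices of pages that are both writable and executable."""
        return [i for i, p in enumerate(self.perms) if p & W and p & X]


region = EfiRegionModel(num_pages=4)
region.write(2)           # first write to page 2 takes the simulated fault
print(region.wx_pages())  # pages a W+X check would flag in this model
```

In the strict variant no page ever ends up writable and executable at the
same time; in the plan B variant the faulting pages do, which is why the
quoted text suggests warning about them afterwards.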