Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752313AbdGIO15 (ORCPT ); Sun, 9 Jul 2017 10:27:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:57666 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750999AbdGIO14 (ORCPT ); Sun, 9 Jul 2017 10:27:56 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com D588F3D940 Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=bhe@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com D588F3D940 Date: Sun, 9 Jul 2017 22:27:51 +0800 From: Baoquan He To: Matt Fleming Cc: Naoya Horiguchi , Kees Cook , LKML , "x86@kernel.org" , Thomas Gleixner , "H. Peter Anvin" , Ingo Molnar , "izumi.taku@jp.fujitsu.com" , Thomas Garnier , "fanc.fnst@cn.fujitsu.com" , Junichi Nomura , Ard Biesheuvel Subject: Re: [PATCH] x86/boot/KASLR: exclude EFI_BOOT_SERVICES_{CODE|DATA} from KASLR's choice Message-ID: <20170709142751.GB13560@x1> References: <20170706083106.GA21796@hori1.linux.bs1.fc.nec.co.jp> <20170706145727.GB3080@codeblueprint.co.uk> <20170707030759.GA2343@x1> <20170707105658.GA9917@codeblueprint.co.uk> <20170709104413.GA13560@x1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170709104413.GA13560@x1> User-Agent: Mutt/1.7.0 (2016-08-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Sun, 09 Jul 2017 14:27:56 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11136 Lines: 146 On 07/09/17 at 06:44pm, Baoquan He wrote: > On 07/07/17 at 11:56am, Matt Fleming wrote: > > On Fri, 07 Jul, at 11:07:59AM, Baoquan He wrote: > > > On 07/06/17 at 03:57pm, Matt Fleming wrote: > > > > On Thu, 06 Jul, at 08:31:07AM, Naoya Horiguchi wrote: > > > > > + for (i = 0; i < nr_desc; i++) { > > > > > + md = (efi_memory_desc_t *)(pmap + (i * e->efi_memdesc_size)); > > > > > + > > > > > + /* > > > > > + * EFI_BOOT_SERVICES_{CODE|DATA} are avoided because boot > > > > > + * services regions could be accessed after ExitBootServices() > > > > > + * due to the workaround for buggy firmware. > > > > > + */ > > > > > + if (!(md->type == EFI_LOADER_CODE || > > > > > + md->type == EFI_LOADER_DATA || > > > > > + md->type == EFI_CONVENTIONAL_MEMORY)) > > > > > + continue; > > > > > > > > Wouldn't it make more sense to *only* use EFI_CONVENTIONAL_MEMORY? > > > > > > > > You can't re-use EFI_LOADER_* regions because the kaslr code is run so > > > > early in boot that you've no idea if data the kernel will need is in > > > > those EFI_LOADER_* regions. > > > > > > > > For example, we pass struct setup_data objects inside of > > > > EFI_LOADER_DATA regions. > > > > > > It doesn't matter because we have tried to avoid those memory setup_data > > > resides in in mem_avoid_overlap(). Here discarding EFI_LOADER_* could > > > discard the whole regions while setup_data could occupy small part of > > > them. > > > > What about the GDT that we allocate in the x86 EFI boot stub as > > EFI_LOADER_DATA? Are there functions to avoid that too? > > This is a very good question. For the current e820 processing, we don't > avoid GDT allocated in x86 EFI STUB because we have no information about > GDT in EFI STUB. You can see setup_e820() in boot/compressed/eboot.c I was wrong here, we can know gdt information by sgdt. Surely GDT is just an example, as Matt mentioned, there could be other stuffs we need avoid, or future change of uefi. > grabs EFI_LOADER_* regions as E820_TYPE_RAM. Now the GDT which is built > in EFI STUB code will live till kernel is ready to build its onw gdt(in > kernel/head_64.S). After that it become useless and can be reclaimed for > reusing. I believe Naoya must have read boot/compressed/eboot.c and > found setup_e820() code, then added EFI_LOADER_* for kaslr usage. > > Previously I thought e820 should take the precedence to be processed > on uefi system because continuous efi memory regions will be merged into > e820 regions. Using e820 can avoid those small efi regions or the left > part of efi regions being discarded directly when they are smaller than > kernel image size. Now considering this GDT in efi stub issue, we have to > try efi regions first on uefi system. Otherwise it may cause problem that > kernel could be decompressed onto the GDT tables of efi stub. > > In fact, I am wondering if we can reuse the gdt table which is built > before entering into long mode in boot/compressed/head_64.S, but not > allocate memory for GDT in efi stub. The thing is 32bit system doesn't > have this gdt table in boot compressing stage since it has one before > entering into protection mode. Just personal thought. > > > > What about any future uses we add? Who's going to remember to patch > > the kaslr code which now duplicates some of the EFI memory map logic? > > > > All of these problems can avoided if you just stick with > > EFI_CONVENTIONAL_MEMORY. > > Below is the dmesg with 'efi=debug' adding on my ovmf uefi kvm guest. > Since uefi could do a lot of thing when loading OS, E.g loading into any > kind of storage driver application, firmware usually reserve a chunk of > memory of hunderes of Mega Bytes. Like this one: > > ****** > [ +0.000000] efi: mem12: [Loader Data | ||WB|WT|WC|UC] range=[0x000000005c8eb000-0x000000007bfbdfff] (502MB) > ****** > > We can't say it's a big deal, but 500MB is also not so trival that we > can easily ignore it without any consideration. Just uefi spec doesn't > define the limitation of Loader memory and Conventional memory and if > Conventional memory has to be present. With my understanding, there won't > be any problem if only Loader memory exists. > > Further more, kaslr is not a precise searching job, no specific address > has to be positioned. Even it's OK that the physical address > randomization failed to find a new address randomly. So it's fine to me > that we don't take EFI_LOADER_* memory into consideration for kaslr. BUT > we need make this clear that why not, and if we can do anyting to make > it better. > > [ +0.000000] efi: SMBIOS=0x7fed5000 ACPI=0x7ff03000 ACPI 2.0=0x7ff03014 MEMATTR=0x7ea50218 > [ +0.000000] efi: mem00: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000000fff] (0MB) > [ +0.000000] efi: mem01: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000001000-0x0000000000001fff] (0MB) > [ +0.000000] efi: mem02: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000002000-0x000000000009ffff] (0MB) > [ +0.000000] efi: mem03: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000100000-0x0000000000805fff] (7MB) > [ +0.000000] efi: mem04: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000806000-0x0000000000806fff] (0MB) > [ +0.000000] efi: mem05: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000807000-0x000000000081ffff] (0MB) > [ +0.000000] efi: mem06: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000820000-0x00000000012fffff] (10MB) > [ +0.000000] efi: mem07: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000001300000-0x0000000001ffffff] (13MB) > [ +0.000000] efi: mem08: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000002000000-0x0000000003614fff] (22MB) > [ +0.000000] efi: mem09: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000003615000-0x000000003d6b3fff] (928MB) > [ +0.000000] efi: mem10: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000003d6b4000-0x000000003fffffff] (41MB) > [ +0.000000] efi: mem11: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000040000000-0x000000005c8eafff] (456MB) > [ +0.000000] efi: mem12: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000005c8eb000-0x000000007bfbdfff] (502MB) > [ +0.000000] efi: mem13: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007bfbe000-0x000000007bfddfff] (0MB) > [ +0.000000] efi: mem14: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007bfde000-0x000000007e6effff] (39MB) > [ +0.000000] efi: mem15: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e6f0000-0x000000007e7e3fff] (0MB) > [ +0.000000] efi: mem16: [Loader Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e7e4000-0x000000007e90afff] (1MB) > [ +0.000000] efi: mem17: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007e90b000-0x000000007e914fff] (0MB) > [ +0.000000] efi: mem18: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e915000-0x000000007ea46fff] (1MB) > [ +0.000000] efi: mem19: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ea47000-0x000000007eb8dfff] (1MB) > [ +0.000000] efi: mem20: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007eb8e000-0x000000007edd0fff] (2MB) > [ +0.000000] efi: mem21: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edd1000-0x000000007edd5fff] (0MB) > [ +0.000000] efi: mem22: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edd6000-0x000000007edddfff] (0MB) > [ +0.000000] efi: mem23: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edde000-0x000000007ede2fff] (0MB) > [ +0.000000] efi: mem24: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ede3000-0x000000007ede8fff] (0MB) > [ +0.000000] efi: mem25: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ede9000-0x000000007ee12fff] (0MB) > [ +0.000000] efi: mem26: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ee13000-0x000000007ee23fff] (0MB) > [ +0.000000] efi: mem27: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ee24000-0x000000007f46dfff] (6MB) > [ +0.000000] efi: mem28: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007f46e000-0x000000007f477fff] (0MB) > [ +0.000000] efi: mem29: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007f478000-0x000000007fd23fff] (8MB) > [ +0.000000] efi: mem30: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fd24000-0x000000007fd24fff] (0MB) > [ +0.000000] efi: mem31: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007fd25000-0x000000007fea3fff] (1MB) > [ +0.000000] efi: mem32: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007fea4000-0x000000007fed3fff] (0MB) > [ +0.000000] efi: mem33: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007fed4000-0x000000007fef7fff] (0MB) > [ +0.000000] efi: mem34: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x000000007fef8000-0x000000007fefbfff] (0MB) > [ +0.000000] efi: mem35: [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fefc000-0x000000007ff03fff] (0MB) > [ +0.000000] efi: mem36: [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff04000-0x000000007ff07fff] (0MB) > [ +0.000000] efi: mem37: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff08000-0x000000007ff6afff] (0MB) > [ +0.000000] efi: mem38: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff6b000-0x000000007ff95fff] (0MB) > [ +0.000000] efi: mem39: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff96000-0x000000007ffa6fff] (0MB) > [ +0.000000] efi: mem40: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ffa7000-0x000000007ffcffff] (0MB) > [ +0.000000] efi: mem41: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ffd0000-0x000000007ffeffff] (0MB) > [ +0.000000] efi: mem42: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fff0000-0x000000007fffffff] (0MB) > [ +0.000000] efi: mem43: [Runtime Data |RUN| | | | | | || | | |UC] range=[0x00000000ffe00000-0x00000000ffffffff] (2MB) > > > > > Honestly, how much memory do we expect to waste if we ignore > > EFI_LOADER_* regions? > > > > Also, the fact that you're referencing EFI-specific boot quirks in the > > kaslr code should be a massive red flag that you're playing with the > > innards of the EFI subsystem. > > Questions has been answered in above words.