2013-06-02 12:56:10

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 0/4] EFI 1:1 mapping

From: Borislav Petkov <[email protected]>

Hi all,

this one is 64-bit only for now and it has been tested only in kvm with
OVMF.

Keeping in mind the ihnerent efi b0rkedness left and right, I'd like to
be very cautious and conservative with this and not hurry anything until
it has been actually very well tested on a variety of baremetal boxes.

Please take a closer look and let me know.

Thanks.

Borislav Petkov (4):
efi: Convert runtime services function ptrs
x86, cpa: Map in an arbitrary pgd
x86, efi: Add an efi= kernel command line parameter
x86, efi: Map runtime services 1:1

arch/x86/boot/compressed/eboot.c | 2 +-
arch/x86/include/asm/efi.h | 30 +++---
arch/x86/include/asm/pgtable_types.h | 3 +-
arch/x86/mm/pageattr.c | 80 ++++++++++++----
arch/x86/platform/efi/efi.c | 177 +++++++++++++++++++++++++++++------
arch/x86/platform/efi/efi_stub_64.S | 48 ++++++++++
include/linux/efi.h | 28 +++---
7 files changed, 290 insertions(+), 78 deletions(-)

--
1.8.3.rc1.25.g423ecb0


2013-06-02 12:56:35

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

From: Borislav Petkov <[email protected]>

.. for passing miscellaneous options from the kernel command line.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/platform/efi/efi.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 82089d8b1954..aea4337f7023 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -88,6 +88,11 @@ static u64 active_size;

unsigned long x86_efi_facility;

+ /* 1:1 mapping of services regions */
+#define EFI_CFG_MAP11 BIT(0)
+
+static unsigned long efi_config;
+
/*
* Returns 1 if 'facility' is enabled, 0 otherwise.
*/
@@ -1167,3 +1172,14 @@ efi_status_t efi_query_variable_store(u32 attributes, unsigned long size)
return EFI_SUCCESS;
}
EXPORT_SYMBOL_GPL(efi_query_variable_store);
+
+static int __init parse_efi_cmdline(char *str)
+{
+ if (*str == '=')
+ str++;
+ if (!strncmp(str, "1:1_map", 7))
+ efi_config |= EFI_CFG_MAP11;
+
+ return 0;
+}
+early_param("efi", parse_efi_cmdline);
--
1.8.3.rc1.25.g423ecb0

2013-06-02 12:56:52

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 2/4] x86, cpa: Map in an arbitrary pgd

From: Borislav Petkov <[email protected]>

Add the ability to map pages in an arbitrary pgd.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/pgtable_types.h | 3 +-
arch/x86/mm/pageattr.c | 80 ++++++++++++++++++++++++++++--------
2 files changed, 65 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index e6423002c10b..0613e147f083 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -352,7 +352,8 @@ static inline void update_page_count(int level, unsigned long pages) { }
*/
extern pte_t *lookup_address(unsigned long address, unsigned int *level);
extern phys_addr_t slow_virt_to_phys(void *__address);
-
+extern void kernel_map_pages_in_pgd(pgd_t *pgd, unsigned long address,
+ unsigned numpages, unsigned long page_flags);
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_PGTABLE_DEFS_H */
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480c2d71..3d64e5fc2adc 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -30,6 +30,7 @@
*/
struct cpa_data {
unsigned long *vaddr;
+ pgd_t *pgd;
pgprot_t mask_set;
pgprot_t mask_clr;
int numpages;
@@ -322,17 +323,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
return prot;
}

-/*
- * Lookup the page table entry for a virtual address. Return a pointer
- * to the entry and the level of the mapping.
- *
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
- */
-pte_t *lookup_address(unsigned long address, unsigned int *level)
+static pte_t *
+__lookup_address_in_pgd(pgd_t *pgd, unsigned long address, unsigned int *level)
{
- pgd_t *pgd = pgd_offset_k(address);
pud_t *pud;
pmd_t *pmd;

@@ -361,8 +354,30 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)

return pte_offset_kernel(pmd, address);
}
+
+/*
+ * Lookup the page table entry for a virtual address. Return a pointer
+ * to the entry and the level of the mapping.
+ *
+ * Note: We return pud and pmd either when the entry is marked large
+ * or when the present bit is not set. Otherwise we would return a
+ * pointer to a nonexisting mapping.
+ */
+pte_t *lookup_address(unsigned long address, unsigned int *level)
+{
+ return __lookup_address_in_pgd(pgd_offset_k(address), address, level);
+}
EXPORT_SYMBOL_GPL(lookup_address);

+pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
+ unsigned int *level)
+{
+ if (cpa->pgd)
+ return __lookup_address_in_pgd(cpa->pgd, address, level);
+
+ return lookup_address(address, level);
+}
+
/*
* This is necessary because __pa() does not work on some
* kinds of memory, like vmalloc() or the alloc_remap()
@@ -437,7 +452,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
* Check for races, another CPU might have split this page
* up already:
*/
- tmp = lookup_address(address, &level);
+ tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte)
goto out_unlock;

@@ -543,7 +558,8 @@ out_unlock:
}

static int
-__split_large_page(pte_t *kpte, unsigned long address, struct page *base)
+__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
+ struct page *base)
{
pte_t *pbase = (pte_t *)page_address(base);
unsigned long pfn, pfninc = 1;
@@ -556,7 +572,7 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
* Check for races, another CPU might have split this page
* up for us already:
*/
- tmp = lookup_address(address, &level);
+ tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte) {
spin_unlock(&pgd_lock);
return 1;
@@ -632,7 +648,8 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
return 0;
}

-static int split_large_page(pte_t *kpte, unsigned long address)
+static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
+ unsigned long address)
{
struct page *base;

@@ -644,7 +661,7 @@ static int split_large_page(pte_t *kpte, unsigned long address)
if (!base)
return -ENOMEM;

- if (__split_large_page(kpte, address, base))
+ if (__split_large_page(cpa, kpte, address, base))
__free_page(base);

return 0;
@@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
else
address = *cpa->vaddr;
repeat:
- kpte = lookup_address(address, &level);
+ if (cpa->pgd)
+ kpte = __lookup_address_in_pgd(cpa->pgd, address, &level);
+ else
+ kpte = _lookup_address_cpa(cpa, address, &level);
if (!kpte)
return __cpa_process_fault(cpa, address, primary);

@@ -761,7 +781,7 @@ repeat:
/*
* We have to split the large page:
*/
- err = split_large_page(kpte, address);
+ err = split_large_page(cpa, kpte, address);
if (!err) {
/*
* Do a global flush tlb after splitting the large page
@@ -910,6 +930,8 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
int ret, cache, checkalias;
unsigned long baddr = 0;

+ memset(&cpa, 0, sizeof(cpa));
+
/*
* Check, if we are requested to change a not supported
* feature:
@@ -1434,6 +1456,30 @@ bool kernel_page_present(struct page *page)

#endif /* CONFIG_DEBUG_PAGEALLOC */

+void kernel_map_pages_in_pgd(pgd_t *pgd, unsigned long address,
+ unsigned numpages, unsigned long page_flags)
+{
+ struct cpa_data cpa = {
+ .vaddr = &address,
+ .pgd = pgd,
+ .numpages = numpages,
+ .mask_set = __pgprot(0),
+ .mask_clr = __pgprot(0),
+ .flags = 0
+ };
+
+ if (!(__supported_pte_mask & _PAGE_NX))
+ return;
+
+ if (!(page_flags & _PAGE_NX))
+ cpa.mask_clr = __pgprot(_PAGE_NX);
+
+ cpa.mask_set = __pgprot(_PAGE_PRESENT | page_flags);
+
+ __change_page_attr_set_clr(&cpa, 0);
+ __flush_tlb_all();
+}
+
/*
* The testcases use internal knowledge of the implementation that shouldn't
* be exposed to the rest of the kernel. Include these directly here.
--
1.8.3.rc1.25.g423ecb0

2013-06-02 12:56:41

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 1/4] efi: Convert runtime services function ptrs

From: Borislav Petkov <[email protected]>

... to void * like the boot services and lose all the void * casts. No
functionality change.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/boot/compressed/eboot.c | 2 +-
arch/x86/include/asm/efi.h | 28 ++++++++++++++--------------
include/linux/efi.h | 28 ++++++++++++++--------------
3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 35ee62fccf98..4060c8daf05e 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -266,7 +266,7 @@ static efi_status_t setup_efi_vars(struct boot_params *params)
while (data && data->next)
data = (struct setup_data *)(unsigned long)data->next;

- status = efi_call_phys4((void *)sys_table->runtime->query_variable_info,
+ status = efi_call_phys4(sys_table->runtime->query_variable_info,
EFI_VARIABLE_NON_VOLATILE |
EFI_VARIABLE_BOOTSERVICE_ACCESS |
EFI_VARIABLE_RUNTIME_ACCESS, &store_size,
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 2fb5d5884e23..5b33686b6995 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -52,40 +52,40 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
u64 arg4, u64 arg5, u64 arg6);

#define efi_call_phys0(f) \
- efi_call0((void *)(f))
+ efi_call0((f))
#define efi_call_phys1(f, a1) \
- efi_call1((void *)(f), (u64)(a1))
+ efi_call1((f), (u64)(a1))
#define efi_call_phys2(f, a1, a2) \
- efi_call2((void *)(f), (u64)(a1), (u64)(a2))
+ efi_call2((f), (u64)(a1), (u64)(a2))
#define efi_call_phys3(f, a1, a2, a3) \
- efi_call3((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3))
+ efi_call3((f), (u64)(a1), (u64)(a2), (u64)(a3))
#define efi_call_phys4(f, a1, a2, a3, a4) \
- efi_call4((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+ efi_call4((f), (u64)(a1), (u64)(a2), (u64)(a3), \
(u64)(a4))
#define efi_call_phys5(f, a1, a2, a3, a4, a5) \
- efi_call5((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+ efi_call5((f), (u64)(a1), (u64)(a2), (u64)(a3), \
(u64)(a4), (u64)(a5))
#define efi_call_phys6(f, a1, a2, a3, a4, a5, a6) \
- efi_call6((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+ efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
(u64)(a4), (u64)(a5), (u64)(a6))

#define efi_call_virt0(f) \
- efi_call0((void *)(efi.systab->runtime->f))
+ efi_call0((efi.systab->runtime->f))
#define efi_call_virt1(f, a1) \
- efi_call1((void *)(efi.systab->runtime->f), (u64)(a1))
+ efi_call1((efi.systab->runtime->f), (u64)(a1))
#define efi_call_virt2(f, a1, a2) \
- efi_call2((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2))
+ efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
#define efi_call_virt3(f, a1, a2, a3) \
- efi_call3((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+ efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3))
#define efi_call_virt4(f, a1, a2, a3, a4) \
- efi_call4((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+ efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4))
#define efi_call_virt5(f, a1, a2, a3, a4, a5) \
- efi_call5((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+ efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4), (u64)(a5))
#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6) \
- efi_call6((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+ efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))

extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 2bc0ad78d058..21ae6b3c0359 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -287,20 +287,20 @@ typedef struct {

typedef struct {
efi_table_hdr_t hdr;
- unsigned long get_time;
- unsigned long set_time;
- unsigned long get_wakeup_time;
- unsigned long set_wakeup_time;
- unsigned long set_virtual_address_map;
- unsigned long convert_pointer;
- unsigned long get_variable;
- unsigned long get_next_variable;
- unsigned long set_variable;
- unsigned long get_next_high_mono_count;
- unsigned long reset_system;
- unsigned long update_capsule;
- unsigned long query_capsule_caps;
- unsigned long query_variable_info;
+ void *get_time;
+ void *set_time;
+ void *get_wakeup_time;
+ void *set_wakeup_time;
+ void *set_virtual_address_map;
+ void *convert_pointer;
+ void *get_variable;
+ void *get_next_variable;
+ void *set_variable;
+ void *get_next_high_mono_count;
+ void *reset_system;
+ void *update_capsule;
+ void *query_capsule_caps;
+ void *query_variable_info;
} efi_runtime_services_t;

typedef efi_status_t efi_get_time_t (efi_time_t *tm, efi_time_cap_t *tc);
--
1.8.3.rc1.25.g423ecb0

2013-06-02 12:57:05

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 4/4] x86, efi: Map runtime services 1:1

From: Borislav Petkov <[email protected]>

Due to the braindead design of EFI, we cannot map runtime services more
than once for the duration of a booted system. Thus, if we want to use
EFI runtime services in a kexec'ed kernel, maybe the only possible and
sensible approach would be to map them 1:1 so that when the kexec kernel
loads, it can simply call those addresses without the need for remapping
(which doesn't work anyway).

Furthermore, this mapping approach could be of help with b0rked EFI
implementations for a different set of reasons.

This implementation is 64-bit only for now and it boots fine in kvm with
OVMF BIOS.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/efi.h | 2 +
arch/x86/platform/efi/efi.c | 161 +++++++++++++++++++++++++++++-------
arch/x86/platform/efi/efi_stub_64.S | 48 +++++++++++
3 files changed, 180 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 5b33686b6995..1c9c0a5cc280 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,6 +41,8 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);

#define EFI_LOADER_SIGNATURE "EL64"

+extern pgd_t *efi_pgt;
+
extern u64 efi_call0(void *fp);
extern u64 efi_call1(void *fp, u64 arg1);
extern u64 efi_call2(void *fp, u64 arg1, u64 arg2);
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index aea4337f7023..36ecefb54495 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -93,6 +93,8 @@ unsigned long x86_efi_facility;

static unsigned long efi_config;

+extern bool use_11_map;
+
/*
* Returns 1 if 'facility' is enabled, 0 otherwise.
*/
@@ -763,6 +765,25 @@ static int __init efi_runtime_init(void)
* virtual mode.
*/
efi.get_time = phys_efi_get_time;
+
+ if (efi_config & EFI_CFG_MAP11) {
+#define efi_phys_assign(f) \
+ efi_phys.f = (efi_ ##f## _t *)runtime->f
+
+ efi_phys_assign(set_time);
+ efi_phys_assign(get_wakeup_time);
+ efi_phys_assign(set_wakeup_time);
+ efi_phys_assign(get_variable);
+ efi_phys_assign(get_next_variable);
+ efi_phys_assign(set_variable);
+ efi_phys_assign(get_next_high_mono_count);
+ efi_phys_assign(reset_system);
+ efi_phys_assign(set_virtual_address_map);
+ efi_phys_assign(query_variable_info);
+ efi_phys_assign(update_capsule);
+ efi_phys_assign(query_capsule_caps);
+ }
+
early_iounmap(runtime, sizeof(efi_runtime_services_t));

return 0;
@@ -954,6 +975,61 @@ void efi_memory_uc(u64 addr, unsigned long size)
set_memory_uc(addr, npages);
}

+static void __init __runtime_map_11(efi_memory_desc_t *md)
+{
+ pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
+ unsigned long page_flags = 0;
+
+ if (md->type == EFI_RUNTIME_SERVICES_DATA ||
+ md->type == EFI_BOOT_SERVICES_DATA)
+ page_flags |= _PAGE_NX;
+
+ if (!(md->attribute & EFI_MEMORY_WB))
+ page_flags |= _PAGE_PCD;
+
+ kernel_map_pages_in_pgd(pgd + pgd_index(md->phys_addr),
+ md->phys_addr,
+ md->num_pages,
+ page_flags);
+
+ md->virt_addr = md->phys_addr;
+}
+
+static int __init __runtime_ioremap(efi_memory_desc_t *md)
+{
+ u64 end, systab, start_pfn, end_pfn;
+ unsigned long size;
+ void *va;
+
+ size = md->num_pages << EFI_PAGE_SHIFT;
+ end = md->phys_addr + size;
+ start_pfn = PFN_DOWN(md->phys_addr);
+ end_pfn = PFN_UP(end);
+
+ if (pfn_range_is_mapped(start_pfn, end_pfn)) {
+ va = __va(md->phys_addr);
+
+ if (!(md->attribute & EFI_MEMORY_WB))
+ efi_memory_uc((u64)(unsigned long)va, size);
+ } else
+ va = efi_ioremap(md->phys_addr, size, md->type, md->attribute);
+
+ md->virt_addr = (u64) (unsigned long) va;
+ if (!va) {
+ pr_err("ioremap of 0x%llX failed!\n",
+ (unsigned long long)md->phys_addr);
+ return 1;
+ }
+
+ systab = (u64) (unsigned long) efi_phys.systab;
+ if (md->phys_addr <= systab && systab < end) {
+ systab += md->virt_addr - md->phys_addr;
+ efi.systab = (efi_system_table_t *) (unsigned long) systab;
+ }
+
+ return 0;
+}
+
/*
* This function will switch the EFI runtime services to virtual mode.
* Essentially, look through the EFI memmap and map every region that
@@ -964,11 +1040,11 @@ void efi_memory_uc(u64 addr, unsigned long size)
*/
void __init efi_enter_virtual_mode(void)
{
+ pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
efi_memory_desc_t *md, *prev_md = NULL;
efi_status_t status;
- unsigned long size;
- u64 end, systab, start_pfn, end_pfn;
- void *p, *va, *new_memmap = NULL;
+ void *p, *new_memmap = NULL;
+ unsigned num_pgds;
int count = 0;

efi.systab = NULL;
@@ -1017,33 +1093,18 @@ void __init efi_enter_virtual_mode(void)
md->type != EFI_BOOT_SERVICES_DATA)
continue;

- size = md->num_pages << EFI_PAGE_SHIFT;
- end = md->phys_addr + size;
-
- start_pfn = PFN_DOWN(md->phys_addr);
- end_pfn = PFN_UP(end);
- if (pfn_range_is_mapped(start_pfn, end_pfn)) {
- va = __va(md->phys_addr);
-
- if (!(md->attribute & EFI_MEMORY_WB))
- efi_memory_uc((u64)(unsigned long)va, size);
- } else
- va = efi_ioremap(md->phys_addr, size,
- md->type, md->attribute);
-
- md->virt_addr = (u64) (unsigned long) va;
-
- if (!va) {
- pr_err("ioremap of 0x%llX failed!\n",
- (unsigned long long)md->phys_addr);
+ /*
+ * XXX: need to map the region which contains
+ * SetVirtualAddressMap so that we can call it here.
+ * Probably can be removed after we map boot services 1:1
+ * too.
+ */
+ if (__runtime_ioremap(md))
continue;
- }

- systab = (u64) (unsigned long) efi_phys.systab;
- if (md->phys_addr <= systab && systab < end) {
- systab += md->virt_addr - md->phys_addr;
- efi.systab = (efi_system_table_t *) (unsigned long) systab;
- }
+ if (efi_config & EFI_CFG_MAP11)
+ __runtime_map_11(md);
+
new_memmap = krealloc(new_memmap,
(count + 1) * memmap.desc_size,
GFP_KERNEL);
@@ -1052,7 +1113,8 @@ void __init efi_enter_virtual_mode(void)
count++;
}

- BUG_ON(!efi.systab);
+ if (!(efi_config & EFI_CFG_MAP11))
+ BUG_ON(!efi.systab);

status = phys_efi_set_virtual_address_map(
memmap.desc_size * count,
@@ -1072,6 +1134,41 @@ void __init efi_enter_virtual_mode(void)
*
* Call EFI services through wrapper functions.
*/
+ if (efi_config & EFI_CFG_MAP11) {
+#define efi_assign(efi, f) \
+ efi.systab->runtime->f = efi_phys.f
+
+ efi.systab->runtime = kzalloc(sizeof(efi_runtime_services_t),
+ GFP_KERNEL);
+ BUG_ON(!efi.systab->runtime);
+
+ efi_assign(efi, get_time);
+ efi_assign(efi, set_time);
+ efi_assign(efi, get_wakeup_time);
+ efi_assign(efi, set_wakeup_time);
+ efi_assign(efi, get_variable);
+ efi_assign(efi, get_next_variable);
+ efi_assign(efi, set_variable);
+ efi_assign(efi, get_next_high_mono_count);
+ efi_assign(efi, reset_system);
+ efi_assign(efi, query_variable_info);
+ efi_assign(efi, update_capsule);
+ efi_assign(efi, query_capsule_caps);
+
+ /*
+ * map-in low kernel mapping for passing arguments to EFI
+ * functions.
+ */
+ num_pgds = pgd_index(VMALLOC_START - 1) - pgd_index(PAGE_OFFSET);
+
+ memcpy(pgd + pgd_index(PAGE_OFFSET),
+ init_mm.pgd + pgd_index(PAGE_OFFSET),
+ sizeof(pgd_t) * num_pgds);
+
+ efi_pgt = (pgd_t *)(unsigned long)real_mode_header->trampoline_pgd;;
+ use_11_map = true;
+ }
+
efi.runtime_version = efi_systab.hdr.revision;
efi.get_time = virt_efi_get_time;
efi.set_time = virt_efi_set_time;
@@ -1086,8 +1183,10 @@ void __init efi_enter_virtual_mode(void)
efi.query_variable_info = virt_efi_query_variable_info;
efi.update_capsule = virt_efi_update_capsule;
efi.query_capsule_caps = virt_efi_query_capsule_caps;
- if (__supported_pte_mask & _PAGE_NX)
- runtime_code_page_mkexec();
+
+ if (!(efi_config & EFI_CFG_MAP11))
+ if (__supported_pte_mask & _PAGE_NX)
+ runtime_code_page_mkexec();

kfree(new_memmap);
}
diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index 4c07ccab8146..2f93dcad3804 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -34,10 +34,40 @@
mov %rsi, %cr0; \
mov (%rsp), %rsp

+/* happily stolen from gcc, see __flush_tlb_global() */
+#define FLUSH_TLB_ALL \
+ movq %cr4, %r14; \
+ movq %r14, %r13; \
+ and $0x7f, %r13b; \
+ movq %r13, %cr4; \
+ movq %r14, %cr4
+
+/*
+ * %r15 is a non-volatile register and is preserved by UEFI so use
+ * it for stashing previous PGD in there.
+ */
+#define SWITCH_PGT \
+ cmpb $0, use_11_map; \
+ je 1f; \
+ movq %cr3, %r15; \
+ movq efi_pgt, %rax; \
+ movq %rax, %cr3; \
+ FLUSH_TLB_ALL; \
+ 1:
+
+#define RESTORE_PGT \
+ cmpb $0, use_11_map; \
+ je 2f; \
+ movq %r15, %cr3; \
+ FLUSH_TLB_ALL; \
+ 2:
+
ENTRY(efi_call0)
SAVE_XMM
subq $32, %rsp
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
@@ -47,7 +77,9 @@ ENTRY(efi_call1)
SAVE_XMM
subq $32, %rsp
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
@@ -57,7 +89,9 @@ ENTRY(efi_call2)
SAVE_XMM
subq $32, %rsp
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
@@ -68,7 +102,9 @@ ENTRY(efi_call3)
subq $32, %rsp
mov %rcx, %r8
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
@@ -80,7 +116,9 @@ ENTRY(efi_call4)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
@@ -93,7 +131,9 @@ ENTRY(efi_call5)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $48, %rsp
RESTORE_XMM
ret
@@ -109,8 +149,16 @@ ENTRY(efi_call6)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
+ SWITCH_PGT
call *%rdi
+ RESTORE_PGT
addq $48, %rsp
RESTORE_XMM
ret
ENDPROC(efi_call6)
+
+GLOBAL(efi_pgt)
+ .quad 0
+
+ENTRY(use_11_map)
+ .byte 0
--
1.8.3.rc1.25.g423ecb0

2013-06-02 22:56:33

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

I've just run Windows 8 under a hacked up copy of OVMF that dumps the
data passed to SetVirtualAddressMap. It seems that Windows *is* mapping
the runtime services to higher addresses - so presumably the 1:1 mapping
is in addition to the virtual mapping.

--
Matthew Garrett | [email protected]

2013-06-03 08:11:58

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Sun, Jun 02, 2013 at 11:56:20PM +0100, Matthew Garrett wrote:
> I've just run Windows 8 under a hacked up copy of OVMF that dumps
> the data passed to SetVirtualAddressMap. It seems that Windows *is*
> mapping the runtime services to higher addresses - so presumably the
> 1:1 mapping is in addition to the virtual mapping.

But but, once we call SetVirtualAddressMap with the set of addresses of
the runtime services, only those can be used after, right? If so, we
can't have both (this is at least my understanding)...

Thanks.

2013-06-03 14:27:26

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 10:11 +0200, Borislav Petkov wrote:
> On Sun, Jun 02, 2013 at 11:56:20PM +0100, Matthew Garrett wrote:
> > I've just run Windows 8 under a hacked up copy of OVMF that dumps
> > the data passed to SetVirtualAddressMap. It seems that Windows *is*
> > mapping the runtime services to higher addresses - so presumably the
> > 1:1 mapping is in addition to the virtual mapping.
>
> But but, once we call SetVirtualAddressMap with the set of addresses of
> the runtime services, only those can be used after, right? If so, we
> can't have both (this is at least my understanding)...

That's correct. I think not calling SetVirtualAddressMap() and just
using a 1:1 mapping is far safer (having looked at what tianocore does
for SetVirtualAddressMap()). The chances are that all the UEFI bioses
are only tested with windows, so the pointer chases it has to do to
switch address maps only work with the operations windows does.

James

2013-06-03 14:30:20

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 07:27:22AM -0700, James Bottomley wrote:

> That's correct. I think not calling SetVirtualAddressMap() and just
> using a 1:1 mapping is far safer (having looked at what tianocore does
> for SetVirtualAddressMap()). The chances are that all the UEFI bioses
> are only tested with windows, so the pointer chases it has to do to
> switch address maps only work with the operations windows does.

Windows calls SetVirtualAddressMap(), so the only way these systems have
been tested is with SetVirtualAddressMap().

--
Matthew Garrett | [email protected]

2013-06-03 14:32:59

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 10:11:48AM +0200, Borislav Petkov wrote:
> On Sun, Jun 02, 2013 at 11:56:20PM +0100, Matthew Garrett wrote:
> > I've just run Windows 8 under a hacked up copy of OVMF that dumps
> > the data passed to SetVirtualAddressMap. It seems that Windows *is*
> > mapping the runtime services to higher addresses - so presumably the
> > 1:1 mapping is in addition to the virtual mapping.
>
> But but, once we call SetVirtualAddressMap with the set of addresses of
> the runtime services, only those can be used after, right? If so, we
> can't have both (this is at least my understanding)...

We can only pass one set of addresses to SetVirtualAddressMap(), but it
doesn't seem like there's any intrinsic reason we can't the runtime
regions mapped to multiple virtual addresses.

--
Matthew Garrett | [email protected]

2013-06-03 14:38:07

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 15:30 +0100, Matthew Garrett wrote:
> On Mon, Jun 03, 2013 at 07:27:22AM -0700, James Bottomley wrote:
>
> > That's correct. I think not calling SetVirtualAddressMap() and just
> > using a 1:1 mapping is far safer (having looked at what tianocore does
> > for SetVirtualAddressMap()). The chances are that all the UEFI bioses
> > are only tested with windows, so the pointer chases it has to do to
> > switch address maps only work with the operations windows does.
>
> Windows calls SetVirtualAddressMap(), so the only way these systems have
> been tested is with SetVirtualAddressMap().

I know, but that's not what I said.

If you look at the implementation, SetVirtualAddressMap() does a massive
pointer chase through the images. It not only tries to relocate the
text and data, but it also tries to relocate all the users of the data.
Some of these sources of data are boot time and some runtime. Those
both need to be relocated by a separate pointer chase. What we saw with
the QueryVariableInfo() problem was that a boot time pointer wasn't
relocated. That's got to mean that windows only calls QueryVariableInfo
from runtime.

My point is that if we elect to call SetVirtualAddressMap() we'll be
restricted to only making the calls at boot time that windows does
otherwise we'll end up with these unrelocated pointers. That's a huge
nasty verification burden on us. Alternatively, if we never call
SetVirtualAddressMap() it seems to me that we just don't have to worry
about pointer relocation issues. Thus, I think it would be better we
use the 1:1 mapping instead of calling SetVirtualAddressMap().

James

2013-06-03 14:54:18

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 03 Jun, at 03:32:52PM, Matthew Garrett wrote:
> We can only pass one set of addresses to SetVirtualAddressMap(), but it
> doesn't seem like there's any intrinsic reason we can't the runtime
> regions mapped to multiple virtual addresses.

Indeed. That's the approach I took with my 1:1 series from last year. If
Windows is mapping things at higher addresses like you said, then
they're probably doing the same.

--
Matt Fleming, Intel Open Source Technology Center

2013-06-03 15:21:30

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 07:38:02AM -0700, James Bottomley wrote:
> On Mon, 2013-06-03 at 15:30 +0100, Matthew Garrett wrote:
> > Windows calls SetVirtualAddressMap(), so the only way these systems have
> > been tested is with SetVirtualAddressMap().
>
> I know, but that's not what I said.
>
> If you look at the implementation, SetVirtualAddressMap() does a massive
> pointer chase through the images. It not only tries to relocate the
> text and data, but it also tries to relocate all the users of the data.
> Some of these sources of data are boot time and some runtime. Those
> both need to be relocated by a separate pointer chase. What we saw with
> the QueryVariableInfo() problem was that a boot time pointer wasn't
> relocated. That's got to mean that windows only calls QueryVariableInfo
> from runtime.

Sure.

> My point is that if we elect to call SetVirtualAddressMap() we'll be
> restricted to only making the calls at boot time that windows does
> otherwise we'll end up with these unrelocated pointers. That's a huge
> nasty verification burden on us. Alternatively, if we never call
> SetVirtualAddressMap() it seems to me that we just don't have to worry
> about pointer relocation issues. Thus, I think it would be better we
> use the 1:1 mapping instead of calling SetVirtualAddressMap().

Some hardware just arbitrarily fails some calls if
SetVirtualAddressMap() isn't called. As you pointed out, the only
situation that these systems are ever tested in is the one where calls
are made in roughly the same order as Windows, ie:

Calls made in boot services:

GetTime()
Getvariable()
ExitBootServices()

Calls made in runtime:

SetVirtualAddressMap()
GetNextVariable()
GetVariable()
SetVariable()

So far I haven't been able to convince Windows to make any other runtime
calls, which makes me a little unhappy about even calling
QueryVariableInfo() during runtime, but on the other hand our options
there are either to call it or to kill Samsungs, so I think we're stuck
with it. But, overall, refusing to call SetVirtualAddressMap() simply
isn't an option.

--
Matthew Garrett | [email protected]

2013-06-03 16:18:10

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 16:21 +0100, Matthew Garrett wrote:
> On Mon, Jun 03, 2013 at 07:38:02AM -0700, James Bottomley wrote:
> > On Mon, 2013-06-03 at 15:30 +0100, Matthew Garrett wrote:
> > > Windows calls SetVirtualAddressMap(), so the only way these systems have
> > > been tested is with SetVirtualAddressMap().
> >
> > I know, but that's not what I said.
> >
> > If you look at the implementation, SetVirtualAddressMap() does a massive
> > pointer chase through the images. It not only tries to relocate the
> > text and data, but it also tries to relocate all the users of the data.
> > Some of these sources of data are boot time and some runtime. Those
> > both need to be relocated by a separate pointer chase. What we saw with
> > the QueryVariableInfo() problem was that a boot time pointer wasn't
> > relocated. That's got to mean that windows only calls QueryVariableInfo
> > from runtime.
>
> Sure.
>
> > My point is that if we elect to call SetVirtualAddressMap() we'll be
> > restricted to only making the calls at boot time that windows does
> > otherwise we'll end up with these unrelocated pointers. That's a huge
> > nasty verification burden on us. Alternatively, if we never call
> > SetVirtualAddressMap() it seems to me that we just don't have to worry
> > about pointer relocation issues. Thus, I think it would be better we
> > use the 1:1 mapping instead of calling SetVirtualAddressMap().
>
> Some hardware just arbitrarily fails some calls if
> SetVirtualAddressMap() isn't called. As you pointed out, the only
> situation that these systems are ever tested in is the one where calls
> are made in roughly the same order as Windows, ie:
>
> Calls made in boot services:
>
> GetTime()
> Getvariable()
> ExitBootServices()
>
> Calls made in runtime:
>
> SetVirtualAddressMap()
> GetNextVariable()
> GetVariable()
> SetVariable()
>
> So far I haven't been able to convince Windows to make any other runtime
> calls, which makes me a little unhappy about even calling
> QueryVariableInfo() during runtime, but on the other hand our options
> there are either to call it or to kill Samsungs, so I think we're stuck
> with it. But, overall, refusing to call SetVirtualAddressMap() simply
> isn't an option.

I don't entirely buy that. All EFI programs run with the physical
address map, therefore every API an EFI program uses is also tested, at
boot time only, obviously. However, the ExitBootServices() code seems
to be much simpler, so I don't think it will cause too many bugs. The
UEFI test suites also seem to try UEFI calls before and after
ExitBootServices(), so I think relying on a 1:1 mapping looks safer to
me.

James


2013-06-03 16:24:43

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 09:18:06AM -0700, James Bottomley wrote:

> I don't entirely buy that. All EFI programs run with the physical
> address map, therefore every API an EFI program uses is also tested, at
> boot time only, obviously.

That seems optimistic. Windows never calls QueryVariableInfo() during
boot services, so what makes you think doing so has ever been tested?

> However, the ExitBootServices() code seems to be much simpler, so I
> don't think it will cause too many bugs. The UEFI test suites also
> seem to try UEFI calls before and after ExitBootServices(), so I think
> relying on a 1:1 mapping looks safer to me.

I have no expectation that the majority of system vendors run the test
suite, but I have every expectation that every system vendor runs
Windows. We should behave as close to the tested mechanism as possible,
ie do what Windows does - and that includes calling
SetVirtualAddressMap().

--
Matthew Garrett | [email protected]

2013-06-03 16:35:13

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 17:24 +0100, Matthew Garrett wrote:
> On Mon, Jun 03, 2013 at 09:18:06AM -0700, James Bottomley wrote:
>
> > I don't entirely buy that. All EFI programs run with the physical
> > address map, therefore every API an EFI program uses is also tested, at
> > boot time only, obviously.
>
> That seems optimistic. Windows never calls QueryVariableInfo() during
> boot services, so what makes you think doing so has ever been tested?

It's used by the UEFI shell package ... every system which boots to the
shell automatically tests this. I know no locked down UEFI system ships
with a shell but almost every system in test has a Shell in some form,
so I think its fairly safe to call it from boot services.

> > However, the ExitBootServices() code seems to be much simpler, so I
> > don't think it will cause too many bugs. The UEFI test suites also
> > seem to try UEFI calls before and after ExitBootServices(), so I think
> > relying on a 1:1 mapping looks safer to me.
>
> I have no expectation that the majority of system vendors run the test
> suite, but I have every expectation that every system vendor runs
> Windows. We should behave as close to the tested mechanism as possible,
> ie do what Windows does - and that includes calling
> SetVirtualAddressMap().

OK, so we basically agree to disagree. When I looked at the actual
SetVirtualAddressMap() implementation, my heart skipped several beats:
it's a massive set of pointer chasing heuristics which is bound to be
incorrect in some instance, just because its so complex and easy to get
wrong. Every time it's incorrect, we'll get a physical pointer used in
a virtual space and an oops within the UEFI code. Conversely, I think
the engineering risk that a particular UEFI call is expecting to have
had SetVirtualAddressMap called is much lower.

However, what about a compromise: why don't we implement 1:1 mapping and
then call SetVirtualAddressMap with the 1:1 map ... in theory the
pointer chases should then be nops (it will be replacing the physical
address with the same virtual address), so everything should just work
and anything the UEFI vendor missed will still work because the physical
address will work also in this scenario.

James

2013-06-03 16:42:45

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 09:35:07AM -0700, James Bottomley wrote:
> On Mon, 2013-06-03 at 17:24 +0100, Matthew Garrett wrote:
> > That seems optimistic. Windows never calls QueryVariableInfo() during
> > boot services, so what makes you think doing so has ever been tested?
>
> It's used by the UEFI shell package ... every system which boots to the
> shell automatically tests this. I know no locked down UEFI system ships
> with a shell but almost every system in test has a Shell in some form,
> so I think its fairly safe to call it from boot services.

Why do you persist in this belief that all system vendors are going to
have run a shell, let alone any kind of test suite? That runs counter to
everything we've learned about x86 firmware. People verify that it runs
Windows and then ship it.

> However, what about a compromise: why don't we implement 1:1 mapping and
> then call SetVirtualAddressMap with the 1:1 map ... in theory the
> pointer chases should then be nops (it will be replacing the physical
> address with the same virtual address), so everything should just work
> and anything the UEFI vendor missed will still work because the physical
> address will work also in this scenario.

The problem there is that you're saying "In theory". We know that
Windows doesn't behave this way, so we have no legitimate expectation
that it'll work. We know that it doesn't on some Apple hardware.

--
Matthew Garrett | [email protected]

2013-06-03 18:05:09

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 17:42 +0100, Matthew Garrett wrote:
> On Mon, Jun 03, 2013 at 09:35:07AM -0700, James Bottomley wrote:
> > On Mon, 2013-06-03 at 17:24 +0100, Matthew Garrett wrote:
> > > That seems optimistic. Windows never calls QueryVariableInfo() during
> > > boot services, so what makes you think doing so has ever been tested?
> >
> > It's used by the UEFI shell package ... every system which boots to the
> > shell automatically tests this. I know no locked down UEFI system ships
> > with a shell but almost every system in test has a Shell in some form,
> > so I think its fairly safe to call it from boot services.
>
> Why do you persist in this belief that all system vendors are going to
> have run a shell, let alone any kind of test suite? That runs counter to
> everything we've learned about x86 firmware. People verify that it runs
> Windows and then ship it.

I don't, but I find it hard to believe no vendor ever runs an EFI shell
on their systems. The feedback I got from a couple of OEMs is that they
use the shell mostly for internal testing.

> > However, what about a compromise: why don't we implement 1:1 mapping and
> > then call SetVirtualAddressMap with the 1:1 map ... in theory the
> > pointer chases should then be nops (it will be replacing the physical
> > address with the same virtual address), so everything should just work
> > and anything the UEFI vendor missed will still work because the physical
> > address will work also in this scenario.
>
> The problem there is that you're saying "In theory". We know that
> Windows doesn't behave this way, so we have no legitimate expectation
> that it'll work. We know that it doesn't on some Apple hardware.

Fine, you say we need to call SetVirtualAddressMap because windows does,
I agree, I'm just saying we get additional safety from calling it with
the 1:1 map ... I don't see what the problem is.

James


2013-06-03 18:11:17

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 11:05:03AM -0700, James Bottomley wrote:
> On Mon, 2013-06-03 at 17:42 +0100, Matthew Garrett wrote:
> > Why do you persist in this belief that all system vendors are going to
> > have run a shell, let alone any kind of test suite? That runs counter to
> > everything we've learned about x86 firmware. People verify that it runs
> > Windows and then ship it.
>
> I don't, but I find it hard to believe no vendor ever runs an EFI shell
> on their systems. The feedback I got from a couple of OEMs is that they
> use the shell mostly for internal testing.

There are vendors that do, and I expect that there are vendors who
don't. There are vendors who will make changes to firmware and run the
EFI test suite, and there are vendors who will make changes to firmware
and run Windows.

> > The problem there is that you're saying "In theory". We know that
> > Windows doesn't behave this way, so we have no legitimate expectation
> > that it'll work. We know that it doesn't on some Apple hardware.
>
> Fine, you say we need to call SetVirtualAddressMap because windows does,
> I agree, I'm just saying we get additional safety from calling it with
> the 1:1 map ... I don't see what the problem is.

No. I'm saying that calling it with the 1:1 map is something very
different to the behaviour of Windows, and I'm saying that doing so is
known to cause variable writes on some Apple hardware to stop working.
If we're aiming for maximum compatibility, we need to call
SetVirtualAddressMap() with addresses above the canonicalisation hole.

--
Matthew Garrett | [email protected]

2013-06-03 21:19:13

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, 2013-06-03 at 19:11 +0100, Matthew Garrett wrote:
> > > The problem there is that you're saying "In theory". We know that
> > > Windows doesn't behave this way, so we have no legitimate expectation
> > > that it'll work. We know that it doesn't on some Apple hardware.
> >
> > Fine, you say we need to call SetVirtualAddressMap because windows does,
> > I agree, I'm just saying we get additional safety from calling it with
> > the 1:1 map ... I don't see what the problem is.
>
> No. I'm saying that calling it with the 1:1 map is something very
> different to the behaviour of Windows, and I'm saying that doing so is
> known to cause variable writes on some Apple hardware to stop working.
> If we're aiming for maximum compatibility, we need to call
> SetVirtualAddressMap() with addresses above the canonicalisation hole.

OK, so tell me this problem: it's a new one one me. I think you're
saying if we don't call SetVirtualAddressMap with a mapping above a
certain value, some Apple system breaks somehow? (how?).

James


2013-06-03 21:29:43

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 02:19:05PM -0700, James Bottomley wrote:
> On Mon, 2013-06-03 at 19:11 +0100, Matthew Garrett wrote:
> > No. I'm saying that calling it with the 1:1 map is something very
> > different to the behaviour of Windows, and I'm saying that doing so is
> > known to cause variable writes on some Apple hardware to stop working.
> > If we're aiming for maximum compatibility, we need to call
> > SetVirtualAddressMap() with addresses above the canonicalisation hole.
>
> OK, so tell me this problem: it's a new one one me. I think you're
> saying if we don't call SetVirtualAddressMap with a mapping above a
> certain value, some Apple system breaks somehow? (how?).

SetVariable() returns an error. Digging through disassembly of the code
generating the error got me to the point of realising it was doing some
sort of pointer arithmetic and then giving up.

--
Matthew Garrett | [email protected]

2013-06-04 08:16:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/4] EFI 1:1 mapping

On Mon, Jun 03, 2013 at 03:54:12PM +0100, Matt Fleming wrote:
> On Mon, 03 Jun, at 03:32:52PM, Matthew Garrett wrote:
> > We can only pass one set of addresses to SetVirtualAddressMap(), but it
> > doesn't seem like there's any intrinsic reason we can't the runtime
> > regions mapped to multiple virtual addresses.
>
> Indeed. That's the approach I took with my 1:1 series from last year. If
> Windows is mapping things at higher addresses like you said, then
> they're probably doing the same.

Right, the way I've done it now is to do the virtual mapping
unconditionally, then do the 1:1 mapping and call SetVirtualAddressMap
(f*cking camelcase is so nasty to type - that's why they need whole IDEs
to program :-)) with that map, if "efi=1:1_map" has been passed on the
cmd line.

After the call, we use *only* the 1:1 map but the virtual mapping is
still there.

The initial approach to addressing the b0rked Macs would then be to
never use the 1:1 map on them.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-04 12:19:07

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Sun, Jun 02, 2013 at 02:56:09PM +0200, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> .. for passing miscellaneous options from the kernel command line.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/platform/efi/efi.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 82089d8b1954..aea4337f7023 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -88,6 +88,11 @@ static u64 active_size;
>
> unsigned long x86_efi_facility;
>
> + /* 1:1 mapping of services regions */
> +#define EFI_CFG_MAP11 BIT(0)
> +
> +static unsigned long efi_config;
> +
> /*
> * Returns 1 if 'facility' is enabled, 0 otherwise.
> */
> @@ -1167,3 +1172,14 @@ efi_status_t efi_query_variable_store(u32 attributes, unsigned long size)
> return EFI_SUCCESS;
> }
> EXPORT_SYMBOL_GPL(efi_query_variable_store);
> +
> +static int __init parse_efi_cmdline(char *str)

note to self:

#ifdef CONFIG_X86_64

> +{
> + if (*str == '=')
> + str++;
> + if (!strncmp(str, "1:1_map", 7))
> + efi_config |= EFI_CFG_MAP11;

#endif

to temporarily workaround the 64-bit only requirement.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 10:07:50

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] efi: Convert runtime services function ptrs

On Sun, 02 Jun, at 02:56:07PM, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> ... to void * like the boot services and lose all the void * casts. No
> functionality change.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/boot/compressed/eboot.c | 2 +-
> arch/x86/include/asm/efi.h | 28 ++++++++++++++--------------
> include/linux/efi.h | 28 ++++++++++++++--------------
> 3 files changed, 29 insertions(+), 29 deletions(-)

Thanks! I've chucked this into my 'urgent' queue because it's a pretty
straight forward cleanup and there's no point in allowing it to bitrot
any further (the change to eboot.c is already stale).

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 10:22:39

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 2/4] x86, cpa: Map in an arbitrary pgd

On Sun, 02 Jun, at 02:56:08PM, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> Add the ability to map pages in an arbitrary pgd.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/include/asm/pgtable_types.h | 3 +-
> arch/x86/mm/pageattr.c | 80 ++++++++++++++++++++++++++++--------
> 2 files changed, 65 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index e6423002c10b..0613e147f083 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -352,7 +352,8 @@ static inline void update_page_count(int level, unsigned long pages) { }
> */
> extern pte_t *lookup_address(unsigned long address, unsigned int *level);
> extern phys_addr_t slow_virt_to_phys(void *__address);
> -
> +extern void kernel_map_pages_in_pgd(pgd_t *pgd, unsigned long address,
> + unsigned numpages, unsigned long page_flags);
> #endif /* !__ASSEMBLY__ */
>
> #endif /* _ASM_X86_PGTABLE_DEFS_H */
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index bb32480c2d71..3d64e5fc2adc 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -30,6 +30,7 @@
> */
> struct cpa_data {
> unsigned long *vaddr;
> + pgd_t *pgd;
> pgprot_t mask_set;
> pgprot_t mask_clr;
> int numpages;

[...]

> @@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
> else
> address = *cpa->vaddr;
> repeat:
> - kpte = lookup_address(address, &level);
> + if (cpa->pgd)
> + kpte = __lookup_address_in_pgd(cpa->pgd, address, &level);
> + else
> + kpte = _lookup_address_cpa(cpa, address, &level);

Don't you also need to initialise .pgd in __set_pages_p() and
__set_pages_np()?

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 10:42:29

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Sun, 02 Jun, at 02:56:09PM, Borislav Petkov wrote:
> +
> +static int __init parse_efi_cmdline(char *str)
> +{
> + if (*str == '=')
> + str++;
> + if (!strncmp(str, "1:1_map", 7))
> + efi_config |= EFI_CFG_MAP11;
> +
> + return 0;
> +}
> +early_param("efi", parse_efi_cmdline);

This is fine for testing, but I genuinely think that this should be on
by default once these patches are final.

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 13:14:44

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On Sun, 02 Jun, at 02:56:10PM, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> Due to the braindead design of EFI, we cannot map runtime services more
> than once for the duration of a booted system. Thus, if we want to use
> EFI runtime services in a kexec'ed kernel, maybe the only possible and
> sensible approach would be to map them 1:1 so that when the kexec kernel
> loads, it can simply call those addresses without the need for remapping
> (which doesn't work anyway).
>
> Furthermore, this mapping approach could be of help with b0rked EFI
> implementations for a different set of reasons.
>
> This implementation is 64-bit only for now and it boots fine in kvm with
> OVMF BIOS.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/include/asm/efi.h | 2 +
> arch/x86/platform/efi/efi.c | 161 +++++++++++++++++++++++++++++-------
> arch/x86/platform/efi/efi_stub_64.S | 48 +++++++++++
> 3 files changed, 180 insertions(+), 31 deletions(-)

This patch makes my TunnelMountain machine spin in handle_pte_fault()
when triggering one of the "firmware makes references to physical
addresses" code paths.

I'll try and dig into this tomorrow to figure out what's going on.

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 13:24:24

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 2/4] x86, cpa: Map in an arbitrary pgd

On Thu, Jun 06, 2013 at 11:22:33AM +0100, Matt Fleming wrote:
> > @@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
> > else
> > address = *cpa->vaddr;
> > repeat:
> > - kpte = lookup_address(address, &level);
> > + if (cpa->pgd)
> > + kpte = __lookup_address_in_pgd(cpa->pgd, address, &level);
> > + else
> > + kpte = _lookup_address_cpa(cpa, address, &level);
>
> Don't you also need to initialise .pgd in __set_pages_p() and
> __set_pages_np()?

Hmm, I don't think so. The idea is to leave the current functionality in
pageattr.c untouched. Currently, it maps PTEs in init_mm.pgd by default
because this is the kernel PGD.

For EFI, we switch the PGD and we want to map the PTEs in this other
PGD, thus we have our own kernel_map_pages_in_pgd() which passes in the
PGD we want to map in through struct cpa_data.

Then, we differentiate the PGD down that code by looking at cpa->pgd.

This was at least the idea.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 13:26:21

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 11:42:24AM +0100, Matt Fleming wrote:
> On Sun, 02 Jun, at 02:56:09PM, Borislav Petkov wrote:
> > +
> > +static int __init parse_efi_cmdline(char *str)
> > +{
> > + if (*str == '=')
> > + str++;
> > + if (!strncmp(str, "1:1_map", 7))
> > + efi_config |= EFI_CFG_MAP11;
> > +
> > + return 0;
> > +}
> > +early_param("efi", parse_efi_cmdline);
>
> This is fine for testing, but I genuinely think that this should be on
> by default once these patches are final.

This would break the Macs, remember?

We could make it be the default though and flip the logic so that users
can fall back to the current ioremap functionality, i.e. boot with
"efi=no_1:1_map".

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 13:29:26

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On Thu, Jun 06, 2013 at 02:14:39PM +0100, Matt Fleming wrote:
> This patch makes my TunnelMountain machine spin in handle_pte_fault()
> when triggering one of the "firmware makes references to physical
> addresses" code paths.
>
> I'll try and dig into this tomorrow to figure out what's going on.

Hmmm, looks like we have forgotten to map an EFI region and we do a #PF
when executing the EFI code.

If so, I'm thinking we don't want to touch the #PF code and teach it
about EFI so maybe we should make sure we've mapped *everything* EFI
runtime would need before we switch PGDs...

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 13:30:30

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 2/4] x86, cpa: Map in an arbitrary pgd

On Thu, 06 Jun, at 03:24:06PM, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 11:22:33AM +0100, Matt Fleming wrote:
> > > @@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
> > > else
> > > address = *cpa->vaddr;
> > > repeat:
> > > - kpte = lookup_address(address, &level);
> > > + if (cpa->pgd)
> > > + kpte = __lookup_address_in_pgd(cpa->pgd, address, &level);
> > > + else
> > > + kpte = _lookup_address_cpa(cpa, address, &level);
> >
> > Don't you also need to initialise .pgd in __set_pages_p() and
> > __set_pages_np()?
>
> Hmm, I don't think so. The idea is to leave the current functionality in
> pageattr.c untouched. Currently, it maps PTEs in init_mm.pgd by default
> because this is the kernel PGD.

I meant because the .pgd member is uninitialised and contains garbage.

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 13:33:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 2/4] x86, cpa: Map in an arbitrary pgd

On Thu, Jun 06, 2013 at 02:30:23PM +0100, Matt Fleming wrote:
> I meant because the .pgd member is uninitialised and contains garbage.

Oh yeah, good catch! Fixed.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 13:36:14

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, 06 Jun, at 03:26:03PM, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 11:42:24AM +0100, Matt Fleming wrote:
> > On Sun, 02 Jun, at 02:56:09PM, Borislav Petkov wrote:
> > > +
> > > +static int __init parse_efi_cmdline(char *str)
> > > +{
> > > + if (*str == '=')
> > > + str++;
> > > + if (!strncmp(str, "1:1_map", 7))
> > > + efi_config |= EFI_CFG_MAP11;
> > > +
> > > + return 0;
> > > +}
> > > +early_param("efi", parse_efi_cmdline);
> >
> > This is fine for testing, but I genuinely think that this should be on
> > by default once these patches are final.
>
> This would break the Macs, remember?
>
> We could make it be the default though and flip the logic so that users
> can fall back to the current ioremap functionality, i.e. boot with
> "efi=no_1:1_map".

Yeah, or white/blacklist them. We've already had enough of "Oh, your
machine used to work but now it's bust? Yeah, turn on this cmdline
parameter to get things working again".

But the point is that the funnies that have been seen on the Macs should
be treated as the unusual case. The number of machines I've seen that
require this 1:1 map is staggering, especially once you start poking at
some of the other runtime services, like UpdateCapsule().

--
Matt Fleming, Intel Open Source Technology Center

2013-06-06 15:58:41

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On Thu, Jun 06, 2013 at 03:29:08PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 02:14:39PM +0100, Matt Fleming wrote:
> > This patch makes my TunnelMountain machine spin in handle_pte_fault()
> > when triggering one of the "firmware makes references to physical
> > addresses" code paths.
> >
> > I'll try and dig into this tomorrow to figure out what's going on.
>
> Hmmm, looks like we have forgotten to map an EFI region and we do a #PF
> when executing the EFI code.

How about this one - I don't know whether it would flood dmesg, I guess
you'll have to try.

We could try using blockconsole to catch dmesg to an usb drive, though,
even if you don't have serial or netconsole or such. :-).

--
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 654be4ae3047..7a6129afdff1 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1021,6 +1021,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
/* Get the faulting address: */
address = read_cr2();

+ if (read_cr3() == (unsigned long)real_mode_header->trampoline_pgd)
+ pr_err("%s: #PF addr: 0x%lx\n", __func__, address);
+
/*
* Detect and handle instructions that would cause a page fault for
* both a tracked kernel page and a userspace page.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 17:51:09

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 03:26:03PM +0200, Borislav Petkov wrote:

> This would break the Macs, remember?

I think the Macs will be fine as long as we're passing the high mappings
into SetVirtualAddressMap().

--
Matthew Garrett | [email protected]

2013-06-06 18:51:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 06:50:52PM +0100, Matthew Garrett wrote:
> On Thu, Jun 06, 2013 at 03:26:03PM +0200, Borislav Petkov wrote:
>
> > This would break the Macs, remember?
>
> I think the Macs will be fine as long as we're passing the high mappings
> into SetVirtualAddressMap().

Right, on those we'll fall back to the current mappings and simply not
have the 1:1 thing.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 19:29:01

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On 06/06/2013 08:58 AM, Borislav Petkov wrote:
> --
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 654be4ae3047..7a6129afdff1 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1021,6 +1021,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
> /* Get the faulting address: */
> address = read_cr2();
>
> + if (read_cr3() == (unsigned long)real_mode_header->trampoline_pgd)
> + pr_err("%s: #PF addr: 0x%lx\n", __func__, address);
> +
> /*
> * Detect and handle instructions that would cause a page fault for
> * both a tracked kernel page and a userspace page.
>

Or we could materialize mappings for this specific PGD. However, adding
a read of %cr3 in __do_page_fault sounds expensive.

-hpa

2013-06-06 19:35:56

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 08:51:40PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 06:50:52PM +0100, Matthew Garrett wrote:
> > On Thu, Jun 06, 2013 at 03:26:03PM +0200, Borislav Petkov wrote:
> >
> > > This would break the Macs, remember?
> >
> > I think the Macs will be fine as long as we're passing the high mappings
> > into SetVirtualAddressMap().
>
> Right, on those we'll fall back to the current mappings and simply not
> have the 1:1 thing.

No, I think that's the wrong thing to do. We should set up the current
mappings and the 1:1 mappings, and pass the current mappings through
SetVirtualAddressMap(). That matches the behaviour of Windows.

--
Matthew Garrett | [email protected]

2013-06-06 19:36:31

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On Thu, Jun 06, 2013 at 12:28:20PM -0700, H. Peter Anvin wrote:
> Or we could materialize mappings for this specific PGD. However,
> adding a read of %cr3 in __do_page_fault sounds expensive.

Yes, I think we want to make sure all mappings are there when we do an
EFI runtime call so that we never #PF while it executes.

Matt mentioned on IRC that the it could be that his EFI runtime is
referencing EFI_RESERVED area which we don't map. However, we need to
confirm/disprove that first, as it is currently only a hunch.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 19:38:45

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On 06/06/2013 12:36 PM, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 12:28:20PM -0700, H. Peter Anvin wrote:
>> Or we could materialize mappings for this specific PGD. However,
>> adding a read of %cr3 in __do_page_fault sounds expensive.
>
> Yes, I think we want to make sure all mappings are there when we do an
> EFI runtime call so that we never #PF while it executes.
>
> Matt mentioned on IRC that the it could be that his EFI runtime is
> referencing EFI_RESERVED area which we don't map. However, we need to
> confirm/disprove that first, as it is currently only a hunch.

If it is true, then maybe a #PF handler which materializes mappings on
demand might be the right thing to do.

Again, it would be interesting to stop a WinX VM in the process of doing
an EFI call and see exactly what it has set up in its page tables.

-hpa

2013-06-06 19:41:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 08:35:48PM +0100, Matthew Garrett wrote:
> No, I think that's the wrong thing to do. We should set up the current
> mappings and the 1:1 mappings, and pass the current mappings through
> SetVirtualAddressMap(). That matches the behaviour of Windows.

And when do we use the 1:1 mappings and when the current mappings when
doing runtime calls?

Also, would the 1:1 mappings even work if not passed through
SetVirtualAddressMap? I'm sensing a "yes" but I don't know...

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 19:54:55

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 09:41:34PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 08:35:48PM +0100, Matthew Garrett wrote:
> > No, I think that's the wrong thing to do. We should set up the current
> > mappings and the 1:1 mappings, and pass the current mappings through
> > SetVirtualAddressMap(). That matches the behaviour of Windows.
>
> And when do we use the 1:1 mappings and when the current mappings when
> doing runtime calls?

We want both to be available when we're making the call, but I think we
should probably enter via the high addresses. The only reason we're
doing this at all is that some systems don't update all of their
pointers from physical mode, and we'd prefer them to work rather than
fault...

--
Matthew Garrett | [email protected]

2013-06-06 20:07:37

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 08:54:50PM +0100, Matthew Garrett wrote:
> We want both to be available when we're making the call, but I think
> we should probably enter via the high addresses. The only reason we're
> doing this at all is that some systems don't update all of their
> pointers from physical mode, and we'd prefer them to work rather than
> fault...

Actually, we do the 1:1 thing so that EFI runtime works in a kexec
kernel too. Which won't work if we use the high addresses.

However, if we can use the 1:1 map *after* SetVirtualAddressMap() has
been called with the high mappings, then my issue is solved - we drop
to using 1:1 in the kexec kernel only. But I don't think that is the
case...

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 20:18:39

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 10:07:05PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 08:54:50PM +0100, Matthew Garrett wrote:
> > We want both to be available when we're making the call, but I think
> > we should probably enter via the high addresses. The only reason we're
> > doing this at all is that some systems don't update all of their
> > pointers from physical mode, and we'd prefer them to work rather than
> > fault...
>
> Actually, we do the 1:1 thing so that EFI runtime works in a kexec
> kernel too. Which won't work if we use the high addresses.

kexec seems like a lower priority than compatibility. Perhaps keep the
efi argument for people who want to use kexec? hpa suggested allocating
a fixed high area for UEFI mappings, which would also solve this.

> However, if we can use the 1:1 map *after* SetVirtualAddressMap() has
> been called with the high mappings, then my issue is solved - we drop
> to using 1:1 in the kexec kernel only. But I don't think that is the
> case...

Yeah, calling SetVirtualAddressMap() with high addresses is going to
result in the firmware having a bunch of pointers to high addresses.

--
Matthew Garrett | [email protected]

2013-06-06 20:27:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 09:18:28PM +0100, Matthew Garrett wrote:
> kexec seems like a lower priority than compatibility. Perhaps keep the
> efi argument for people who want to use kexec?

This is what I currently have in the code: if you boot with efi=1:1_map,
you get them.

> hpa suggested allocating a fixed high area for UEFI mappings, which
> would also solve this.

I guess we can do that too.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 20:31:03

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 10:27:17PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 09:18:28PM +0100, Matthew Garrett wrote:
> > kexec seems like a lower priority than compatibility. Perhaps keep the
> > efi argument for people who want to use kexec?
>
> This is what I currently have in the code: if you boot with efi=1:1_map,
> you get them.

Well, we want the 1:1 mappings to exist all the time. The only thing the
option should change is whether they're passed to SetVirtualAddressMap()
or not.

--
Matthew Garrett | [email protected]

2013-06-06 20:44:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 09:30:57PM +0100, Matthew Garrett wrote:
> Well, we want the 1:1 mappings to exist all the time. The only
> thing the option should change is whether they're passed to
> SetVirtualAddressMap() or not.

But can you call them even if they haven't been passed through
SetVirtualAddressMap, *after* SetVirtualAddressMap has been called?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 20:51:04

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 10:44:10PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 09:30:57PM +0100, Matthew Garrett wrote:
> > Well, we want the 1:1 mappings to exist all the time. The only
> > thing the option should change is whether they're passed to
> > SetVirtualAddressMap() or not.
>
> But can you call them even if they haven't been passed through
> SetVirtualAddressMap, *after* SetVirtualAddressMap has been called?

What do you mean by "call them"? I don't think we ever want to call by
physical address, other than maybe in the kexec case. The only reason we
really care about the physical addresses being mapped 1:1 is that some
pointers may not have been updated.

--
Matthew Garrett | [email protected]

2013-06-06 21:02:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 09:50:57PM +0100, Matthew Garrett wrote:
> What do you mean by "call them"? I don't think we ever want to call by
> physical address, other than maybe in the kexec case. The only reason
> we really care about the physical addresses being mapped 1:1 is that
> some pointers may not have been updated.

I want to be able to call the runtime services in the kexec kernel.
Which means, the kexec kernel would simply map the runtime code/data
regions 1:1 and then use the physical addresses to call the runtime
services.

Question is: would that work even if SetVirtualAddressMap has already
run in the original kernel and with virtual addresses?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-06 21:03:47

by Matthew Garrett

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

On Thu, Jun 06, 2013 at 11:02:18PM +0200, Borislav Petkov wrote:
> On Thu, Jun 06, 2013 at 09:50:57PM +0100, Matthew Garrett wrote:
> > What do you mean by "call them"? I don't think we ever want to call by
> > physical address, other than maybe in the kexec case. The only reason
> > we really care about the physical addresses being mapped 1:1 is that
> > some pointers may not have been updated.
>
> I want to be able to call the runtime services in the kexec kernel.
> Which means, the kexec kernel would simply map the runtime code/data
> regions 1:1 and then use the physical addresses to call the runtime
> services.
>
> Question is: would that work even if SetVirtualAddressMap has already
> run in the original kernel and with virtual addresses?

No. You'll need to have an option for that case.

--
Matthew Garrett | [email protected]

2013-06-10 12:55:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

On Thu, Jun 06, 2013 at 12:38:10PM -0700, H. Peter Anvin wrote:
> On 06/06/2013 12:36 PM, Borislav Petkov wrote:
> > On Thu, Jun 06, 2013 at 12:28:20PM -0700, H. Peter Anvin wrote:
> >> Or we could materialize mappings for this specific PGD. However,
> >> adding a read of %cr3 in __do_page_fault sounds expensive.
> >
> > Yes, I think we want to make sure all mappings are there when we do an
> > EFI runtime call so that we never #PF while it executes.
> >
> > Matt mentioned on IRC that the it could be that his EFI runtime is
> > referencing EFI_RESERVED area which we don't map. However, we need to
> > confirm/disprove that first, as it is currently only a hunch.

FWIW,

booting the patchset on my Dell looks good here. Booting at least, I
don't know about other stuff. If you have an EFI test suite or want me
to try stuff out, let me know. efibootmgr output looks sane too.

Btw, I've added a printk to the code so that we know that we've managed
switching to the 1:1 thing:

[ 0.073119] efi: Using 1:1 map.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-11 06:49:18

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] efi: Convert runtime services function ptrs

On Thu, 06 Jun, at 11:07:45AM, Matt Fleming wrote:
> Thanks! I've chucked this into my 'urgent' queue because it's a pretty
> straight forward cleanup and there's no point in allowing it to bitrot
> any further (the change to eboot.c is already stale).

OK, I chickened out of sending this in my latest pull request after
reading Linus' -rc5 email about him not wanting to see any non-critical
changes. I've stuck it in the 'next' branch with the rest of the stuff
for v3.11.

--
Matt Fleming, Intel Open Source Technology Center

2013-06-11 09:33:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/4] efi: Convert runtime services function ptrs

On Tue, Jun 11, 2013 at 07:49:12AM +0100, Matt Fleming wrote:
> OK, I chickened out of sending this in my latest pull request
> after reading Linus' -rc5 email about him not wanting to see any
> non-critical changes. I've stuck it in the 'next' branch with the rest
> of the stuff for v3.11.

Yep, did the same with a couple of edac patches too - cleanups can wait.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-02-10 04:04:34

by Yuhong Bao

[permalink] [raw]
Subject: Linux, UEFI, and Chromebooks (was RE: [PATCH 0/4] EFI 1:1 mapping)

> On Mon, Jun 03, 2013 at 09:35:07AM -0700, James Bottomley wrote:
>> On Mon, 2013-06-03 at 17:24 +0100, Matthew Garrett wrote:
>>> That seems optimistic. Windows never calls QueryVariableInfo() during
>>> boot services, so what makes you think doing so has ever been tested?
>>
>> It's used by the UEFI shell package ... every system which boots to the
>> shell automatically tests this. I know no locked down UEFI system ships
>> with a shell but almost every system in test has a Shell in some form,
>> so I think its fairly safe to call it from boot services.
>
> Why do you persist in this belief that all system vendors are going to
> have run a shell, let alone any kind of test suite? That runs counter to
> everything we've learned about x86 firmware. People verify that it runs
> Windows and then ship it.
What is frustrating here is that Google decided that x86 Chromebooks should use different firmware, otherwise it would be easier to convince vendor to fix these firmware bugs.

Yuhong Bao -