2020-01-02 04:31:03

by Dan Williams

Subject: [PATCH v3 0/4] efi: Fix handling of multiple efi_fake_mem= entries

Changes since v2 [1]:
- Defer efi_memmap_free() until efi_memmap_install() has committed to
installing the new map. (Dave).

- Handle the case of a memblock allocated memmap being freed after the slab
allocator is up. Just use memblock_free_late() for that case rather
than warn. (Prompted by Dave's feedback on how many successful
efi_memmap_free() calls occur during a boot).

- Nothing further was changed in response to Dave's concern about
efi_fake_mem= being applied to overlapping entries. I tested
"efi_fake_mem=4G@9G:0x40000,4G@12G:0x40000" which triggers the second
entry to overwrite the first as well as another entry. The result is
reasonable and functional for what is otherwise garbage input:

efi: mem53: [Conventional Memory| | |SP| | | | | | |WB|WT|WC|UC] range=[0x240000000-0x2ffffffff] (3072MB)
efi: mem54: [Conventional Memory| | |SP| | | | | | |WB|WT|WC|UC] range=[0x300000000-0x33fffffff] (1024MB)
efi: mem55: [Conventional Memory| | |SP| | | | | | |WB|WT|WC|UC] range=[0x340000000-0x3ffffffff] (3072MB)
efi: mem56: [Conventional Memory| | | | | | | | | |WB|WT|WC|UC] range=[0x400000000-0x43fffffff] (1024MB)

# cat /proc/iomem | grep Sof
240000000-3ffffffff : Soft Reserved

[1]: http://lore.kernel.org/r/157782985777.367056.14741265874314204783.stgit@dwillia2-desk3.amr.corp.intel.com

---

An upcoming patchset to enhance the "soft reservation" implementation
started crashing in testing once it was rebased on v5.5-rc3. This
uncovered a few bugs in the efi_fake_mem= handling, along with
efi_memmap_alloc() leaks.

---

Copied from patch4:

Dave noticed that when specifying multiple efi_fake_mem= entries only
the last entry was successfully being reflected in the efi memory map.
This is because efi_memmap_insert() is called once per entry, but each
call starts from the original map at efi_fake_memmap() entry instead of
the map produced by the previous insertion.

Rework efi_fake_memmap() to install the new memory map after each
efi_fake_mem= entry is parsed.
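
The difference between the two behaviours can be sketched with a toy
userspace model. This is only an illustration, not kernel code: the
toy_* names, the slot-based map, and both apply_all_* helpers are
hypothetical, and toy_insert() only mimics the "rebuild the destination
from the source plus one range" shape of efi_memmap_insert().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NSLOTS 16

/* Toy "memory map": one attribute flag per slot (hypothetical model). */
struct toy_map {
	bool soft_reserved[NSLOTS];
};

struct toy_range {
	int start;	/* first slot, inclusive */
	int end;	/* last slot, inclusive */
};

/* Rebuild *dst from *src and then mark one range in it. */
void toy_insert(const struct toy_map *src, struct toy_map *dst,
		const struct toy_range *r)
{
	assert(r->start >= 0 && r->end < NSLOTS && r->start <= r->end);
	memcpy(dst, src, sizeof(*dst));
	for (int i = r->start; i <= r->end; i++)
		dst->soft_reserved[i] = true;
}

/* Old behaviour: every insertion starts again from the original map. */
void apply_all_from_original(const struct toy_map *orig, struct toy_map *out,
			     const struct toy_range *r, int n)
{
	memcpy(out, orig, sizeof(*out));
	for (int i = 0; i < n; i++)
		toy_insert(orig, out, &r[i]);
}

/* Reworked behaviour: the result is installed after each entry. */
void apply_all_installing_each(const struct toy_map *orig,
			       struct toy_map *out,
			       const struct toy_range *r, int n)
{
	struct toy_map cur = *orig;

	for (int i = 0; i < n; i++) {
		struct toy_map next;

		toy_insert(&cur, &next, &r[i]);
		cur = next;	/* "install": later entries build on this */
	}
	*out = cur;
}

/* Count the slots that ended up marked. */
int count_reserved(const struct toy_map *m)
{
	int n = 0;

	for (int i = 0; i < NSLOTS; i++)
		n += m->soft_reserved[i];
	return n;
}
```

With two disjoint ranges, the first helper leaves only the second range
marked, while the second helper leaves both marked.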

This also fixes an issue in efi_fake_memmap() that caused it to litter
empty entries into the end of the efi memory map. The empty entry causes
efi_memmap_insert() to attempt more memmap splits / copies than
efi_memmap_split_count() accounted for when sizing the new map.

BUG: unable to handle page fault for address: ffffffffff281000
[..]
RIP: 0010:efi_memmap_insert+0x11d/0x191
[..]
Call Trace:
? bgrt_init+0xbe/0xbe
? efi_arch_mem_reserve+0x1cb/0x228
? acpi_parse_bgrt+0xa/0xd
? acpi_table_parse+0x86/0xb8
? acpi_boot_init+0x494/0x4e3
? acpi_parse_x2apic+0x87/0x87
? setup_acpi_sci+0xa2/0xa2
? setup_arch+0x8db/0x9e1
? start_kernel+0x6a/0x547
? secondary_startup_64+0xb6/0xc0

Commit af1648984828 "x86/efi: Update e820 with reserved EFI boot
services data to fix kexec breakage" is listed in Fixes: since it
introduces more occurrences where efi_memmap_insert() is invoked after
an efi_fake_mem= configuration has been parsed. Previously the side
effects of vestigial empty entries were benign, but with commit
af1648984828 that follow-on efi_memmap_insert() invocation triggers the
above crash signature.

---

Dan Williams (4):
efi: Add a flags parameter to efi_memory_map
efi: Add tracking for dynamically allocated memmaps
efi: Fix efi_memmap_alloc() leaks
efi: Fix handling of multiple efi_fake_mem= entries


arch/x86/platform/efi/efi.c | 2 +
arch/x86/platform/efi/quirks.c | 11 ++++---
drivers/firmware/efi/fake_mem.c | 37 +++++++++++++------------
drivers/firmware/efi/memmap.c | 58 ++++++++++++++++++++++++++++++---------
include/linux/efi.h | 13 +++++++--
5 files changed, 81 insertions(+), 40 deletions(-)


2020-01-02 04:31:04

by Dan Williams

Subject: [PATCH v3 2/4] efi: Add tracking for dynamically allocated memmaps

In preparation for fixing efi_memmap_alloc() leaks, add support for
recording whether the memmap was dynamically allocated from slab,
memblock, or is the original physical memmap provided by the platform.
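
As a rough userspace sketch of the bookkeeping described above: the
three flag values are the ones this patch defines in
include/linux/efi.h, while toy_slab_up (a stand-in for
slab_is_available()), toy_memmap_alloc(), enum toy_origin, and
toy_memmap_origin() are hypothetical names used only for illustration.
No memory is actually allocated here.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined by this patch in include/linux/efi.h. */
#define EFI_MEMMAP_LATE     (1UL << 0)
#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
#define EFI_MEMMAP_SLAB     (1UL << 2)

/* Hypothetical stand-in for slab_is_available(). */
bool toy_slab_up;

enum toy_origin {
	TOY_FIRMWARE,	/* original map provided by the platform */
	TOY_MEMBLOCK,
	TOY_SLAB,
};

/* Record which allocator would back the new map. */
void toy_memmap_alloc(unsigned long *flags)
{
	assert(flags);
	if (toy_slab_up)
		*flags |= EFI_MEMMAP_SLAB;
	else
		*flags |= EFI_MEMMAP_MEMBLOCK;
}

/* Classify a map by its flags; a platform map carries neither bit. */
enum toy_origin toy_memmap_origin(unsigned long flags)
{
	if (flags & EFI_MEMMAP_SLAB)
		return TOY_SLAB;
	if (flags & EFI_MEMMAP_MEMBLOCK)
		return TOY_MEMBLOCK;
	return TOY_FIRMWARE;
}
```

A caller starts with flags = 0, lets the allocation step fill in the
allocator bit, and hands the same value to the install step so that the
installed map remembers where its backing memory came from.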

Cc: Taku Izumi <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
arch/x86/platform/efi/efi.c | 2 +-
arch/x86/platform/efi/quirks.c | 11 ++++++-----
drivers/firmware/efi/fake_mem.c | 5 +++--
drivers/firmware/efi/memmap.c | 16 ++++++++++------
include/linux/efi.h | 8 ++++++--
5 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 38d44f36d5ed..7086afbb84fd 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -333,7 +333,7 @@ static void __init efi_clean_memmap(void)
u64 size = efi.memmap.nr_map - n_removal;

pr_warn("Removing %d invalid memory map entries.\n", n_removal);
- efi_memmap_install(efi.memmap.phys_map, size);
+ efi_memmap_install(efi.memmap.phys_map, size, 0);
}
}

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f8f0220b6a66..4a71c790f9c3 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -244,6 +244,7 @@ EXPORT_SYMBOL_GPL(efi_query_variable_store);
void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
{
phys_addr_t new_phys, new_size;
+ unsigned long flags = 0;
struct efi_mem_range mr;
efi_memory_desc_t md;
int num_entries;
@@ -272,8 +273,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
num_entries += efi.memmap.nr_map;

new_size = efi.memmap.desc_size * num_entries;
-
- new_phys = efi_memmap_alloc(num_entries);
+ new_phys = efi_memmap_alloc(num_entries, &flags);
if (!new_phys) {
pr_err("Could not allocate boot services memmap\n");
return;
@@ -288,7 +288,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
efi_memmap_insert(&efi.memmap, new, &mr);
early_memunmap(new, new_size);

- efi_memmap_install(new_phys, num_entries);
+ efi_memmap_install(new_phys, num_entries, flags);
e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
e820__update_table(e820_table);
}
@@ -408,6 +408,7 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)
void __init efi_free_boot_services(void)
{
phys_addr_t new_phys, new_size;
+ unsigned long flags = 0;
efi_memory_desc_t *md;
int num_entries = 0;
void *new, *new_md;
@@ -463,7 +464,7 @@ void __init efi_free_boot_services(void)
return;

new_size = efi.memmap.desc_size * num_entries;
- new_phys = efi_memmap_alloc(num_entries);
+ new_phys = efi_memmap_alloc(num_entries, &flags);
if (!new_phys) {
pr_err("Failed to allocate new EFI memmap\n");
return;
@@ -493,7 +494,7 @@ void __init efi_free_boot_services(void)

memunmap(new);

- if (efi_memmap_install(new_phys, num_entries)) {
+ if (efi_memmap_install(new_phys, num_entries, flags)) {
pr_err("Could not install new EFI memmap\n");
return;
}
diff --git a/drivers/firmware/efi/fake_mem.c b/drivers/firmware/efi/fake_mem.c
index bb9fc70d0cfa..7e53e5520548 100644
--- a/drivers/firmware/efi/fake_mem.c
+++ b/drivers/firmware/efi/fake_mem.c
@@ -39,6 +39,7 @@ void __init efi_fake_memmap(void)
int new_nr_map = efi.memmap.nr_map;
efi_memory_desc_t *md;
phys_addr_t new_memmap_phy;
+ unsigned long flags = 0;
void *new_memmap;
int i;

@@ -55,7 +56,7 @@ void __init efi_fake_memmap(void)
}

/* allocate memory for new EFI memmap */
- new_memmap_phy = efi_memmap_alloc(new_nr_map);
+ new_memmap_phy = efi_memmap_alloc(new_nr_map, &flags);
if (!new_memmap_phy)
return;

@@ -73,7 +74,7 @@ void __init efi_fake_memmap(void)
/* swap into new EFI memmap */
early_memunmap(new_memmap, efi.memmap.desc_size * new_nr_map);

- efi_memmap_install(new_memmap_phy, new_nr_map);
+ efi_memmap_install(new_memmap_phy, new_nr_map, flags);

/* print new EFI memmap */
efi_print_memmap();
diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index 813674ef9000..2b81ee6858a9 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -32,6 +32,7 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
/**
* efi_memmap_alloc - Allocate memory for the EFI memory map
* @num_entries: Number of entries in the allocated map.
+ * @flags: Late map, memblock alloc, slab alloc flags
*
* Depending on whether mm_init() has already been invoked or not,
* either memblock or "normal" page allocation is used.
@@ -39,20 +40,23 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
* Returns the physical address of the allocated memory map on
* success, zero on failure.
*/
-phys_addr_t __init efi_memmap_alloc(unsigned int num_entries)
+phys_addr_t __init efi_memmap_alloc(unsigned int num_entries, unsigned long *flags)
{
unsigned long size = num_entries * efi.memmap.desc_size;

- if (slab_is_available())
+ if (slab_is_available()) {
+ *flags |= EFI_MEMMAP_SLAB;
return __efi_memmap_alloc_late(size);
+ }

+ *flags |= EFI_MEMMAP_MEMBLOCK;
return __efi_memmap_alloc_early(size);
}

/**
* __efi_memmap_init - Common code for mapping the EFI memory map
* @data: EFI memory map data
- * @flags: Use early or late mapping function?
+ * @flags: Use early or late mapping function, and allocator
*
* This function takes care of figuring out which function to use to
* map the EFI memory map in efi.memmap based on how far into the boot
@@ -192,10 +196,10 @@ int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size)
*
* Returns zero on success, a negative error code on failure.
*/
-int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map)
+int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map,
+ unsigned long flags)
{
struct efi_memory_map_data data;
- unsigned long flags;

efi_memmap_unmap();

@@ -203,7 +207,7 @@ int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map)
data.size = efi.memmap.desc_size * nr_map;
data.desc_version = efi.memmap.desc_version;
data.desc_size = efi.memmap.desc_size;
- flags = efi.memmap.flags & EFI_MEMMAP_LATE;
+ flags |= efi.memmap.flags & EFI_MEMMAP_LATE;

return __efi_memmap_init(&data, flags);
}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index b8e930f5ff77..fa2668a992ae 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -796,6 +796,8 @@ struct efi_memory_map {
unsigned long desc_version;
unsigned long desc_size;
#define EFI_MEMMAP_LATE (1UL << 0)
+#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
+#define EFI_MEMMAP_SLAB (1UL << 2)
unsigned long flags;
};

@@ -1057,11 +1059,13 @@ static inline efi_status_t efi_query_variable_store(u32 attributes,
#endif
extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);

-extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries);
+extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries,
+ unsigned long *flags);
extern int __init efi_memmap_init_early(struct efi_memory_map_data *data);
extern int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size);
extern void __init efi_memmap_unmap(void);
-extern int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map);
+extern int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map,
+ unsigned long flags);
extern int __init efi_memmap_split_count(efi_memory_desc_t *md,
struct range *range);
extern void __init efi_memmap_insert(struct efi_memory_map *old_memmap,

2020-01-02 04:32:18

by Dan Williams

Subject: [PATCH v3 3/4] efi: Fix efi_memmap_alloc() leaks

With efi_fake_memmap() and efi_arch_mem_reserve() the efi table may be
updated and replaced multiple times. When that happens a previous
dynamically allocated efi memory map can be garbage collected. Use the
new EFI_MEMMAP_{SLAB,MEMBLOCK} flags to detect when a dynamically
allocated memory map is being replaced.

Debug statements in efi_memmap_free() reveal:

efi: __efi_memmap_free:37: phys: 0x23ffdd580 size: 2688 flags: 0x2
efi: __efi_memmap_free:37: phys: 0x9db00 size: 2640 flags: 0x2
efi: __efi_memmap_free:37: phys: 0x9e580 size: 2640 flags: 0x2

...a savings of 7968 bytes on a qemu boot with 2 entries specified to
efi_fake_mem=.
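
The free-on-replace idea and the arithmetic behind the figure above can
be modelled in userspace, with malloc()/free() standing in for the
kernel allocators. Everything named toy_* is hypothetical, and the
single TOY_DYNAMIC flag collapses the two EFI_MEMMAP_{SLAB,MEMBLOCK}
cases into one for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_DYNAMIC (1UL << 0)	/* hypothetical: buffer came from malloc() */

struct toy_memmap {
	void *buf;
	size_t size;
	unsigned long flags;
};

/* Running total of bytes released by toy_install(). */
size_t toy_bytes_freed;

/* Replace *cur with a new buffer, releasing the old one if we own it. */
void toy_install(struct toy_memmap *cur, void *buf, size_t size,
		 unsigned long flags)
{
	assert(cur);
	if (cur->flags & TOY_DYNAMIC) {
		free(cur->buf);
		toy_bytes_freed += cur->size;
	}
	cur->buf = buf;
	cur->size = size;
	cur->flags = flags;
}
```

Starting from a static "firmware" map and installing dynamically
allocated maps of 2688, 2640, 2640 and then one more of any size
releases the first three, which adds up to the 2688 + 2640 + 2640 =
7968 bytes seen in the debug output. The first replacement frees
nothing because the firmware map does not carry the dynamic flag.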

Cc: Taku Izumi <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/firmware/efi/memmap.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index 2b81ee6858a9..46c8b4056cc1 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -29,6 +29,28 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
return PFN_PHYS(page_to_pfn(p));
}

+static void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
+{
+ if (flags & EFI_MEMMAP_MEMBLOCK) {
+ if (slab_is_available())
+ memblock_free_late(phys, size);
+ else
+ memblock_free(phys, size);
+ } else if (flags & EFI_MEMMAP_SLAB) {
+ struct page *p = pfn_to_page(PHYS_PFN(phys));
+ unsigned int order = get_order(size);
+
+ free_pages((unsigned long) page_address(p), order);
+ }
+}
+
+static void __init efi_memmap_free(void)
+{
+ __efi_memmap_free(efi.memmap.phys_map,
+ efi.memmap.desc_size * efi.memmap.nr_map,
+ efi.memmap.flags);
+}
+
/**
* efi_memmap_alloc - Allocate memory for the EFI memory map
* @num_entries: Number of entries in the allocated map.
@@ -90,6 +112,8 @@ __efi_memmap_init(struct efi_memory_map_data *data, unsigned long flags)
return -ENOMEM;
}

+ efi_memmap_free();
+
map.phys_map = data->phys_map;
map.nr_map = data->size / data->desc_size;
map.map_end = map.map + data->size;

2020-01-02 04:32:39

by Dan Williams

Subject: [PATCH v3 4/4] efi: Fix handling of multiple efi_fake_mem= entries

Dave noticed that when specifying multiple efi_fake_mem= entries only
the last entry was successfully being reflected in the efi memory map.
This is because efi_memmap_insert() is called once per entry, but each
call starts from the original map at efi_fake_memmap() entry instead of
the map produced by the previous insertion.

Rework efi_fake_memmap() to install the new memory map after each
efi_fake_mem= entry is parsed.

This also fixes an issue in efi_fake_memmap() that caused it to litter
empty entries into the end of the efi memory map. The empty entry causes
efi_memmap_insert() to attempt more memmap splits / copies than
efi_memmap_split_count() accounted for when sizing the new map.

BUG: unable to handle page fault for address: ffffffffff281000
[..]
RIP: 0010:efi_memmap_insert+0x11d/0x191
[..]
Call Trace:
? bgrt_init+0xbe/0xbe
? efi_arch_mem_reserve+0x1cb/0x228
? acpi_parse_bgrt+0xa/0xd
? acpi_table_parse+0x86/0xb8
? acpi_boot_init+0x494/0x4e3
? acpi_parse_x2apic+0x87/0x87
? setup_acpi_sci+0xa2/0xa2
? setup_arch+0x8db/0x9e1
? start_kernel+0x6a/0x547
? secondary_startup_64+0xb6/0xc0

Commit af1648984828 "x86/efi: Update e820 with reserved EFI boot
services data to fix kexec breakage" is listed in Fixes: since it
introduces more occurrences where efi_memmap_insert() is invoked after
an efi_fake_mem= configuration has been parsed. Previously the side
effects of vestigial empty entries were benign, but with commit
af1648984828 that follow-on efi_memmap_insert() invocation triggers the
above crash signature.

Fixes: 0f96a99dab36 ("efi: Add 'efi_fake_mem' boot option")
Fixes: af1648984828 ("x86/efi: Update e820 with reserved EFI boot services...")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Dave Young <[email protected]>
Cc: Taku Izumi <[email protected]>
Cc: Michael Weiser <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/firmware/efi/fake_mem.c | 32 +++++++++++++++++---------------
drivers/firmware/efi/memmap.c | 2 +-
include/linux/efi.h | 2 ++
3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/firmware/efi/fake_mem.c b/drivers/firmware/efi/fake_mem.c
index 7e53e5520548..68d752d8af21 100644
--- a/drivers/firmware/efi/fake_mem.c
+++ b/drivers/firmware/efi/fake_mem.c
@@ -34,26 +34,17 @@ static int __init cmp_fake_mem(const void *x1, const void *x2)
return 0;
}

-void __init efi_fake_memmap(void)
+static void __init efi_fake_range(struct efi_mem_range *efi_range)
{
int new_nr_map = efi.memmap.nr_map;
efi_memory_desc_t *md;
phys_addr_t new_memmap_phy;
unsigned long flags = 0;
void *new_memmap;
- int i;
-
- if (!efi_enabled(EFI_MEMMAP) || !nr_fake_mem)
- return;

/* count up the number of EFI memory descriptor */
- for (i = 0; i < nr_fake_mem; i++) {
- for_each_efi_memory_desc(md) {
- struct range *r = &efi_fake_mems[i].range;
-
- new_nr_map += efi_memmap_split_count(md, r);
- }
- }
+ for_each_efi_memory_desc(md)
+ new_nr_map += efi_memmap_split_count(md, &efi_range->range);

/* allocate memory for new EFI memmap */
new_memmap_phy = efi_memmap_alloc(new_nr_map, &flags);
@@ -64,17 +55,28 @@ void __init efi_fake_memmap(void)
new_memmap = early_memremap(new_memmap_phy,
efi.memmap.desc_size * new_nr_map);
if (!new_memmap) {
- memblock_free(new_memmap_phy, efi.memmap.desc_size * new_nr_map);
+ __efi_memmap_free(new_memmap_phy,
+ efi.memmap.desc_size * new_nr_map, flags);
return;
}

- for (i = 0; i < nr_fake_mem; i++)
- efi_memmap_insert(&efi.memmap, new_memmap, &efi_fake_mems[i]);
+ efi_memmap_insert(&efi.memmap, new_memmap, efi_range);

/* swap into new EFI memmap */
early_memunmap(new_memmap, efi.memmap.desc_size * new_nr_map);

efi_memmap_install(new_memmap_phy, new_nr_map, flags);
+}
+
+void __init efi_fake_memmap(void)
+{
+ int i;
+
+ if (!efi_enabled(EFI_MEMMAP) || !nr_fake_mem)
+ return;
+
+ for (i = 0; i < nr_fake_mem; i++)
+ efi_fake_range(&efi_fake_mems[i]);

/* print new EFI memmap */
efi_print_memmap();
diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index 46c8b4056cc1..157b7776caf5 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -29,7 +29,7 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
return PFN_PHYS(page_to_pfn(p));
}

-static void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
+void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
{
if (flags & EFI_MEMMAP_MEMBLOCK) {
if (slab_is_available())
diff --git a/include/linux/efi.h b/include/linux/efi.h
index fa2668a992ae..6ae31e064321 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1061,6 +1061,8 @@ extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);

extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries,
unsigned long *flags);
+extern void __efi_memmap_free(u64 phys, unsigned long size,
+ unsigned long flags);
extern int __init efi_memmap_init_early(struct efi_memory_map_data *data);
extern int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size);
extern void __init efi_memmap_unmap(void);

2020-01-02 09:04:30

by Ard Biesheuvel

Subject: Re: [PATCH v3 2/4] efi: Add tracking for dynamically allocated memmaps

Hi Dan,

Thanks for taking the time to really fix this properly.

Comments/questions below.

On Thu, 2 Jan 2020 at 05:29, Dan Williams <[email protected]> wrote:
>
> In preparation for fixing efi_memmap_alloc() leaks, add support for
> recording whether the memmap was dynamically allocated from slab,
> memblock, or is the original physical memmap provided by the platform.
>
> Cc: Taku Izumi <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> ---
> arch/x86/platform/efi/efi.c | 2 +-
> arch/x86/platform/efi/quirks.c | 11 ++++++-----
> drivers/firmware/efi/fake_mem.c | 5 +++--
> drivers/firmware/efi/memmap.c | 16 ++++++++++------
> include/linux/efi.h | 8 ++++++--
> 5 files changed, 26 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 38d44f36d5ed..7086afbb84fd 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -333,7 +333,7 @@ static void __init efi_clean_memmap(void)
> u64 size = efi.memmap.nr_map - n_removal;
>
> pr_warn("Removing %d invalid memory map entries.\n", n_removal);
> - efi_memmap_install(efi.memmap.phys_map, size);
> + efi_memmap_install(efi.memmap.phys_map, size, 0);
> }
> }
>
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index f8f0220b6a66..4a71c790f9c3 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -244,6 +244,7 @@ EXPORT_SYMBOL_GPL(efi_query_variable_store);
> void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> {
> phys_addr_t new_phys, new_size;
> + unsigned long flags = 0;
> struct efi_mem_range mr;
> efi_memory_desc_t md;
> int num_entries;
> @@ -272,8 +273,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> num_entries += efi.memmap.nr_map;
>
> new_size = efi.memmap.desc_size * num_entries;
> -
> - new_phys = efi_memmap_alloc(num_entries);
> + new_phys = efi_memmap_alloc(num_entries, &flags);
> if (!new_phys) {
> pr_err("Could not allocate boot services memmap\n");
> return;
> @@ -288,7 +288,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> efi_memmap_insert(&efi.memmap, new, &mr);
> early_memunmap(new, new_size);
>
> - efi_memmap_install(new_phys, num_entries);
> + efi_memmap_install(new_phys, num_entries, flags);
> e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
> e820__update_table(e820_table);
> }
> @@ -408,6 +408,7 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)
> void __init efi_free_boot_services(void)
> {
> phys_addr_t new_phys, new_size;
> + unsigned long flags = 0;
> efi_memory_desc_t *md;
> int num_entries = 0;
> void *new, *new_md;
> @@ -463,7 +464,7 @@ void __init efi_free_boot_services(void)
> return;
>
> new_size = efi.memmap.desc_size * num_entries;
> - new_phys = efi_memmap_alloc(num_entries);
> + new_phys = efi_memmap_alloc(num_entries, &flags);
> if (!new_phys) {
> pr_err("Failed to allocate new EFI memmap\n");
> return;
> @@ -493,7 +494,7 @@ void __init efi_free_boot_services(void)
>
> memunmap(new);
>
> - if (efi_memmap_install(new_phys, num_entries)) {
> + if (efi_memmap_install(new_phys, num_entries, flags)) {
> pr_err("Could not install new EFI memmap\n");
> return;
> }
> diff --git a/drivers/firmware/efi/fake_mem.c b/drivers/firmware/efi/fake_mem.c
> index bb9fc70d0cfa..7e53e5520548 100644
> --- a/drivers/firmware/efi/fake_mem.c
> +++ b/drivers/firmware/efi/fake_mem.c
> @@ -39,6 +39,7 @@ void __init efi_fake_memmap(void)
> int new_nr_map = efi.memmap.nr_map;
> efi_memory_desc_t *md;
> phys_addr_t new_memmap_phy;
> + unsigned long flags = 0;
> void *new_memmap;
> int i;
>
> @@ -55,7 +56,7 @@ void __init efi_fake_memmap(void)
> }
>
> /* allocate memory for new EFI memmap */
> - new_memmap_phy = efi_memmap_alloc(new_nr_map);
> + new_memmap_phy = efi_memmap_alloc(new_nr_map, &flags);
> if (!new_memmap_phy)
> return;
>
> @@ -73,7 +74,7 @@ void __init efi_fake_memmap(void)
> /* swap into new EFI memmap */
> early_memunmap(new_memmap, efi.memmap.desc_size * new_nr_map);
>
> - efi_memmap_install(new_memmap_phy, new_nr_map);
> + efi_memmap_install(new_memmap_phy, new_nr_map, flags);
>

So it is the caller's responsibility to record the flags returned by
efi_memmap_alloc() and pass them into efi_memmap_install(), right?
Given that we are now passing three pieces of info that need to be in
sync between the two, could we use a dedicated data structure instead,
a reference to which is taken by both?


> /* print new EFI memmap */
> efi_print_memmap();
> diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
> index 813674ef9000..2b81ee6858a9 100644
> --- a/drivers/firmware/efi/memmap.c
> +++ b/drivers/firmware/efi/memmap.c
> @@ -32,6 +32,7 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
> /**
> * efi_memmap_alloc - Allocate memory for the EFI memory map
> * @num_entries: Number of entries in the allocated map.
> + * @flags: Late map, memblock alloc, slab alloc flags
> *
> * Depending on whether mm_init() has already been invoked or not,
> * either memblock or "normal" page allocation is used.
> @@ -39,20 +40,23 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
> * Returns the physical address of the allocated memory map on
> * success, zero on failure.
> */
> -phys_addr_t __init efi_memmap_alloc(unsigned int num_entries)
> +phys_addr_t __init efi_memmap_alloc(unsigned int num_entries, unsigned long *flags)
> {
> unsigned long size = num_entries * efi.memmap.desc_size;
>
> - if (slab_is_available())
> + if (slab_is_available()) {
> + *flags |= EFI_MEMMAP_SLAB;
> return __efi_memmap_alloc_late(size);
> + }
>
> + *flags |= EFI_MEMMAP_MEMBLOCK;

This assumes flags has neither bit set, but perhaps we should at least
clear the memblock one if we set the slab one?

> return __efi_memmap_alloc_early(size);
> }
>
> /**
> * __efi_memmap_init - Common code for mapping the EFI memory map
> * @data: EFI memory map data
> - * @flags: Use early or late mapping function?
> + * @flags: Use early or late mapping function, and allocator
> *
> * This function takes care of figuring out which function to use to
> * map the EFI memory map in efi.memmap based on how far into the boot
> @@ -192,10 +196,10 @@ int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size)
> *
> * Returns zero on success, a negative error code on failure.
> */
> -int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map)
> +int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map,
> + unsigned long flags)
> {
> struct efi_memory_map_data data;
> - unsigned long flags;
>
> efi_memmap_unmap();
>
> @@ -203,7 +207,7 @@ int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map)
> data.size = efi.memmap.desc_size * nr_map;
> data.desc_version = efi.memmap.desc_version;
> data.desc_size = efi.memmap.desc_size;
> - flags = efi.memmap.flags & EFI_MEMMAP_LATE;
> + flags |= efi.memmap.flags & EFI_MEMMAP_LATE;
>
> return __efi_memmap_init(&data, flags);
> }
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index b8e930f5ff77..fa2668a992ae 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -796,6 +796,8 @@ struct efi_memory_map {
> unsigned long desc_version;
> unsigned long desc_size;
> #define EFI_MEMMAP_LATE (1UL << 0)
> +#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
> +#define EFI_MEMMAP_SLAB (1UL << 2)
> unsigned long flags;
> };
>
> @@ -1057,11 +1059,13 @@ static inline efi_status_t efi_query_variable_store(u32 attributes,
> #endif
> extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);
>
> -extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries);
> +extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries,
> + unsigned long *flags);
> extern int __init efi_memmap_init_early(struct efi_memory_map_data *data);
> extern int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size);
> extern void __init efi_memmap_unmap(void);
> -extern int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map);
> +extern int __init efi_memmap_install(phys_addr_t addr, unsigned int nr_map,
> + unsigned long flags);
> extern int __init efi_memmap_split_count(efi_memory_desc_t *md,
> struct range *range);
> extern void __init efi_memmap_insert(struct efi_memory_map *old_memmap,
>

2020-01-06 19:06:41

by Dan Williams

Subject: Re: [PATCH v3 2/4] efi: Add tracking for dynamically allocated memmaps

On Thu, Jan 2, 2020 at 1:02 AM Ard Biesheuvel <[email protected]> wrote:
>
> Hi Dan,
>
> Thanks for taking the time to really fix this properly.
>
> Comments/questions below.
>
> On Thu, 2 Jan 2020 at 05:29, Dan Williams <[email protected]> wrote:
> >
> > In preparation for fixing efi_memmap_alloc() leaks, add support for
> > recording whether the memmap was dynamically allocated from slab,
> > memblock, or is the original physical memmap provided by the platform.
> >
> > Cc: Taku Izumi <[email protected]>
> > Cc: Ard Biesheuvel <[email protected]>
> > Signed-off-by: Dan Williams <[email protected]>
> > ---
> > arch/x86/platform/efi/efi.c | 2 +-
> > arch/x86/platform/efi/quirks.c | 11 ++++++-----
> > drivers/firmware/efi/fake_mem.c | 5 +++--
> > drivers/firmware/efi/memmap.c | 16 ++++++++++------
> > include/linux/efi.h | 8 ++++++--
> > 5 files changed, 26 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 38d44f36d5ed..7086afbb84fd 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -333,7 +333,7 @@ static void __init efi_clean_memmap(void)
> > u64 size = efi.memmap.nr_map - n_removal;
> >
> > pr_warn("Removing %d invalid memory map entries.\n", n_removal);
> > - efi_memmap_install(efi.memmap.phys_map, size);
> > + efi_memmap_install(efi.memmap.phys_map, size, 0);
> > }
> > }
> >
> > diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> > index f8f0220b6a66..4a71c790f9c3 100644
> > --- a/arch/x86/platform/efi/quirks.c
> > +++ b/arch/x86/platform/efi/quirks.c
> > @@ -244,6 +244,7 @@ EXPORT_SYMBOL_GPL(efi_query_variable_store);
> > void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> > {
> > phys_addr_t new_phys, new_size;
> > + unsigned long flags = 0;
> > struct efi_mem_range mr;
> > efi_memory_desc_t md;
> > int num_entries;
> > @@ -272,8 +273,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> > num_entries += efi.memmap.nr_map;
> >
> > new_size = efi.memmap.desc_size * num_entries;
> > -
> > - new_phys = efi_memmap_alloc(num_entries);
> > + new_phys = efi_memmap_alloc(num_entries, &flags);
> > if (!new_phys) {
> > pr_err("Could not allocate boot services memmap\n");
> > return;
> > @@ -288,7 +288,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> > efi_memmap_insert(&efi.memmap, new, &mr);
> > early_memunmap(new, new_size);
> >
> > - efi_memmap_install(new_phys, num_entries);
> > + efi_memmap_install(new_phys, num_entries, flags);
> > e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
> > e820__update_table(e820_table);
> > }
> > @@ -408,6 +408,7 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)
> > void __init efi_free_boot_services(void)
> > {
> > phys_addr_t new_phys, new_size;
> > + unsigned long flags = 0;
> > efi_memory_desc_t *md;
> > int num_entries = 0;
> > void *new, *new_md;
> > @@ -463,7 +464,7 @@ void __init efi_free_boot_services(void)
> > return;
> >
> > new_size = efi.memmap.desc_size * num_entries;
> > - new_phys = efi_memmap_alloc(num_entries);
> > + new_phys = efi_memmap_alloc(num_entries, &flags);
> > if (!new_phys) {
> > pr_err("Failed to allocate new EFI memmap\n");
> > return;
> > @@ -493,7 +494,7 @@ void __init efi_free_boot_services(void)
> >
> > memunmap(new);
> >
> > - if (efi_memmap_install(new_phys, num_entries)) {
> > + if (efi_memmap_install(new_phys, num_entries, flags)) {
> > pr_err("Could not install new EFI memmap\n");
> > return;
> > }
> > diff --git a/drivers/firmware/efi/fake_mem.c b/drivers/firmware/efi/fake_mem.c
> > index bb9fc70d0cfa..7e53e5520548 100644
> > --- a/drivers/firmware/efi/fake_mem.c
> > +++ b/drivers/firmware/efi/fake_mem.c
> > @@ -39,6 +39,7 @@ void __init efi_fake_memmap(void)
> > int new_nr_map = efi.memmap.nr_map;
> > efi_memory_desc_t *md;
> > phys_addr_t new_memmap_phy;
> > + unsigned long flags = 0;
> > void *new_memmap;
> > int i;
> >
> > @@ -55,7 +56,7 @@ void __init efi_fake_memmap(void)
> > }
> >
> > /* allocate memory for new EFI memmap */
> > - new_memmap_phy = efi_memmap_alloc(new_nr_map);
> > + new_memmap_phy = efi_memmap_alloc(new_nr_map, &flags);
> > if (!new_memmap_phy)
> > return;
> >
> > @@ -73,7 +74,7 @@ void __init efi_fake_memmap(void)
> > /* swap into new EFI memmap */
> > early_memunmap(new_memmap, efi.memmap.desc_size * new_nr_map);
> >
> > - efi_memmap_install(new_memmap_phy, new_nr_map);
> > + efi_memmap_install(new_memmap_phy, new_nr_map, flags);
> >
>
> So it is the caller's responsibility to record the flags returned by
> efi_memmap_alloc() and pass them into efi_memmap_install(), right?
> Given that we are now passing three pieces of info that need to be in
> sync between the two, could we use a dedicated data structure instead,
> a reference to which is taken by both?

Sounds good, looks like I can mostly reuse 'struct
efi_memory_map_data' for this purpose.
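
One possible shape for that, sketched in userspace terms only: a single
struct carries the address, size, and allocator flags from the
allocation step to the install step. The struct layout and every toy_*
name here are assumptions for illustration, not the interface a later
revision necessarily uses, and the address is a placeholder rather
than a real allocation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MEMMAP_MEMBLOCK (1UL << 1)
#define TOY_MEMMAP_SLAB     (1UL << 2)

/* Hypothetical bundle of the three values that must stay in sync. */
struct toy_memmap_data {
	uint64_t phys_map;
	unsigned long size;
	unsigned long flags;
};

/* Last map handed to toy_memmap_install(). */
struct toy_memmap_data toy_installed;

/* Fill *data in one place; returns 0 on success, -1 on failure. */
int toy_memmap_alloc(unsigned int num_entries, unsigned long desc_size,
		     int slab_up, struct toy_memmap_data *data)
{
	assert(data);
	if (!num_entries || !desc_size)
		return -1;
	data->size = (unsigned long)num_entries * desc_size;
	data->flags = slab_up ? TOY_MEMMAP_SLAB : TOY_MEMMAP_MEMBLOCK;
	data->phys_map = 0x1000;	/* placeholder, nothing is allocated */
	return 0;
}

/* The installer consumes the same struct the allocator filled in. */
int toy_memmap_install(const struct toy_memmap_data *data)
{
	assert(data);
	if (!data->phys_map)
		return -1;
	toy_installed = *data;
	return 0;
}
```

With this arrangement a caller no longer threads a separate flags
variable between the two calls, which is the synchronization concern
raised in the review.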

>
>
> > /* print new EFI memmap */
> > efi_print_memmap();
> > diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
> > index 813674ef9000..2b81ee6858a9 100644
> > --- a/drivers/firmware/efi/memmap.c
> > +++ b/drivers/firmware/efi/memmap.c
> > @@ -32,6 +32,7 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
> > /**
> > * efi_memmap_alloc - Allocate memory for the EFI memory map
> > * @num_entries: Number of entries in the allocated map.
> > + * @flags: Late map, memblock alloc, slab alloc flags
> > *
> > * Depending on whether mm_init() has already been invoked or not,
> > * either memblock or "normal" page allocation is used.
> > @@ -39,20 +40,23 @@ static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
> > * Returns the physical address of the allocated memory map on
> > * success, zero on failure.
> > */
> > -phys_addr_t __init efi_memmap_alloc(unsigned int num_entries)
> > +phys_addr_t __init efi_memmap_alloc(unsigned int num_entries, unsigned long *flags)
> > {
> > unsigned long size = num_entries * efi.memmap.desc_size;
> >
> > - if (slab_is_available())
> > + if (slab_is_available()) {
> > + *flags |= EFI_MEMMAP_SLAB;
> > return __efi_memmap_alloc_late(size);
> > + }
> >
> > + *flags |= EFI_MEMMAP_MEMBLOCK;
>
> This assumes flags has neither bit set, but perhaps we should at least
> clear the memblock one if we set the slab one?

Ok.
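
A minimal sketch of the suggested behaviour, assuming the allocator
bits are treated as mutually exclusive: clear both before setting the
one that applies, and leave unrelated bits such as EFI_MEMMAP_LATE
alone. TOY_ALLOC_MASK and toy_set_allocator() are hypothetical names;
only the three flag values come from the patch.

```c
#include <assert.h>

/* Flag values as defined by the patch under review. */
#define EFI_MEMMAP_LATE     (1UL << 0)
#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
#define EFI_MEMMAP_SLAB     (1UL << 2)

/* Hypothetical mask covering both allocator bits. */
#define TOY_ALLOC_MASK (EFI_MEMMAP_MEMBLOCK | EFI_MEMMAP_SLAB)

/* Return flags with exactly one allocator bit set. */
unsigned long toy_set_allocator(unsigned long flags, int slab_up)
{
	flags &= ~TOY_ALLOC_MASK;	/* drop any stale allocator bit */
	flags |= slab_up ? EFI_MEMMAP_SLAB : EFI_MEMMAP_MEMBLOCK;
	assert((flags & TOY_ALLOC_MASK) != TOY_ALLOC_MASK);
	return flags;
}
```

Compared with a plain "|=", this keeps a caller that reuses a flags
variable from ending up with both allocator bits set.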