2013-10-16 13:51:08

by Matt Fleming

Subject: [PATCH 0/5] EFI capsule pstore support

From: Matt Fleming <[email protected]>

The UEFI spec describes a capsule mechanism that allows data blobs to be
handed to the firmware at runtime. If the firmware doesn't recognise the
guid of the capsule and if certain flags are set in the capsule header,
the firmware will preserve the memory region containing the capsule
across a reboot. We can utilise this feature to perform crash dumps and
function tracing to aid in crash analysis.

The capsule buffers containing pstore data can be much larger than is
possible with the EFI variable pstore backend, which makes it
particularly attractive for function tracing. Furthermore, because the
memory regions containing the capsule data are registered with the
firmware prior to the crash (as opposed to efi-pstore.c which invokes
variable services from the crash handler) it's more useful for debugging
hard hangs.
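
As a rough illustration of the header and flags described above, here is a
minimal userspace sketch. The flag values are the ones added to efi.h in
patch 5/5, and the struct mirrors the kernel's efi_capsule_header_t; the
names capsule_header, capsule_header_init() and capsule_survives_reboot()
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined in patch 5/5 of this series. */
#define EFI_CAPSULE_PERSIST_ACROSS_RESET	0x00010000u
#define EFI_CAPSULE_POPULATE_SYSTEM_TABLE	0x00020000u

/* Userspace stand-in for the kernel's efi_capsule_header_t. */
struct capsule_header {
	uint8_t  guid[16];
	uint32_t headersize;
	uint32_t flags;
	uint32_t imagesize;	/* header + payload, in bytes */
};

/* Hypothetical helper: describe a capsule carrying 'size' payload bytes. */
static void capsule_header_init(struct capsule_header *hdr,
				const uint8_t guid[16], uint32_t size)
{
	assert(hdr && guid);

	memcpy(hdr->guid, guid, 16);
	hdr->headersize = sizeof(*hdr);
	hdr->imagesize = sizeof(*hdr) + size;
	hdr->flags = EFI_CAPSULE_PERSIST_ACROSS_RESET |
		     EFI_CAPSULE_POPULATE_SYSTEM_TABLE;
}

/*
 * Hypothetical helper: true if the firmware is being asked both to keep
 * the capsule across a reset and to publish it in the system table.
 */
static bool capsule_survives_reboot(const struct capsule_header *hdr)
{
	const uint32_t want = EFI_CAPSULE_PERSIST_ACROSS_RESET |
			      EFI_CAPSULE_POPULATE_SYSTEM_TABLE;

	return (hdr->flags & want) == want;
}
```

Both flags are set together because the UEFI spec requires
PERSIST_ACROSS_RESET whenever POPULATE_SYSTEM_TABLE is requested.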

Matt Fleming (5):
pstore/ftrace: Don't increment initial data offset
efi: Introduce a Runtime Services lock
efi: Add common efi_reboot() implementation
efi: Move efi_status_to_err() to efi.h
efi: Capsule update support and pstore backend

arch/ia64/kernel/efi.c | 33 +-
arch/ia64/kernel/process.c | 2 +-
arch/x86/kernel/reboot.c | 21 +-
arch/x86/platform/efi/efi.c | 108 +++++-
drivers/firmware/efi/Kconfig | 19 +
drivers/firmware/efi/Makefile | 3 +-
drivers/firmware/efi/capsule.c | 802 +++++++++++++++++++++++++++++++++++++++++
drivers/firmware/efi/efi.c | 12 +
drivers/firmware/efi/reboot.c | 37 ++
drivers/firmware/efi/vars.c | 52 +--
fs/pstore/inode.c | 1 -
include/linux/efi.h | 54 +++
12 files changed, 1072 insertions(+), 72 deletions(-)
create mode 100644 drivers/firmware/efi/capsule.c
create mode 100644 drivers/firmware/efi/reboot.c

--
1.8.1.4


2013-10-16 13:51:15

by Matt Fleming

Subject: [PATCH 1/5] pstore/ftrace: Don't increment initial data offset

From: Matt Fleming <[email protected]>

Delete the following expression,

data->off = ps->size % REC_SIZE;

as it is not reasonable to expect users to allocate an exact multiple
of REC_SIZE because that constant isn't exported outside of
fs/pstore. There are already checks in the ftrace code to ensure that
no accesses happen beyond the bounds of the buffer, so there's no real
reason to skip the beginning of the data buffer.

It's likely this hasn't been caught before because this code mainly
runs under ARM where REC_SIZE is 8 bytes. On x86-64 REC_SIZE is 24
bytes and so it's more likely that ps->size isn't going to be a multiple.
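
To make the arithmetic concrete, here is a small userspace sketch of the
offset calculation in pstore_ftrace_seq_start(), with and without the
deleted expression. The function name ftrace_read_offset() and its
skip_remainder switch are hypothetical; the 64 KiB buffer size is taken
from patch 5/5 and the record sizes from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the byte offset of record 'pos' in a buffer of 'buf_size'
 * bytes, or -1 if that record would run past the end of the buffer.
 *
 * skip_remainder != 0 models the old behaviour, where reading started
 * at buf_size % rec_size instead of at offset 0.
 */
static long ftrace_read_offset(size_t buf_size, size_t rec_size,
			       long pos, int skip_remainder)
{
	size_t off;

	assert(rec_size > 0 && pos >= 0);

	off = skip_remainder ? buf_size % rec_size : 0;
	off += (size_t)pos * rec_size;

	if (off + rec_size > buf_size)
		return -1;

	return (long)off;
}
```

With a 65536-byte buffer and 24-byte records the old code starts reading
at offset 16 (65536 % 24), so a backend that writes its first record at
offset 0 is read back misaligned. With 8-byte records the remainder is 0,
which is consistent with the problem not showing up on ARM.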

Cc: Anton Vorontsov <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Tony Luck <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
---
fs/pstore/inode.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 71bf5f4..cc6ec0e 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -71,7 +71,6 @@ static void *pstore_ftrace_seq_start(struct seq_file *s, loff_t *pos)
if (!data)
return NULL;

- data->off = ps->size % REC_SIZE;
data->off += *pos * REC_SIZE;
if (data->off + REC_SIZE > ps->size) {
kfree(data);
--
1.8.1.4

2013-10-16 13:51:32

by Matt Fleming

Subject: [PATCH 5/5] efi: Capsule update support and pstore backend

From: Matt Fleming <[email protected]>

The EFI capsule mechanism allows data blobs to be passed to the EFI
firmware. If we set the EFI_CAPSULE_POPULATE_SYSTEM_TABLE and
EFI_CAPSULE_PERSIST_ACROSS_RESET flags, the firmware will place a
pointer to our data blob in the EFI System Table on the next boot. We
can get access to the array of EFI capsules when parsing the
configuration tables the next time we boot.

We can utilise this facility to save crash dumps, call traces, etc. in a
region of memory and have them preserved by the firmware across a
reboot.

Once a capsule has been passed to the firmware, the next reboot will
always be performed using the ResetSystem() EFI runtime service, which
may involve overriding the reboot type specified by reboot=. This
ensures the reset value returned by QueryCapsuleCapabilities() is used
to reset the system, which is required for the capsule to be processed.
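
The reset-type rules described above can be sketched in userspace as
follows. This is only a model of the behaviour of efi_capsule_supported(),
efi_update_capsule() and efi_reboot() in this patch, without the locking;
struct capsule_state, capsule_register() and effective_reset_mode() are
hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum {
	EFI_RESET_COLD = 0,
	EFI_RESET_WARM = 1,
	EFI_RESET_SHUTDOWN = 2,
	EFI_RESET_PLATFORM_SPECIFIC = 3,
};

/* Models 'capsule_pending' and 'efi_reset_type' from capsule.c. */
struct capsule_state {
	bool pending;
	int reset_type;		/* -1 until the first capsule is registered */
};

/*
 * Record a capsule whose required reset type (as reported by
 * QueryCapsuleCapabilities()) is 'reset'. A capsule that needs a
 * different reset type to one already registered is rejected.
 */
static int capsule_register(struct capsule_state *st, int reset)
{
	assert(st && reset >= EFI_RESET_COLD &&
	       reset <= EFI_RESET_PLATFORM_SPECIFIC);

	if (st->reset_type >= 0 && st->reset_type != reset)
		return -EINVAL;

	st->pending = true;
	st->reset_type = reset;
	return 0;
}

/* The mode actually passed to ResetSystem() for a 'requested' reboot. */
static int effective_reset_mode(const struct capsule_state *st, int requested)
{
	return st->pending ? st->reset_type : requested;
}
```

Once a capsule has been registered, the requested mode only matters for
the warning message; the capsule's reset type always wins.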

Cc: Andi Kleen <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
---
arch/x86/kernel/reboot.c | 7 +
drivers/firmware/efi/Kconfig | 19 +
drivers/firmware/efi/Makefile | 1 +
drivers/firmware/efi/capsule.c | 802 +++++++++++++++++++++++++++++++++++++++++
drivers/firmware/efi/efi.c | 4 +
drivers/firmware/efi/reboot.c | 12 +
include/linux/efi.h | 17 +
7 files changed, 862 insertions(+)
create mode 100644 drivers/firmware/efi/capsule.c

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 78a1c67..af540a0 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -458,6 +458,13 @@ static void native_machine_emergency_restart(void)
mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0;
*((unsigned short *)__va(0x472)) = mode;

+ /*
+ * If an EFI capsule has been registered with the firmware then
+ * override the reboot= parameter.
+ */
+ if (efi_capsule_pending(NULL))
+ reboot_type = BOOT_EFI;
+
for (;;) {
/* Could also try the reset bit in the Hammer NB */
switch (reboot_type) {
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index b0fc7c7..62ace0e 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -36,4 +36,23 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE
backend for pstore by default. This setting can be overridden
using the efivars module's pstore_disable parameter.

+config EFI_CAPSULE_PSTORE
+ bool "EFI capsule pstore backend"
+ depends on EFI && PSTORE
+ help
+ The EFI capsule mechanism can be used to store crash dumps and
+ function tracing data by passing the data to the firmware, which
+ will be preserved across a reboot.
+
+ It should be noted that enabling this option will pass a capsule
+ to the firmware on every boot. Some firmware will not allow a
+ user to enter the BIOS setup when a capsule has been registered
+ on the previous boot.
+
+ Many EFI machines have buggy implementations of the UpdateCapsule()
+ runtime service. This option will enable code that may not function
+ correctly with your firmware.
+
+ If unsure, say N.
+
endmenu
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 6375e14..102d6e4 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -4,3 +4,4 @@
obj-y += efi.o vars.o reboot.o
obj-$(CONFIG_EFI_VARS) += efivars.o
obj-$(CONFIG_EFI_VARS_PSTORE) += efi-pstore.o
+obj-$(CONFIG_EFI_CAPSULE_PSTORE) += capsule.o
diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
new file mode 100644
index 0000000..89f054e
--- /dev/null
+++ b/drivers/firmware/efi/capsule.c
@@ -0,0 +1,802 @@
+/*
+ * EFI capsule support.
+ *
+ * Copyright 2013 Intel Corporation <[email protected]>
+ *
+ * This file is part of the Linux kernel, and is made available under
+ * the terms of the GNU General Public License version 2.
+ */
+
+#define pr_fmt(fmt) "efi-capsule: " fmt
+
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/highmem.h>
+#include <linux/efi.h>
+
+typedef struct {
+ u64 length;
+ u64 data;
+} efi_capsule_block_desc_t;
+
+static bool capsule_pending;
+static int efi_reset_type = -1;
+
+/*
+ * capsule_mutex serialises access to both 'capsule_pending' and
+ * 'efi_reset_type'.
+ *
+ * This mutex must be held across calls to efi_capsule_supported() and
+ * efi_update_capsule() so that the operation is atomic. This ensures
+ * that efi_update_capsule() isn't called with a capsule that requires a
+ * different reset type to the registered 'efi_reset_type'.
+ */
+static DEFINE_MUTEX(capsule_mutex);
+
+static int efi_update_capsule(efi_capsule_header_t *capsule,
+ struct page **pages, size_t size, int reset);
+
+/**
+ * efi_capsule_pending - has a capsule been passed to the firmware?
+ * @reset_type: store the type of EFI reset if capsule is pending
+ *
+ * To ensure that the registered capsule is processed correctly by the
+ * firmware we need to perform a specific type of reset. If a capsule is
+ * pending return the reset type in @reset_type.
+ */
+bool efi_capsule_pending(int *reset_type)
+{
+ bool rv = false;
+
+ mutex_lock(&capsule_mutex);
+ if (!capsule_pending)
+ goto out;
+
+ if (reset_type)
+ *reset_type = efi_reset_type;
+ rv = true;
+
+out:
+ mutex_unlock(&capsule_mutex);
+ return rv;
+}
+
+/**
+ * efi_capsule_supported - does the firmware support the capsule?
+ * @guid: vendor guid of capsule
+ * @flags: capsule flags
+ * @size: size of capsule data
+ * @reset: the reset type required for this capsule
+ *
+ * Check whether a capsule with @flags is supported and that @size
+ * doesn't exceed the maximum size for a capsule, and that the reset
+ * type of the capsule is compatible with any that have previously been
+ * registered.
+ *
+ * We must be called with capsule_mutex held.
+ */
+static int efi_capsule_supported(efi_guid_t guid, u32 flags,
+ size_t size, int *reset)
+{
+ efi_capsule_header_t *capsule;
+ efi_status_t status;
+ u64 max_size;
+ int rv = 0;
+
+ capsule = kmalloc(sizeof(*capsule), GFP_KERNEL);
+ if (!capsule)
+ return -ENOMEM;
+
+ capsule->headersize = capsule->imagesize = sizeof(*capsule);
+ memcpy(&capsule->guid, &guid, sizeof(efi_guid_t));
+ capsule->flags = flags;
+
+ status = efi.query_capsule_caps(&capsule, 1, &max_size, reset);
+ if (status != EFI_SUCCESS) {
+ rv = efi_status_to_err(status);
+ goto out;
+ }
+
+ if (size > max_size) {
+ rv = -ENOSPC;
+ goto out;
+ }
+
+ if (efi_reset_type >= 0 && efi_reset_type != *reset) {
+ pr_err("Incompatible capsule reset type %d\n", *reset);
+ rv = -EINVAL;
+ }
+
+out:
+ kfree(capsule);
+ return rv;
+}
+
+struct efi_capsule_ctx {
+ struct page **pages;
+ unsigned int nr_pages;
+ efi_capsule_header_t *capsule;
+ size_t capsule_size;
+ void *data;
+ size_t data_size;
+};
+
+struct efi_capsule_pstore_buf {
+ void *data;
+ size_t size;
+ atomic_long_t offset;
+};
+
+struct efi_capsule_pstore {
+ /* Previous records */
+ efi_capsule_header_t **hdrs;
+ uint32_t nr_hdrs;
+ uint32_t hdr_index; /* Index of current header in 'hdrs' */
+ off_t hdr_offset; /* Offset into current header */
+
+ /* New records */
+ struct efi_capsule_pstore_buf console;
+ struct efi_capsule_pstore_buf ftrace;
+ struct efi_capsule_pstore_buf dmesg;
+};
+
+struct efi_capsule_pstore_record {
+ u64 timestamp;
+ u64 id;
+ enum pstore_type_id type;
+ size_t size;
+ int count;
+ bool inuse;
+ char data[];
+} __packed;
+
+static struct pstore_info efi_capsule_info;
+
+/**
+ * efi_build_pstore_capsule - alloc capsule and send to firmware
+ * @size: size in bytes of the capsule data
+ *
+ * This is a helper function for allocating enough room for user data
+ * + the size of an EFI capsule header, and passing that capsule to the
+ * firmware.
+ *
+ * Returns a pointer to the capsule on success, an ERR_PTR() value on
+ * error. If an error is returned we guarantee that the capsule has not
+ * been passed to the firmware.
+ */
+static efi_capsule_header_t *
+efi_build_pstore_capsule(size_t size)
+{
+ efi_capsule_header_t *capsule = NULL;
+ unsigned int nr_pages = 0;
+ size_t capsule_size;
+ struct page **pages;
+ int i, rv = -ENOMEM;
+ int reset_type;
+ efi_guid_t guid = LINUX_EFI_CRASH_GUID;
+ u32 flags = EFI_CAPSULE_PERSIST_ACROSS_RESET |
+ EFI_CAPSULE_POPULATE_SYSTEM_TABLE;
+
+ capsule_size = size + sizeof(*capsule);
+
+ mutex_lock(&capsule_mutex);
+ rv = efi_capsule_supported(guid, flags, capsule_size, &reset_type);
+ if (rv) {
+ mutex_unlock(&capsule_mutex);
+ return ERR_PTR(rv);
+ }
+
+ nr_pages = ALIGN(capsule_size, PAGE_SIZE) >> PAGE_SHIFT;
+ pages = kzalloc(nr_pages * sizeof(void *), GFP_KERNEL);
+ if (!pages)
+ goto fail;
+
+ for (i = 0; i < nr_pages; i++) {
+ struct page *page;
+
+ page = alloc_page(GFP_KERNEL);
+ if (!page)
+ goto fail;
+
+ pages[i] = page;
+ }
+
+ capsule = vmap(pages, nr_pages, 0, PAGE_KERNEL);
+ if (!capsule)
+ goto fail;
+
+ /*
+ * Setup the EFI capsule header.
+ */
+ memcpy(&capsule->guid, &guid, sizeof(guid));
+
+ capsule->headersize = sizeof(*capsule);
+ capsule->imagesize = capsule_size;
+ capsule->flags = flags;
+
+ memset((void *)capsule + capsule->headersize, 0, size);
+
+ rv = efi_update_capsule(capsule, pages, capsule_size, reset_type);
+ if (rv)
+ goto fail;
+
+out:
+ mutex_unlock(&capsule_mutex);
+ kfree(pages);
+ return capsule;
+
+fail:
+ vunmap(capsule);
+ for (i = 0; i < nr_pages; i++) {
+ if (!pages[i])
+ break;
+
+ __free_page(pages[i]);
+ }
+ capsule = ERR_PTR(rv);
+ goto out;
+}
+
+/**
+ * efi_capsule_lookup - search capsule array for entries.
+ * @caps: the array of capsules to search.
+ * @nr_caps: the number of capsules in @caps.
+ * @guid: the guid to search for.
+ * @nr_found: the number of entries found.
+ *
+ * Map each capsule header into the kernel's virtual address space and
+ * inspect the guid. Build an array of capsule headers with every
+ * capsule that is found with @guid. If a match is found the capsule
+ * remains mapped, otherwise it is unmapped.
+ *
+ * Returns an array of capsule headers, each element of which has the
+ * guid @guid. The number of elements in the array is stored in
+ * @nr_found. Returns %NULL and stores zero in @nr_found if no capsules
+ * were found.
+ */
+static efi_capsule_header_t **
+efi_capsule_lookup(efi_capsule_header_t **caps, uint32_t nr_caps,
+ efi_guid_t guid, uint32_t *nr_found)
+{
+ efi_capsule_header_t **capsules = NULL;
+ size_t capsules_size = 0;
+ int i;
+
+ *nr_found = 0;
+ for (i = 0; i < nr_caps; i++) {
+ efi_capsule_header_t *c;
+ size_t size;
+
+ c = ioremap((resource_size_t)caps[i], sizeof(*c));
+ if (!c) {
+ pr_err("failed to ioremap capsule\n");
+ goto fail;
+ }
+
+ size = c->imagesize;
+ iounmap(c);
+
+ c = ioremap((resource_size_t)caps[i], size);
+ if (!c) {
+ pr_err("failed to ioremap header + data\n");
+ goto fail;
+ }
+
+ if (!efi_guidcmp(c->guid, guid)) {
+ void *new;
+
+ capsules_size += sizeof(**capsules);
+ new = krealloc(capsules, capsules_size, GFP_KERNEL);
+ if (!new) {
+ pr_err("failed to realloc capsule array\n");
+ iounmap(c);
+ goto fail;
+ }
+
+ capsules = new;
+ capsules[(*nr_found)++] = c;
+ continue;
+ }
+
+ iounmap(c);
+ }
+
+ return capsules;
+
+fail:
+ for (i = 0; i < *nr_found; i++)
+ iounmap(capsules[i]);
+ *nr_found = 0;
+
+ kfree(capsules);
+ return ERR_PTR(-ENOMEM);
+}
+
+static efi_capsule_header_t *
+efi_setup_pstore_buffer(struct efi_capsule_pstore_buf *buf,
+ size_t size, enum pstore_type_id type)
+{
+ struct efi_capsule_pstore_record *rec;
+ efi_capsule_header_t *capsule;
+
+ capsule = efi_build_pstore_capsule(size);
+ if (IS_ERR(capsule))
+ return capsule;
+
+ rec = (void *)capsule + capsule->headersize;
+ rec->size = size - offsetof(typeof(*rec), data);
+ rec->type = type;
+
+ rec->inuse = false;
+
+ buf->size = rec->size;
+ atomic_long_set(&buf->offset, 0);
+ buf->data = rec->data;
+
+ return capsule;
+}
+
+/*
+ * We may not be in a position to allocate memory at the time of a
+ * crash, so pre-allocate some space now and register it with the
+ * firmware via efi_update_capsule().
+ *
+ * Also, iterate through the array of capsules pointed to from the EFI
+ * system table and take note of any LINUX_EFI_CRASH_GUID
+ * capsules. They will be parsed by efi_capsule_pstore_read().
+ */
+static int efi_capsule_pstore_setup(void)
+{
+ struct efi_capsule_pstore *pctx = NULL;
+ struct efi_capsule_pstore_buf *buf;
+ efi_capsule_header_t *capsule;
+ void *crash_buf = NULL;
+ size_t size, crash_size;
+ int rv;
+
+ pctx = kzalloc(sizeof(*pctx), GFP_KERNEL);
+ if (!pctx)
+ return -ENOMEM;
+
+ size = 65536;
+ capsule = efi_build_pstore_capsule(size);
+ if (IS_ERR(capsule)) {
+ rv = PTR_ERR(capsule);
+ goto fail;
+ }
+
+ pctx->dmesg.data = (void *)capsule + capsule->headersize;
+ atomic_long_set(&pctx->dmesg.offset, 0);
+ pctx->dmesg.size = size;
+
+ buf = &pctx->console;
+ capsule = efi_setup_pstore_buffer(buf, size, PSTORE_TYPE_CONSOLE);
+ if (IS_ERR(capsule)) {
+ rv = PTR_ERR(capsule);
+ goto fail;
+ }
+
+ buf = &pctx->ftrace;
+ capsule = efi_setup_pstore_buffer(buf, size, PSTORE_TYPE_FTRACE);
+ if (IS_ERR(capsule)) {
+ rv = PTR_ERR(capsule);
+ goto fail;
+ }
+
+ crash_size = 4096;
+ crash_buf = kmalloc(crash_size, GFP_KERNEL);
+ if (!crash_buf) {
+ rv = -ENOMEM;
+ goto fail;
+ }
+
+ /*
+ * Register the capsule backend with pstore.
+ */
+ spin_lock_init(&efi_capsule_info.buf_lock);
+
+ efi_capsule_info.buf = crash_buf;
+ efi_capsule_info.bufsize = crash_size;
+ efi_capsule_info.data = pctx;
+
+ rv = pstore_register(&efi_capsule_info);
+ if (rv) {
+ pr_err("pstore registration failed: %d\n", rv);
+ goto fail;
+ }
+
+ return rv;
+
+fail:
+ kfree(crash_buf);
+ kfree(pctx);
+ return rv;
+}
+
+static int efi_capsule_pstore_open(struct pstore_info *psi)
+{
+ struct efi_capsule_pstore *pctx = psi->data;
+ efi_capsule_header_t **capsules;
+ uint32_t nr_capsules;
+ size_t size;
+ void *cap;
+ int rv = 0;
+
+ /*
+ * Read any pstore entries that were passed across a reboot.
+ */
+ if (efi.capsule == EFI_INVALID_TABLE_ADDR)
+ return -ENODEV;
+
+ cap = ioremap(efi.capsule, sizeof(nr_capsules));
+ if (!cap)
+ return -ENOMEM;
+
+ /*
+ * The array of capsules is prefixed with the number of
+ * capsule entries in the array.
+ */
+ nr_capsules = *(uint32_t *)cap;
+ iounmap(cap);
+
+ if (!nr_capsules)
+ return -ENODEV;
+
+ size = nr_capsules * sizeof(*cap);
+ cap = ioremap(efi.capsule, size);
+ if (!cap)
+ return -ENOMEM;
+
+ capsules = cap + sizeof(uint32_t *);
+
+ pctx->hdrs = efi_capsule_lookup(capsules, nr_capsules,
+ LINUX_EFI_CRASH_GUID,
+ &pctx->nr_hdrs);
+ if (IS_ERR(pctx->hdrs)) {
+ rv = PTR_ERR(pctx->hdrs);
+ pctx->hdrs = NULL;
+ }
+
+ iounmap(cap);
+ return rv;
+}
+
+static int efi_capsule_pstore_close(struct pstore_info *psi)
+{
+ struct efi_capsule_pstore *pctx = psi->data;
+ int i;
+
+ for (i = 0; i < pctx->nr_hdrs; i++)
+ iounmap(pctx->hdrs[i]);
+
+ pctx->nr_hdrs = 0;
+ pctx->hdr_index = 0;
+ kfree(pctx->hdrs);
+
+ return 0;
+}
+
+/*
+ * Return the next pstore record that was passed to us across a reboot
+ * in an EFI capsule.
+ *
+ * This is expected to be called under the pstore
+ * read_mutex. Therefore, no serialisation is done here.
+ */
+static struct efi_capsule_pstore_record *
+get_pstore_read_record(struct efi_capsule_pstore *pctx)
+{
+ struct efi_capsule_pstore_record *rec;
+ efi_capsule_header_t *hdr;
+ off_t remaining;
+
+next:
+ if (pctx->hdr_index == pctx->nr_hdrs)
+ return NULL;
+
+ hdr = pctx->hdrs[pctx->hdr_index];
+ rec = (void *)hdr + hdr->headersize + pctx->hdr_offset;
+
+ remaining = hdr->imagesize - hdr->headersize -
+ pctx->hdr_offset - offsetof(typeof(*rec), data);
+
+ /*
+ * A single EFI capsule may contain multiple pstore records, but
+ * there is no guarantee it will be filled completely, so we
+ * need to handle partial records.
+ *
+ * If there are no more entries in this capsule try the next.
+ */
+ if (!rec->inuse) {
+ pctx->hdr_index++;
+ pctx->hdr_offset = 0;
+ goto next;
+ }
+
+ /*
+ * If we've finished parsing all records in this capsule, move
+ * onto the next. Otherwise, increment the offset into the
+ * current capsule (pctx->hdr_offset).
+ */
+ if (rec->size == remaining) {
+ pctx->hdr_index++;
+ pctx->hdr_offset = 0;
+ } else
+ pctx->hdr_offset += rec->size + offsetof(typeof(*rec), data);
+
+ return rec;
+}
+
+static ssize_t efi_capsule_pstore_read(u64 *id, enum pstore_type_id *type,
+ int *count, struct timespec *time,
+ char **buf, struct pstore_info *psi)
+{
+ struct efi_capsule_pstore_record *rec;
+ struct efi_capsule_pstore *pctx = psi->data;
+ ssize_t size;
+
+ rec = get_pstore_read_record(pctx);
+ if (!rec)
+ return 0;
+
+ *type = rec->type;
+ time->tv_sec = rec->timestamp;
+ time->tv_nsec = 0;
+ size = rec->size;
+ *id = rec->id;
+ *count = rec->count;
+
+ *buf = kmalloc(size, GFP_KERNEL);
+ if (!*buf)
+ return -ENOMEM;
+
+ memcpy(*buf, rec->data, size);
+
+ return size;
+}
+
+/*
+ * We expect to be called with ->buf_lock held, and so don't perform
+ * any serialisation.
+ */
+static struct notrace efi_capsule_pstore_record *
+get_pstore_write_record(struct efi_capsule_pstore_buf *pbuf, size_t size)
+{
+ struct efi_capsule_pstore_record *rec;
+ long offset = atomic_long_read(&pbuf->offset);
+
+ if (offset + size > pbuf->size)
+ return NULL;
+
+ rec = pbuf->data + offset;
+
+ atomic_long_add(offsetof(typeof(*rec), data) + size, &pbuf->offset);
+ rec->inuse = true;
+
+ return rec;
+}
+
+static int notrace
+efi_capsule_pstore_write(enum pstore_type_id type,
+ enum kmsg_dump_reason reason, u64 *id,
+ unsigned int part, int count, size_t hsize,
+ size_t size, struct pstore_info *psi)
+{
+ struct efi_capsule_pstore_record *rec;
+ struct efi_capsule_pstore *pctx = psi->data;
+
+ if (!size)
+ return -EINVAL;
+
+ rec = get_pstore_write_record(&pctx->dmesg, size);
+ if (!rec)
+ return -ENOSPC;
+
+ rec->type = type;
+ rec->timestamp = get_seconds();
+ rec->size = size;
+ *id = rec->id = part;
+ rec->count = count;
+ memcpy(rec->data, psi->buf, size);
+
+ return 0;
+}
+
+static inline void buf_inuse(struct efi_capsule_pstore_buf *pbuf)
+{
+ struct efi_capsule_pstore_record *rec;
+
+ rec = pbuf->data - sizeof(*rec);
+ rec->inuse = true;
+}
+
+static notrace void *
+get_pstore_buf(struct efi_capsule_pstore_buf *pbuf, size_t size)
+{
+ long next, curr;
+
+ if (size > pbuf->size)
+ return NULL;
+
+ buf_inuse(pbuf);
+
+ do {
+ curr = atomic_long_read(&pbuf->offset);
+ next = curr + size;
+
+ /* Wrap? */
+ if (next > pbuf->size) {
+ next = size;
+ if (atomic_long_cmpxchg(&pbuf->offset, curr, next)) {
+ curr = 0;
+ break;
+ }
+
+ continue;
+ }
+
+ } while (atomic_long_cmpxchg(&pbuf->offset, curr, next) != curr);
+
+ return pbuf->data + curr;
+}
+
+static int notrace
+efi_capsule_pstore_write_buf(enum pstore_type_id type,
+ enum kmsg_dump_reason reason,
+ u64 *id, unsigned int part,
+ const char *buf, size_t hsize,
+ size_t size, struct pstore_info *psi)
+{
+ struct efi_capsule_pstore *pctx = psi->data;
+ void *dst;
+
+ if (type == PSTORE_TYPE_FTRACE)
+ dst = get_pstore_buf(&pctx->ftrace, size);
+ else if (type == PSTORE_TYPE_CONSOLE)
+ dst = get_pstore_buf(&pctx->console, size);
+ else
+ return -EINVAL;
+
+ if (!dst)
+ return -ENOSPC;
+
+ memcpy(dst, buf, size);
+ return 0;
+}
+
+static struct pstore_info efi_capsule_info = {
+ .owner = THIS_MODULE,
+ .name = "capsule",
+ .open = efi_capsule_pstore_open,
+ .close = efi_capsule_pstore_close,
+ .read = efi_capsule_pstore_read,
+ .write = efi_capsule_pstore_write,
+ .write_buf = efi_capsule_pstore_write_buf,
+};
+
+#define BLOCKS_PER_PAGE (PAGE_SIZE / sizeof(efi_capsule_block_desc_t))
+
+/*
+ * How many pages of block descriptors do we need to map 'nr_pages'?
+ *
+ * Every list of block descriptors in a page must end with a
+ * continuation pointer. The last continuation pointer of the last page
+ * must be zero to mark the end of the chain.
+ */
+static inline unsigned int num_block_pages(unsigned int nr_pages)
+{
+ return DIV_ROUND_UP(nr_pages, BLOCKS_PER_PAGE - 1);
+}
+
+/**
+ * efi_update_capsule - pass a single capsule to the firmware.
+ * @capsule: capsule to send to the firmware.
+ * @pages: an array of capsule data.
+ * @size: total size of capsule data + headers in @capsule.
+ * @reset: the reset type required for @capsule
+ *
+ * Map @capsule with EFI capsule block descriptors in PAGE_SIZE chunks.
+ * @size needn't necessarily be a multiple of PAGE_SIZE - we can handle
+ * a trailing chunk that is smaller than PAGE_SIZE.
+ *
+ * @capsule MUST be virtually contiguous.
+ *
+ * Return 0 on success.
+ */
+static int efi_update_capsule(efi_capsule_header_t *capsule,
+ struct page **pages, size_t size, int reset)
+{
+ efi_capsule_block_desc_t *block = NULL;
+ struct page **block_pgs, *page;
+ efi_status_t status;
+ unsigned int nr_data_pgs, nr_block_pgs;
+ int i, j, err = -ENOMEM;
+
+ nr_data_pgs = DIV_ROUND_UP(size, PAGE_SIZE);
+ nr_block_pgs = num_block_pages(nr_data_pgs);
+
+ block_pgs = kzalloc(nr_block_pgs * sizeof(page), GFP_KERNEL);
+ if (!block_pgs)
+ return -ENOMEM;
+
+ for (i = 0; i < nr_block_pgs; i++) {
+ block_pgs[i] = alloc_page(GFP_KERNEL);
+ if (!block_pgs[i])
+ goto fail;
+ }
+
+ page = pages[0];
+ for (i = 0; i < nr_block_pgs; i++) {
+ block = kmap(block_pgs[i]);
+ if (!block)
+ goto fail;
+
+ for (j = 0; j < BLOCKS_PER_PAGE - 1 && nr_data_pgs > 0; j++) {
+ u64 sz = min_t(u64, size, PAGE_SIZE);
+
+ block[j].length = sz;
+ block[j].data = page_to_phys(page);
+
+ size -= sz;
+ nr_data_pgs--;
+ page++;
+ }
+
+ /* Continuation pointer */
+ block[j].length = 0;
+
+ if (i + 1 == nr_block_pgs)
+ block[j].data = 0;
+ else
+ block[j].data = page_to_phys(block_pgs[i + 1]);
+
+ kunmap(block_pgs[i]);
+ }
+
+ status = efi.update_capsule(&capsule, 1, page_to_phys(block_pgs[0]));
+ if (status != EFI_SUCCESS) {
+ pr_err("update_capsule fail: 0x%lx\n", status);
+ err = efi_status_to_err(status);
+ goto fail;
+ }
+
+ capsule_pending = true;
+ efi_reset_type = reset;
+
+ kfree(block_pgs);
+ return 0;
+
+fail:
+ for (i = 0; i < nr_block_pgs; i++) {
+ if (block_pgs[i])
+ __free_page(block_pgs[i]);
+ }
+
+ kfree(block_pgs);
+ return err;
+}
+
+/*
+ * efi_capsule_init - initialise the EFI capsule system
+ */
+static __init int efi_capsule_init(void)
+{
+ int rv, reset;
+ u32 flags = EFI_CAPSULE_PERSIST_ACROSS_RESET |
+ EFI_CAPSULE_POPULATE_SYSTEM_TABLE;
+
+ if (!efi_enabled(EFI_RUNTIME_SERVICES))
+ return -ENODEV;
+
+ mutex_lock(&capsule_mutex);
+ rv = efi_capsule_supported(LINUX_EFI_CRASH_GUID, flags, 0, &reset);
+ mutex_unlock(&capsule_mutex);
+
+ if (rv)
+ return rv;
+
+ efi_capsule_pstore_setup();
+
+ return 0;
+}
+device_initcall(efi_capsule_init);
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 772f559..46a96a3 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -32,6 +32,7 @@ struct efi __read_mostly efi = {
.hcdp = EFI_INVALID_TABLE_ADDR,
.uga = EFI_INVALID_TABLE_ADDR,
.uv_systab = EFI_INVALID_TABLE_ADDR,
+ .capsule = EFI_INVALID_TABLE_ADDR,
};
EXPORT_SYMBOL(efi);

@@ -72,6 +73,8 @@ static ssize_t systab_show(struct kobject *kobj,
str += sprintf(str, "BOOTINFO=0x%lx\n", efi.boot_info);
if (efi.uga != EFI_INVALID_TABLE_ADDR)
str += sprintf(str, "UGA=0x%lx\n", efi.uga);
+ if (efi.capsule != EFI_INVALID_TABLE_ADDR)
+ str += sprintf(str, "CAPSULE=0x%lx\n", efi.capsule);

return str - buf;
}
@@ -198,6 +201,7 @@ static __initdata efi_config_table_type_t common_tables[] = {
{SAL_SYSTEM_TABLE_GUID, "SALsystab", &efi.sal_systab},
{SMBIOS_TABLE_GUID, "SMBIOS", &efi.smbios},
{UGA_IO_PROTOCOL_GUID, "UGA", &efi.uga},
+ {LINUX_EFI_CRASH_GUID, "CAPSULE", &efi.capsule},
{NULL_GUID, NULL, 0},
};

diff --git a/drivers/firmware/efi/reboot.c b/drivers/firmware/efi/reboot.c
index f9f34eb..d6ea42a 100644
--- a/drivers/firmware/efi/reboot.c
+++ b/drivers/firmware/efi/reboot.c
@@ -10,6 +10,9 @@

void efi_reboot(int mode)
{
+ const char *str[] = { "cold", "warm", "shutdown", "platform" };
+ int cap_reset_mode;
+
switch (mode) {
case EFI_RESET_COLD:
case EFI_RESET_WARM:
@@ -21,5 +24,14 @@ void efi_reboot(int mode)
return;
}

+ if (efi_capsule_pending(&cap_reset_mode)) {
+ if (mode != cap_reset_mode)
+ printk("efi: %s reset requested but pending capsule "
+ "update requires %s reset... Performing "
+ "%s reset\n", str[mode], str[cap_reset_mode],
+ str[cap_reset_mode]);
+ mode = cap_reset_mode;
+ }
+
efi.reset_system(mode, EFI_SUCCESS, 0, NULL);
}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 6a10640..5ce71ca 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -117,6 +117,13 @@ typedef struct {
} efi_capsule_header_t;

/*
+ * EFI capsule flags
+ */
+#define EFI_CAPSULE_PERSIST_ACROSS_RESET 0x00010000
+#define EFI_CAPSULE_POPULATE_SYSTEM_TABLE 0x00020000
+#define EFI_CAPSULE_INITIATE_RESET 0x00040000
+
+/*
* Allocation types for calls to boottime->allocate_pages.
*/
#define EFI_ALLOCATE_ANY_PAGES 0
@@ -557,6 +564,7 @@ extern struct efi {
unsigned long hcdp; /* HCDP table */
unsigned long uga; /* UGA table */
unsigned long uv_systab; /* UV system table */
+ unsigned long capsule; /* EFI capsule table */
efi_get_time_t *get_time;
efi_set_time_t *set_time;
efi_get_wakeup_time_t *get_wakeup_time;
@@ -905,4 +913,13 @@ int efivars_sysfs_init(void);

#endif /* CONFIG_EFI_VARS */

+#ifdef CONFIG_EFI_CAPSULE_PSTORE
+extern bool efi_capsule_pending(int *reset_type);
+#else
+static inline bool efi_capsule_pending(int *reset_type)
+{
+ return false;
+}
+#endif /* CONFIG_EFI_CAPSULE_PSTORE */
+
#endif /* _LINUX_EFI_H */
--
1.8.1.4
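
The block-descriptor arithmetic used by num_block_pages() in the patch
above can be checked with a small userspace sketch. It assumes 4096-byte
pages (SKETCH_PAGE_SIZE is a stand-in for the kernel's PAGE_SIZE) and the
16-byte descriptor layout of efi_capsule_block_desc_t; data_pages() is a
hypothetical helper for the DIV_ROUND_UP(size, PAGE_SIZE) step in
efi_update_capsule().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Same layout as efi_capsule_block_desc_t in the patch. */
struct block_desc {
	uint64_t length;
	uint64_t data;
};

#define BLOCKS_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(struct block_desc))

/* Number of data pages needed to hold 'capsule_size' bytes. */
static unsigned int data_pages(size_t capsule_size)
{
	return (unsigned int)((capsule_size + SKETCH_PAGE_SIZE - 1) /
			      SKETCH_PAGE_SIZE);
}

/*
 * Number of descriptor pages needed to describe 'nr_pages' data pages.
 * One slot per descriptor page is reserved for the continuation
 * pointer, so only BLOCKS_PER_PAGE - 1 slots describe data.
 */
static unsigned int num_block_pages(unsigned int nr_pages)
{
	size_t per_page = BLOCKS_PER_PAGE - 1;

	assert(per_page > 0);
	return (unsigned int)((nr_pages + per_page - 1) / per_page);
}
```

Under these assumptions each descriptor page describes up to 255 data
pages. A 65536-byte pstore buffer plus a 28-byte capsule header spans 17
data pages and therefore needs a single descriptor page.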

2013-10-16 13:51:22

by Matt Fleming

Subject: [PATCH 3/5] efi: Add common efi_reboot() implementation

From: Matt Fleming <[email protected]>

The reboot functionality is the same for all EFI implementations, so
provide a wrapper that is useable by all architectures.

Cc: Leif Lindholm <[email protected]>
Cc: Tony Luck <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
---
arch/ia64/kernel/process.c | 2 +-
arch/x86/kernel/reboot.c | 14 +++++++++-----
drivers/firmware/efi/Makefile | 2 +-
drivers/firmware/efi/reboot.c | 25 +++++++++++++++++++++++++
include/linux/efi.h | 2 ++
5 files changed, 38 insertions(+), 7 deletions(-)
create mode 100644 drivers/firmware/efi/reboot.c

diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index 55d4ba4..2b55b8e 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -662,7 +662,7 @@ void
machine_restart (char *restart_cmd)
{
(void) notify_die(DIE_MACHINE_RESTART, restart_cmd, NULL, 0, 0, 0);
- (*efi.reset_system)(EFI_RESET_WARM, 0, 0, NULL);
+ efi_reboot(EFI_RESET_WARM);
}

void
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 563ed91..78a1c67 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -497,11 +497,15 @@ static void native_machine_emergency_restart(void)
break;

case BOOT_EFI:
- if (efi_enabled(EFI_RUNTIME_SERVICES))
- efi.reset_system(reboot_mode == REBOOT_WARM ?
- EFI_RESET_WARM :
- EFI_RESET_COLD,
- EFI_SUCCESS, 0, NULL);
+ if (efi_enabled(EFI_RUNTIME_SERVICES)) {
+ int mode = EFI_RESET_COLD;
+
+ if (reboot_mode == REBOOT_WARM)
+ mode = EFI_RESET_WARM;
+
+ efi_reboot(mode);
+ }
+
reboot_type = BOOT_KBD;
break;

diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 99245ab..6375e14 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -1,6 +1,6 @@
#
# Makefile for linux kernel
#
-obj-y += efi.o vars.o
+obj-y += efi.o vars.o reboot.o
obj-$(CONFIG_EFI_VARS) += efivars.o
obj-$(CONFIG_EFI_VARS_PSTORE) += efi-pstore.o
diff --git a/drivers/firmware/efi/reboot.c b/drivers/firmware/efi/reboot.c
new file mode 100644
index 0000000..f9f34eb
--- /dev/null
+++ b/drivers/firmware/efi/reboot.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright 2013 Intel Corporation <[email protected]>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ */
+
+#include <linux/efi.h>
+
+void efi_reboot(int mode)
+{
+ switch (mode) {
+ case EFI_RESET_COLD:
+ case EFI_RESET_WARM:
+ case EFI_RESET_SHUTDOWN:
+ case EFI_RESET_PLATFORM_SPECIFIC:
+ break;
+ default:
+ printk("efi: invalid reboot mode %d\n", mode);
+ return;
+ }
+
+ efi.reset_system(mode, EFI_SUCCESS, 0, NULL);
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 153df45..eed69c9 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -283,6 +283,7 @@ typedef struct {
#define EFI_RESET_COLD 0
#define EFI_RESET_WARM 1
#define EFI_RESET_SHUTDOWN 2
+#define EFI_RESET_PLATFORM_SPECIFIC 3

/*
* EFI Runtime Services table
@@ -591,6 +592,7 @@ extern void efi_map_pal_code (void);
extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg);
extern void efi_gettimeofday (struct timespec *ts);
extern void efi_enter_virtual_mode (void); /* switch EFI to virtual mode, if possible */
+extern void efi_reboot(int mode);
#ifdef CONFIG_X86
extern void efi_late_init(void);
extern void efi_free_boot_services(void);
--
1.8.1.4

2013-10-16 13:51:48

by Matt Fleming

Subject: [PATCH 4/5] efi: Move efi_status_to_err() to efi.h

From: Matt Fleming <[email protected]>

Move efi_status_to_err() into the efi.h header as it's generally useful
in all bits of EFI code where there is a need to convert an efi_status_t
to a kernel error value.

Signed-off-by: Matt Fleming <[email protected]>
---
drivers/firmware/efi/vars.c | 33 ---------------------------------
include/linux/efi.h | 33 +++++++++++++++++++++++++++++++++
2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index ac9a4ea..6ff19ff 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -237,39 +237,6 @@ check_var_size(u32 attributes, unsigned long size)
return fops->query_variable_store(attributes, size);
}

-static int efi_status_to_err(efi_status_t status)
-{
- int err;
-
- switch (status) {
- case EFI_SUCCESS:
- err = 0;
- break;
- case EFI_INVALID_PARAMETER:
- err = -EINVAL;
- break;
- case EFI_OUT_OF_RESOURCES:
- err = -ENOSPC;
- break;
- case EFI_DEVICE_ERROR:
- err = -EIO;
- break;
- case EFI_WRITE_PROTECTED:
- err = -EROFS;
- break;
- case EFI_SECURITY_VIOLATION:
- err = -EACCES;
- break;
- case EFI_NOT_FOUND:
- err = -ENOENT;
- break;
- default:
- err = -EINVAL;
- }
-
- return err;
-}
-
static bool variable_is_present(efi_char16_t *variable_name, efi_guid_t *vendor,
struct list_head *head)
{
diff --git a/include/linux/efi.h b/include/linux/efi.h
index eed69c9..6a10640 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -753,6 +753,39 @@ static inline void memrange_efi_to_native(u64 *addr, u64 *npages)
*addr &= PAGE_MASK;
}

+static inline int efi_status_to_err(efi_status_t status)
+{
+ int err;
+
+ switch (status) {
+ case EFI_SUCCESS:
+ err = 0;
+ break;
+ case EFI_INVALID_PARAMETER:
+ err = -EINVAL;
+ break;
+ case EFI_OUT_OF_RESOURCES:
+ err = -ENOSPC;
+ break;
+ case EFI_DEVICE_ERROR:
+ err = -EIO;
+ break;
+ case EFI_WRITE_PROTECTED:
+ err = -EROFS;
+ break;
+ case EFI_SECURITY_VIOLATION:
+ err = -EACCES;
+ break;
+ case EFI_NOT_FOUND:
+ err = -ENOENT;
+ break;
+ default:
+ err = -EINVAL;
+ }
+
+ return err;
+}
+
/*
* EFI Variable support.
*
--
1.8.1.4
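
For illustration only, here is how a caller outside vars.c might use efi_status_to_err() once it lives in efi.h. This is a userspace sketch, not kernel code: the status values below are placeholders rather than the real UEFI encodings, and mock_get_variable() and read_setting() are hypothetical names invented for the example. Only the status-to-errno mapping mirrors the patch.

```c
#include <errno.h>

/*
 * Stand-in status type and codes for this sketch only. The numeric
 * values are placeholders, not the encodings in include/linux/efi.h.
 */
typedef unsigned long efi_status_t;

enum {
	EFI_SUCCESS = 0,
	EFI_INVALID_PARAMETER,
	EFI_OUT_OF_RESOURCES,
	EFI_DEVICE_ERROR,
	EFI_WRITE_PROTECTED,
	EFI_SECURITY_VIOLATION,
	EFI_NOT_FOUND,
	EFI_BUFFER_TOO_SMALL,	/* deliberately not handled by the helper */
};

/* Same mapping as the helper moved by the patch. */
static inline int efi_status_to_err(efi_status_t status)
{
	switch (status) {
	case EFI_SUCCESS:		return 0;
	case EFI_INVALID_PARAMETER:	return -EINVAL;
	case EFI_OUT_OF_RESOURCES:	return -ENOSPC;
	case EFI_DEVICE_ERROR:		return -EIO;
	case EFI_WRITE_PROTECTED:	return -EROFS;
	case EFI_SECURITY_VIOLATION:	return -EACCES;
	case EFI_NOT_FOUND:		return -ENOENT;
	default:			return -EINVAL;
	}
}

/* Hypothetical firmware call: succeeds only if the variable exists. */
static efi_status_t mock_get_variable(int present)
{
	return present ? EFI_SUCCESS : EFI_NOT_FOUND;
}

/* Hypothetical caller: hands a kernel-style error code to its own caller. */
int read_setting(int present)
{
	return efi_status_to_err(mock_get_variable(present));
}
```

Any status the switch does not list, such as EFI_BUFFER_TOO_SMALL here, falls through to -EINVAL, so callers that care about a specific status still have to test for it before converting.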

2013-10-16 13:52:12

by Matt Fleming

Subject: [PATCH 2/5] efi: Introduce a Runtime Services lock

From: Matt Fleming <[email protected]>

Section 7.1 Runtime Services Rules and Restrictions of the UEFI spec
describes how many of the runtime services are not reentrant. We need a
method of prohibiting functions from being called concurrently.

We've managed to get away without requiring a runtime services lock
until now because most of the interactions with EFI involve EFI
variables, and those operations are already serialised with
__efivars->lock.

This change does mean that we can avoid acquiring __efivars->lock where
the only reason we acquired it in the first place was to prevent
concurrent execution of the EFI variable services, e.g.
efivar_entry_size().

The spec makes allowances for accessing the runtime services from NMI
context. In that case we want to avoid grabbing the runtime lock to
ensure we make forward progress, e.g. in efi_pstore_write().

Cc: Leif Lindholm <[email protected]>
Cc: Tony Luck <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
---
arch/ia64/kernel/efi.c | 33 +++++++++++++-
arch/x86/platform/efi/efi.c | 108 ++++++++++++++++++++++++++++++++++++++------
drivers/firmware/efi/efi.c | 8 ++++
drivers/firmware/efi/vars.c | 19 ++------
include/linux/efi.h | 2 +
5 files changed, 139 insertions(+), 31 deletions(-)

diff --git a/arch/ia64/kernel/efi.c b/arch/ia64/kernel/efi.c
index da5b462..cb4f13e 100644
--- a/arch/ia64/kernel/efi.c
+++ b/arch/ia64/kernel/efi.c
@@ -56,7 +56,21 @@ extern efi_status_t efi_call_phys (void *, ...);
static efi_runtime_services_t *runtime;
static u64 mem_limit = ~0UL, max_addr = ~0UL, min_addr = 0UL;

-#define efi_call_virt(f, args...) (*(f))(args)
+#define efi_call_virt(f, args...) \
+({ \
+ unsigned long __flags; \
+ efi_status_t __status; \
+ bool __nmi = in_nmi(); \
+ \
+ if (!__nmi) \
+ spin_lock_irqsave(&efi_runtime_lock, __flags); \
+ \
+ __status = (*(f))(args); \
+ \
+ if (!__nmi) \
+ spin_unlock_irqrestore(&efi_runtime_lock, __flags); \
+ __status; \
+})

#define STUB_GET_TIME(prefix, adjust_arg) \
static efi_status_t \
@@ -192,6 +206,21 @@ prefix##_get_next_high_mono_count (u32 *count) \
return ret; \
}

+#define efi_call_reset_phys efi_call_phys
+#define efi_call_reset_virt(f, args...) \
+({ \
+ unsigned long __flags; \
+ bool __nmi = in_nmi(); \
+ \
+ if (__nmi) \
+ spin_lock_irqsave(&efi_runtime_lock, __flags); \
+ \
+ (*f)(args); \
+ \
+ if (__nmi) \
+ spin_unlock_irqrestore(&efi_runtime_lock, __flags); \
+})
+
#define STUB_RESET_SYSTEM(prefix, adjust_arg) \
static void \
prefix##_reset_system (int reset_type, efi_status_t status, \
@@ -204,7 +233,7 @@ prefix##_reset_system (int reset_type, efi_status_t status, \
adata = adjust_arg(data); \
\
ia64_save_scratch_fpregs(fr); \
- efi_call_##prefix( \
+ efi_call_reset_##prefix( \
(efi_reset_system_t *) __va(runtime->reset_system), \
reset_type, status, data_size, adata); \
/* should not return, but just in case... */ \
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 1d3372a..8944e2c 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -117,7 +117,11 @@ static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
efi_status_t status;

spin_lock_irqsave(&rtc_lock, flags);
+
+ spin_lock(&efi_runtime_lock);
status = efi_call_virt2(get_time, tm, tc);
+ spin_unlock(&efi_runtime_lock);
+
spin_unlock_irqrestore(&rtc_lock, flags);
return status;
}
@@ -128,7 +132,11 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
efi_status_t status;

spin_lock_irqsave(&rtc_lock, flags);
+
+ spin_lock(&efi_runtime_lock);
status = efi_call_virt1(set_time, tm);
+ spin_unlock(&efi_runtime_lock);
+
spin_unlock_irqrestore(&rtc_lock, flags);
return status;
}
@@ -141,8 +149,12 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t *enabled,
efi_status_t status;

spin_lock_irqsave(&rtc_lock, flags);
+
+ spin_lock(&efi_runtime_lock);
status = efi_call_virt3(get_wakeup_time,
enabled, pending, tm);
+ spin_unlock(&efi_runtime_lock);
+
spin_unlock_irqrestore(&rtc_lock, flags);
return status;
}
@@ -153,8 +165,12 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t enabled, efi_time_t *tm)
efi_status_t status;

spin_lock_irqsave(&rtc_lock, flags);
+
+ spin_lock(&efi_runtime_lock);
status = efi_call_virt2(set_wakeup_time,
enabled, tm);
+ spin_unlock(&efi_runtime_lock);
+
spin_unlock_irqrestore(&rtc_lock, flags);
return status;
}
@@ -165,17 +181,31 @@ static efi_status_t virt_efi_get_variable(efi_char16_t *name,
unsigned long *data_size,
void *data)
{
- return efi_call_virt5(get_variable,
- name, vendor, attr,
- data_size, data);
+ unsigned long flags;
+ efi_status_t status;
+
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+ status = efi_call_virt5(get_variable,
+ name, vendor, attr,
+ data_size, data);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t virt_efi_get_next_variable(unsigned long *name_size,
efi_char16_t *name,
efi_guid_t *vendor)
{
- return efi_call_virt3(get_next_variable,
- name_size, name, vendor);
+ unsigned long flags;
+ efi_status_t status;
+
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+ status = efi_call_virt3(get_next_variable,
+ name_size, name, vendor);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t virt_efi_set_variable(efi_char16_t *name,
@@ -184,9 +214,21 @@ static efi_status_t virt_efi_set_variable(efi_char16_t *name,
unsigned long data_size,
void *data)
{
- return efi_call_virt5(set_variable,
- name, vendor, attr,
- data_size, data);
+ unsigned long flags;
+ efi_status_t status;
+ bool nmi = in_nmi();
+
+ if (!nmi)
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+
+ status = efi_call_virt5(set_variable,
+ name, vendor, attr,
+ data_size, data);
+
+ if (!nmi)
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t virt_efi_query_variable_info(u32 attr,
@@ -194,16 +236,35 @@ static efi_status_t virt_efi_query_variable_info(u32 attr,
u64 *remaining_space,
u64 *max_variable_size)
{
+ unsigned long flags;
+ efi_status_t status;
+ bool nmi = in_nmi();
+
if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION)
return EFI_UNSUPPORTED;

- return efi_call_virt4(query_variable_info, attr, storage_space,
- remaining_space, max_variable_size);
+ if (!nmi)
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+
+ status = efi_call_virt4(query_variable_info, attr, storage_space,
+ remaining_space, max_variable_size);
+
+ if (!nmi)
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t virt_efi_get_next_high_mono_count(u32 *count)
{
- return efi_call_virt1(get_next_high_mono_count, count);
+ unsigned long flags;
+ efi_status_t status;
+
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+ status = efi_call_virt1(get_next_high_mono_count, count);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static void virt_efi_reset_system(int reset_type,
@@ -211,18 +272,30 @@ static void virt_efi_reset_system(int reset_type,
unsigned long data_size,
efi_char16_t *data)
{
+ unsigned long flags;
+ bool nmi = in_nmi();
+
+ spin_lock_irqsave(&efi_runtime_lock, flags);
efi_call_virt4(reset_system, reset_type, status,
data_size, data);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
}

static efi_status_t virt_efi_update_capsule(efi_capsule_header_t **capsules,
unsigned long count,
unsigned long sg_list)
{
+ unsigned long flags;
+ efi_status_t status;
+
if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION)
return EFI_UNSUPPORTED;

- return efi_call_virt3(update_capsule, capsules, count, sg_list);
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+ status = efi_call_virt3(update_capsule, capsules, count, sg_list);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t virt_efi_query_capsule_caps(efi_capsule_header_t **capsules,
@@ -230,11 +303,18 @@ static efi_status_t virt_efi_query_capsule_caps(efi_capsule_header_t **capsules,
u64 *max_size,
int *reset_type)
{
+ unsigned long flags;
+ efi_status_t status;
+
if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION)
return EFI_UNSUPPORTED;

- return efi_call_virt4(query_capsule_caps, capsules, count, max_size,
- reset_type);
+ spin_lock_irqsave(&efi_runtime_lock, flags);
+ status = efi_call_virt4(query_capsule_caps, capsules, count, max_size,
+ reset_type);
+ spin_unlock_irqrestore(&efi_runtime_lock, flags);
+
+ return status;
}

static efi_status_t __init phys_efi_set_virtual_address_map(
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 2e2fbde..772f559 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -35,6 +35,14 @@ struct efi __read_mostly efi = {
};
EXPORT_SYMBOL(efi);

+/*
+ * 'efi_runtime_lock' protects against concurrently invoking the EFI
+ * runtime services, many of which are not reentrant.
+ *
+ * You must disable interrupts when acquiring this lock.
+ */
+DEFINE_SPINLOCK(efi_runtime_lock);
+
static struct kobject *efi_kobj;
static struct kobject *efivars_kobj;

diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index 391c67b..ac9a4ea 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -376,7 +376,8 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
return -ENOMEM;
}

- spin_lock_irq(&__efivars->lock);
+ if (atomic)
+ spin_lock_irq(&__efivars->lock);

/*
* Per EFI spec, the maximum storage allocated for both
@@ -391,9 +392,6 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
&vendor_guid);
switch (status) {
case EFI_SUCCESS:
- if (!atomic)
- spin_unlock_irq(&__efivars->lock);
-
variable_name_size = var_name_strnsize(variable_name,
variable_name_size);

@@ -409,8 +407,6 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
variable_is_present(variable_name, &vendor_guid, head)) {
dup_variable_bug(variable_name, &vendor_guid,
variable_name_size);
- if (!atomic)
- spin_lock_irq(&__efivars->lock);

status = EFI_NOT_FOUND;
break;
@@ -420,9 +416,6 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
if (err)
status = EFI_NOT_FOUND;

- if (!atomic)
- spin_lock_irq(&__efivars->lock);
-
break;
case EFI_NOT_FOUND:
break;
@@ -435,7 +428,8 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),

} while (status != EFI_NOT_FOUND);

- spin_unlock_irq(&__efivars->lock);
+ if (atomic)
+ spin_unlock_irq(&__efivars->lock);

kfree(variable_name);

@@ -702,11 +696,8 @@ int efivar_entry_size(struct efivar_entry *entry, unsigned long *size)

*size = 0;

- spin_lock_irq(&__efivars->lock);
status = ops->get_variable(entry->var.VariableName,
&entry->var.VendorGuid, NULL, size, NULL);
- spin_unlock_irq(&__efivars->lock);
-
if (status != EFI_BUFFER_TOO_SMALL)
return efi_status_to_err(status);

@@ -754,11 +745,9 @@ int efivar_entry_get(struct efivar_entry *entry, u32 *attributes,
const struct efivar_operations *ops = __efivars->ops;
efi_status_t status;

- spin_lock_irq(&__efivars->lock);
status = ops->get_variable(entry->var.VariableName,
&entry->var.VendorGuid,
attributes, size, data);
- spin_unlock_irq(&__efivars->lock);

return efi_status_to_err(status);
}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index bc5687d..153df45 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -812,6 +812,8 @@ struct efi_simple_text_output_protocol {

extern struct list_head efivar_sysfs_list;

+extern spinlock_t efi_runtime_lock;
+
static inline void
efivar_unregister(struct efivar_entry *var)
{
--
1.8.1.4

2013-10-16 14:19:51

by Matthew Garrett

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

On Wed, Oct 16, 2013 at 02:51:00PM +0100, Matt Fleming wrote:

> + Many EFI machines have buggy implementations of the UpdateCapsule()
> + runtime service. This option will enable code that may not function
> + correctly with your firmware.

Where by "May not function correctly" you mean "May crash the system"?
I'm a little uneasy having this run by default if enabled, even if it's
disabled by default in the config.

--
Matthew Garrett | [email protected]

2013-10-16 14:52:30

by Luck, Tony

Subject: RE: [PATCH 5/5] efi: Capsule update support and pstore backend

> Where by "May not function correctly" you mean "May crash the system"?
> I'm a little uneasy having this run by default if enabled, even if it's
> disabled by default in the config.

There's also an "either/or" choice between using efi-capsule with pstore, and the
traditional kexec/kdump method for getting a memory dump from a crash. We
have to go through a reset to save the capsule - but we don't want a reset for
kexec. Perhaps we can pass the reset parameters through the kexec path to
the new kernel to make it do the right kind of reset ... but the value of the capsule
dump is minimal if kdump managed to dump everything.

Bottom line: users need to make an informed choice to use efi-capsule + pstore.

-Tony

2013-10-16 20:14:22

by Andi Kleen

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

> + It should be noted that enabling this option will pass a capsule
> + to the firmware on every boot. Some firmware will not allow a
> + user to enter the BIOS setup when a capsule has been registered
> + on the previous boot.

That sounds like a problem. Can this be fixed to only do it on demand?

> +
> + Many EFI machines have buggy implementations of the UpdateCapsule()
> + runtime service. This option will enable code that may not function
> + correctly with your firmware.

Do we need white/black lists?

Controlling such things from CONFIG is not very good.

-Andi

2013-10-16 23:37:12

by Seiji Aguchi

Subject: RE: [PATCH 2/5] efi: Introduce a Runtime Services lock

> +#define efi_call_reset_virt(f, args...) \
> +({ \
> + unsigned long __flags; \
> + bool __nmi = in_nmi(); \
> + \
> + if (__nmi) \
> + spin_lock_irqsave(&efi_runtime_lock, __flags); \

If the lock is not taken in NMI context, a runtime service may still run
concurrently with a non-NMI caller, as follows.
- In NMI context, cpu0 calls a runtime service (no lock is held).
- In non-NMI context, cpu1 takes the lock and calls the runtime service.

To avoid this, it is better to use a trylock in NMI context:

if (in_nmi())
	spin_trylock_irqsave(&efi_runtime_lock, flags);
else
	spin_lock_irqsave(&efi_runtime_lock, flags);

Please see commit abd4d5587be911f63592537284dad78766d97d62,
which was introduced to pstore by Don Zickus.
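
To make the idea concrete, here is a userspace sketch of the trylock policy. It is only a model under stated assumptions: a C11 atomic_flag stands in for efi_runtime_lock, the in_nmi argument stands in for in_nmi(), and fake_runtime_service() is a hypothetical placeholder for a firmware call. The -EBUSY return on contention is my assumption about what a caller would want, not something the patch defines.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for efi_runtime_lock. */
static atomic_flag runtime_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool runtime_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&runtime_lock,
						  memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void runtime_lock_spin(void)
{
	while (!runtime_trylock())
		;
}

static void runtime_unlock(void)
{
	atomic_flag_clear_explicit(&runtime_lock, memory_order_release);
}

/* Hypothetical placeholder for a non-reentrant firmware service. */
static int fake_service_calls;

static int fake_runtime_service(void)
{
	fake_service_calls++;
	return 0;
}

/*
 * In NMI context we must not spin (the lock holder may be the context
 * we interrupted), so try once and give up if the lock is busy.
 * Everywhere else, wait for the lock as usual.
 */
int call_runtime_service(bool in_nmi)
{
	int ret;

	if (in_nmi) {
		if (!runtime_trylock())
			return -EBUSY;
	} else {
		runtime_lock_spin();
	}

	ret = fake_runtime_service();
	runtime_unlock();
	return ret;
}
```

With this shape the NMI path never enters the service while another CPU holds the lock, at the cost of sometimes dropping the NMI-time call.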

Seiji


2013-10-17 00:06:52

by Seiji Aguchi

Subject: RE: [PATCH 5/5] efi: Capsule update support and pstore backend

> There's also an "either/or" choice between using efi-capsule with pstore, and the
> traditional kexec/kdump method for getting a memory dump from a crash. We
> have to go through a reset to save the capsule - but we don't want a reset for
> kexec. Perhaps we can pass the reset parameters through the kexec path to
> the new kernel to make it do the right kind of reset ... but the value of the capsule
> dump is minimal if kdump managed to dump everything.

Tony,

I tried to log kmsg into the kexec path months ago.
It was rejected due to an implementation problem.

But, as Eric said, it should be OK if it is implemented in the kdump kernel.

http://marc.info/?l=linux-kernel&m=136917686732183&w=2

The only problem with kdump here is the implementation in the initial
ram disk. Fixing the initial ramdisk so it logs to kmsg before it
touches scarier hardware should be the solution.

Seiji

2013-10-17 11:55:21

by Matt Fleming

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

On Wed, 16 Oct, at 03:19:39PM, Matthew Garrett wrote:
> On Wed, Oct 16, 2013 at 02:51:00PM +0100, Matt Fleming wrote:
>
> > + Many EFI machines have buggy implementations of the UpdateCapsule()
> > + runtime service. This option will enable code that may not function
> > + correctly with your firmware.
>
> Where by "May not function correctly" you mean "May crash the system"?

Bingo.

> I'm a little uneasy having this run by default if enabled, even if it's
> disabled by default in the config.

What would be the canonical way to enable this feature then? Have a file
along the lines of /sys/kernel/debug/capsule_enable, where a user would
be required to,

echo 1 > /sys/kernel/debug/capsule_enable

to turn on the functionality?
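
Roughly, the gating I have in mind looks like the userspace sketch below. Everything in it is hypothetical: capsule_enable_store() models what a write handler for such a file might do, and capsule_register() models the point where a capsule would be handed to the firmware. Neither function exists in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Off by default: nothing is passed to the firmware until a user opts in. */
static bool capsule_enabled;

/* Counts how many capsules this sketch would have handed to the firmware. */
static int capsules_registered;

/*
 * Hypothetical write handler for the capsule_enable file.
 * Accepts "0" or "1", with or without the trailing newline echo adds.
 */
int capsule_enable_store(const char *buf)
{
	if (!strcmp(buf, "1") || !strcmp(buf, "1\n"))
		capsule_enabled = true;
	else if (!strcmp(buf, "0") || !strcmp(buf, "0\n"))
		capsule_enabled = false;
	else
		return -EINVAL;

	return 0;
}

/* Hypothetical registration point: refuses to act unless enabled. */
int capsule_register(void)
{
	if (!capsule_enabled)
		return -EPERM;

	capsules_registered++;
	return 0;
}
```

The point of the design is that building the code in is no longer enough to trigger UpdateCapsule(); a user has to ask for it on the running system.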

--
Matt Fleming, Intel Open Source Technology Center

2013-10-17 12:05:25

by Matt Fleming

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

On Wed, 16 Oct, at 02:52:25PM, Luck, Tony wrote:
> > Where by "May not function correctly" you mean "May crash the system"?
> > I'm a little uneasy having this run by default if enabled, even if it's
> > disabled by default in the config.
>
> There's also an "either/or" choice between using efi-capsule with pstore, and the
> traditional kexec/kdump method for getting a memory dump from a crash. We
> have to go through a reset to save the capsule - but we don't want a reset for
> kexec. Perhaps we can pass the reset parameters through the kexec path to
> the new kernel to make it do the right kind of reset ... but the value of the capsule
> dump is minimal if kdump managed to dump everything.

I admit that using kexec with the EFI capsule + pstore code is not
something I'd considered. Though as you say, if you manage to jump to
your crash kernel I'm not sure how much use the capsule would be.

--
Matt Fleming, Intel Open Source Technology Center

2013-10-17 12:14:07

by Matt Fleming

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

On Wed, 16 Oct, at 01:14:16PM, Andi Kleen wrote:
> > + It should be noted that enabling this option will pass a capsule
> > + to the firmware on every boot. Some firmware will not allow a
> > + user to enter the BIOS setup when a capsule has been registered
> > + on the previous boot.
>
> That sounds like a problem. Can this be fixed to only do it on demand?

Incorporating something along the lines of,

echo 1 > /sys/kernel/debug/capsule_enable

would fix this. Assuming that's the kind of thing Matthew had in mind.

> > +
> > + Many EFI machines have buggy implementations of the UpdateCapsule()
> > + runtime service. This option will enable code that may not function
> > + correctly with your firmware.
>
> Do we need white/black lists?
>
> Controlling such things from CONFIG is not very good.

We have, so far, avoided creating black/white lists in any of the EFI
code and because this is an optional debug feature and is in no way
required for a user's machine to boot, I'm not sure how much value there
would be in trying to maintain such a list.

On the other hand, the case could be made for failing more gracefully if
we discover the QueryCapsuleCapabilities()/UpdateCapsule() runtime
services are broken when enabling this feature, i.e. through gratuitous
use of FW_BUG.
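
As a sketch of what failing gracefully could look like, the userspace model below probes the capability query before enabling anything. The names capsule_probe(), good_fw(), lying_fw() and absent_fw() are hypothetical, the status values are placeholders, and the plausibility checks are my assumption about what is worth testing. The FW_BUG string imitates the kernel's log prefix.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

typedef unsigned long efi_status_t;
#define EFI_SUCCESS		0UL
#define EFI_UNSUPPORTED		3UL	/* placeholder value for this sketch */

#define EFI_RESET_COLD		0
#define EFI_RESET_WARM		1
#define EFI_RESET_SHUTDOWN	2

#define FW_BUG "[Firmware Bug]: "

/* Simplified stand-in for the QueryCapsuleCapabilities() service. */
typedef efi_status_t (*query_caps_fn)(uint64_t *max_size, int *reset_type);

/*
 * Ask the firmware what it supports and refuse to enable the feature
 * unless the answer is both successful and plausible.
 */
int capsule_probe(query_caps_fn query, uint64_t needed)
{
	uint64_t max_size = 0;
	int reset_type = -1;
	efi_status_t status;

	status = query(&max_size, &reset_type);
	if (status != EFI_SUCCESS) {
		fprintf(stderr, "capsule: query failed (status %lu)\n", status);
		return -ENODEV;
	}

	if (max_size < needed ||
	    reset_type < EFI_RESET_COLD || reset_type > EFI_RESET_SHUTDOWN) {
		fprintf(stderr, FW_BUG "implausible capsule capabilities\n");
		return -ENODEV;
	}

	return 0;
}

/* Well-behaved firmware: 1 MiB capsules, warm reset. */
static efi_status_t good_fw(uint64_t *max_size, int *reset_type)
{
	*max_size = 1 << 20;
	*reset_type = EFI_RESET_WARM;
	return EFI_SUCCESS;
}

/* Broken firmware: claims success but reports a zero maximum size. */
static efi_status_t lying_fw(uint64_t *max_size, int *reset_type)
{
	*max_size = 0;
	*reset_type = EFI_RESET_WARM;
	return EFI_SUCCESS;
}

/* Firmware without capsule support. */
static efi_status_t absent_fw(uint64_t *max_size, int *reset_type)
{
	(void)max_size;
	(void)reset_type;
	return EFI_UNSUPPORTED;
}
```

A probe like this cannot catch firmware that answers the query sensibly and then misbehaves in UpdateCapsule() itself, so it complements an explicit opt-in rather than replacing it.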

--
Matt Fleming, Intel Open Source Technology Center

2013-10-17 23:18:11

by Andi Kleen

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

> But, as Eric said, it should be OK if it is implemented in the kdump kernel.

kdump doesn't work for a lot of use cases (too much memory consumption)

-Andi

2013-10-17 23:19:31

by Andi Kleen

Subject: Re: [PATCH 5/5] efi: Capsule update support and pstore backend

> > I'm a little uneasy having this run by default if enabled, even if it's
> > disabled by default in the config.
>
> What would be the canonical way to enable this feature then? Have a file

Whitelist systems and an option to force-enable.
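
Something like the following userspace sketch, where the table entries, the system_id string and capsule_allowed() are all made up for illustration. In the kernel the match would presumably go through the DMI tables and the override through a boot or module parameter, but that is an assumption on my part.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of systems known to handle capsules correctly. */
static const char *const capsule_whitelist[] = {
	"ExampleVendor ModelA",
	"ExampleVendor ModelB",
};

/*
 * Allow the feature on whitelisted systems, or anywhere if the user
 * explicitly forces it.
 */
bool capsule_allowed(const char *system_id, bool force)
{
	size_t i;

	if (force)
		return true;

	for (i = 0; i < sizeof(capsule_whitelist) / sizeof(capsule_whitelist[0]); i++) {
		if (!strcmp(system_id, capsule_whitelist[i]))
			return true;
	}

	return false;
}
```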

-Andi