2015-08-10 17:16:52

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH 0/3] SysFS driver for QEMU firmware config device (fw_cfg)

From: "Gabriel Somlo" <[email protected]>

This patch set makes QEMU fw_cfg blobs available for viewing (read-only)
via SysFS.

Several different architectures supported by QEMU are set up with a
"firmware configuration" (fw_cfg) device, used to pass configuration
"blobs" into the guest by the host running QEMU.

Historically, these config blobs were mostly of interest to the guest
BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
the command line, which makes them potentially interesting to userspace
(e.g. for passing early boot environment variables, etc.).

In addition to cc-ing the people and lists indicated by get-maintainer.pl,
I've added a few extra lists suggested by Matt Fleming on the qemu-devel
list, as well as the qemu-devel list itself.

Also cc-ing kernelnewbies, as this is my very first kenel contribution,
so please go easy on me for whatever silly n00b mistakes I might have still
missed, in spite of trying hard to do all my homework properly... :)

The series consists of three patches:

1/3 - probes for the qemu fw_cfg device in locations known to work on
the supported architectures, in decreasing order of "likelihood".

While it *may* be possible to detect the presence of fw_cfg via
acpi or dtb (on x86 and arm, respectively), there's no way I know
of attempting that on sun4 and ppc/mac, so I've stuck with simply
probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
in fw_cfg.c. I could use some advice on how else that could be
done more elegantly, if needed.

Upon successfully detecting a present fw_cfg device, we set up
/sys/firmware/fw_cfg/by_select entries for each blob available
on the fw_cfg device.

2/3 - export kset_find_obj() (in lib/kobject.c) for use with modules,
which will come in handy in patch #3, below.

3/3 - add a "user friendly" way of listing fw_cfg blobs by name rather
than by unique selector key.

Since fw_cfg blob names are traditionally set up to look like
path names, it would be nice to mirror that fact when displaying
them under /sys/firmware/fw_cfg in a "by_name" subdirectory.

I'm using ksets to reflect subdirectories matching "dirname"
tokens separated by '/' within each fw_cfg blob "filename",
and symlinks into the "by_select" subdirectory for each "basename".

Since the fw_cfg device doesn't enforce that blob names have
well-behaved and non-conflicting names, it is possible (though
unlikely) that there will be blobs named:

"etc/foo/bar"
and
"etc/foo"

where "foo" will try to be both a subdirectory AND a symlink under
/sys/firmware/fw_cfg/by_name/etc/. In such an event, the latter
fw_cfg blob will simply be skipped from having an entry created
under the user-friendly "by_name" subdirectory, remaining listed
only as an entry under the "by_select" subdirectory.

I have tested this on x86_64 and armv7hl+lpae kernels. Builds and installs
OK as both a module and linked directly into the kernel.

TIA for the feedback and advice,
--Gabriel

Gabriel Somlo (3):
firmware: introduce sysfs driver for QEMU's fw_cfg device
kobject: export kset_find_obj() to be used from modules
firmware: fw_cfg: create directory hierarchy for fw_cfg file names

Documentation/ABI/testing/sysfs-firmware-fw_cfg | 211 +++++++++
drivers/firmware/Kconfig | 10 +
drivers/firmware/Makefile | 1 +
drivers/firmware/fw_cfg.c | 542 ++++++++++++++++++++++++
lib/kobject.c | 1 +
5 files changed, 765 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-firmware-fw_cfg
create mode 100644 drivers/firmware/fw_cfg.c

--
2.4.3


2015-08-10 17:16:30

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH 1/3] firmware: introduce sysfs driver for QEMU's fw_cfg device

From: "Gabriel Somlo" <[email protected]>

Make fw_cfg entries of type "file" available via sysfs. Entries
are listed under /sys/firmware/fw_cfg/by_select, in folders named
after each entry's selector key. Filename, selector value, and
size read-only attributes are included for each entry. Also, a
"raw" attribute allows retrieval of the full binary content of
each entry.

This patch also provides a documentation file outlining the
guest-side "hardware" interface exposed by the QEMU fw_cfg device.

Signed-off-by: Gabriel Somlo <[email protected]>
---
Documentation/ABI/testing/sysfs-firmware-fw_cfg | 169 +++++++++
drivers/firmware/Kconfig | 10 +
drivers/firmware/Makefile | 1 +
drivers/firmware/fw_cfg.c | 438 ++++++++++++++++++++++++
4 files changed, 618 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-firmware-fw_cfg
create mode 100644 drivers/firmware/fw_cfg.c

diff --git a/Documentation/ABI/testing/sysfs-firmware-fw_cfg b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
new file mode 100644
index 0000000..3a7e7f2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
@@ -0,0 +1,169 @@
+What: /sys/firmware/fw_cfg/
+Date: August 2015
+Contact: Gabriel Somlo <[email protected]>
+Description:
+ Several different architectures supported by QEMU (x86, arm,
+ sun4*, ppc/mac) are provisioned with a firmware configuration
+ (fw_cfg) device, used by the host to provide configuration data
+ to the starting guest. While most of this data is meant for use
+ by the guest BIOS, starting with QEMU v2.4, guest VMs may be
+ started with arbitrary fw_cfg entries supplied directly on the
+ command line, which therefore may be of interest to userspace.
+
+ === Guest-side Hardware Interface ===
+
+ The fw_cfg device is available to guest VMs as a pair (control
+ and data) of registers, accessible as either a IO ports or as
+ MMIO addresses, depending on the architecture.
+
+ --- Control Register ---
+
+ Width: 16-bit
+ Access: Write-Only
+ Endianness: LE (if IOport) or BE (if MMIO)
+
+ A write to the control register selects the index for one of
+ the firmware configuration items (or "blobs") available on the
+ fw_cfg device, which can subsequently be read from the data
+ register.
+
+ Each time the control register is written, an data offset
+ internal to the fw_cfg device will be set to zero. This data
+ offset impacts which portion of the selected fw_cfg blob is
+ accessed by reading the data register, as explained below.
+
+ --- Data Register ---
+
+ Width: 8-bit (if IOport), or 8/16/32/64-bit (if MMIO)
+ Access: Read-Only
+ Endianness: string preserving
+
+ The data register allows access to an array of bytes which
+ represent the fw_cfg blob last selected by a write to the
+ control register.
+
+ Immediately following a write to the control register, the data
+ offset will be set to zero. Each successful read access to the
+ data register will increment the data offset by the appropriate
+ access width.
+
+ Each fw_cfg blob has a maximum associated data length. Once the
+ data offset exceeds this maximum length, any subsequent reads
+ via the data register will return 0x00.
+
+ An N-byte wide read of the data register will return the next
+ available N bytes of the selected fw_cfg blob, as a substring,
+ in increasing address order, similar to memcpy(), zero-padded
+ if necessary should the maximum data length of the selected
+ item be reached, as described above.
+
+ --- Per-arch Register Details ---
+
+ -------------------------------------------------------------
+ arch access base ctrl ctrl data data
+ mode address offset endian offset width
+ max.
+ -------------------------------------------------------------
+ x86 IOport 0x510 0 LE 1 8
+ x86_64 IOport 0x510 0 LE 1 8
+ arm MMIO 0x9020000 8 BE 0 64
+ sun4u IOport 0x510 0 LE 1 8
+ sun4m MMIO 0xd00000510 0 BE 2 8
+ ppc/mac MMIO 0xf0000510 0 BE 2 8
+ -------------------------------------------------------------
+
+ NOTE: On platforms where the fw_cfg registers are exposed as
+ IO ports, the data port number will always be one greater than
+ the port number of the control register. I.e., the two ports
+ are overlapping, and can not be mapped separately.
+
+ === Firmware Configuration Items of Interest ===
+
+ Originally, the index key, size, and formatting of blobs in
+ fw_cfg was hard coded by mutual agreement between QEMU on the
+ host side, and the BIOS running on the guest. Later on, a file
+ transfer interface was added: by reading a special blob, the
+ fw_cfg consumer can retrieve a list of records containing the
+ name, selector key, and size of further fw_cfg blobs made
+ available by the host. Below we describe three fw_cfg blobs
+ of interest to the sysfs driver.
+
+ --- Signature (Key 0x0000, FW_CFG_SIGNATURE) ---
+
+ The presence of the fw_cfg device can be verified by selecting
+ the signature blob by writing 0x0000 to the control register,
+ and reading four bytes from the data register. If the fw_cfg
+ device is present, the four bytes read will match the ASCII
+ characters "QEMU".
+
+ --- Revision (Key 0x0001, FW_CFG_ID) ---
+
+ A 32-bit little-endian unsigned integer, this item is used as
+ an interface revision number.
+
+ --- File Directory (Key 0x0019, FW_CFG_FILE_DIR) ---
+
+ Any fw_cfg blobs stored at key 0x0020 FW_CFG_FILE_FIRST() or
+ higher will have an associated entry in this "directory" blob,
+ which facilitates the discovery of available items by software
+ (e.g. BIOS) running on the guest. The format of the directory
+ blob is shown below.
+
+ NOTE: All integers are stored in big-endian format!
+
+ /* the entire file directory "blob" */
+ struct FWCfgFiles {
+ uint32_t count; /* total number of entries */
+ struct FWCfgFile f[]; /* entry array, see below */
+ };
+
+ /* an individual directory entry, 64 bytes total */
+ struct FWCfgFile {
+ uint32_t size; /* size of referenced blob */
+ uint16_t select; /* blob selector key */
+ uint16_t reserved;
+ char name[56]; /* blob name, nul-term. ascii */
+ };
+
+ === SysFS fw_cfg Interface ===
+
+ The fw_cfg sysfs interface described in this document is only
+ intended to display discoverable blobs (i.e., those registered
+ with the file directory), as there is no way to determine the
+ presence or size of "legacy" blobs (with selector keys between
+ 0x0002 and 0x0018) programmatically.
+
+ All fw_cfg information is shown under:
+
+ /sys/firmware/fw_cfg/
+
+ The only legacy blob displayed is the fw_cfg device revision:
+
+ /sys/firmware/fw_cfg/rev
+
+ --- Discoverable fw_cfg blobs by selector key ---
+
+ All discoverable blobs listed in the fw_cfg file directory are
+ displayed as entries named after their unique selector key
+ value, e.g.:
+
+ /sys/firmware/fw_cfg/by_select/32
+ /sys/firmware/fw_cfg/by_select/33
+ /sys/firmware/fw_cfg/by_select/34
+ ...
+
+ Each such fw_cfg sysfs entry has the following values exported
+ as attributes:
+
+ name : The 56-byte nul-terminated ASCII string used as the
+ blob's 'file name' in the fw_cfg directory.
+ size : The length of the blob, as given in the fw_cfg
+ directory.
+ select : The value of the blob's selector key as given in the
+ fw_cfg directory. This value is the same as used in
+ the parent directory name.
+ how the rest of the entry should be interpreted.
+ raw : The raw bytes of the blob, obtained by selecting the
+ entry via the control register, and reading a number
+ of bytes equal to the blob size from the data
+ register.
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 99c69a3..f5cbe81 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -136,6 +136,16 @@ config QCOM_SCM
bool
depends on ARM || ARM64

+config FW_CFG_SYSFS
+ tristate "QEMU fw_cfg device support in sysfs"
+ depends on SYSFS
+ default n
+ help
+ Say Y or M here to enable the exporting of the QEMU firmware
+ configuration (fw_cfg) file entries via sysfs. Entries are
+ found under /sys/firmware/fw_cfg when this option is enabled
+ and loaded.
+
source "drivers/firmware/broadcom/Kconfig"
source "drivers/firmware/google/Kconfig"
source "drivers/firmware/efi/Kconfig"
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index 4a4b897..b81b46e 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
obj-$(CONFIG_QCOM_SCM) += qcom_scm-32.o
+obj-$(CONFIG_FW_CFG_SYSFS) += fw_cfg.o
CFLAGS_qcom_scm-32.o :=$(call as-instr,.arch_extension sec,-DREQUIRES_SEC=1)

obj-y += broadcom/
diff --git a/drivers/firmware/fw_cfg.c b/drivers/firmware/fw_cfg.c
new file mode 100644
index 0000000..be17411
--- /dev/null
+++ b/drivers/firmware/fw_cfg.c
@@ -0,0 +1,438 @@
+/*
+ * drivers/firmware/fw_cfg.c
+ *
+ * Expose entries from QEMU's firmware configuration (fw_cfg) device in
+ * sysfs (read-only, under "/sys/firmware/fw_cfg/...").
+ *
+ * Copyright 2015 Carnegie Mellon University
+ */
+
+#include <linux/module.h>
+#include <linux/capability.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include <linux/ioport.h>
+#include <linux/ctype.h>
+
+/* selector values for "well-known" fw_cfg entries */
+#define FW_CFG_SIGNATURE 0x00
+#define FW_CFG_ID 0x01
+#define FW_CFG_FILE_DIR 0x19
+
+/* size in bytes of fw_cfg signature */
+#define FW_CFG_SIG_SIZE 4
+
+/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
+#define FW_CFG_MAX_FILE_PATH 56
+
+/* fw_cfg file directory entry type */
+struct fw_cfg_file {
+ uint32_t size;
+ uint16_t select;
+ uint16_t reserved;
+ char name[FW_CFG_MAX_FILE_PATH];
+};
+
+/* fw_cfg device i/o access options type */
+struct fw_cfg_access {
+ phys_addr_t start;
+ uint8_t size;
+ uint8_t ctrl_offset;
+ uint8_t data_offset;
+ bool is_mmio;
+ const char *name;
+};
+
+/* fw_cfg device i/o access available options for known architectures */
+static struct fw_cfg_access fw_cfg_modes[] = {
+ { 0x510, 2, 0, 1, false, "fw_cfg on i386, sun4u" },
+ { 0x9020000, 10, 8, 0, true, "fw_cfg on arm" },
+ { 0xd00000510, 3, 0, 2, true, "fw_cfg on sun4m" },
+ { 0xf0000510, 3, 0, 2, true, "fw_cfg on ppc/mac" },
+ { }
+};
+
+/* fw_cfg device i/o currently selected option set */
+static struct fw_cfg_access *fw_cfg_mode;
+
+/* fw_cfg device i/o register addresses */
+static void __iomem *fw_cfg_dev_base;
+static void __iomem *fw_cfg_dev_ctrl;
+static void __iomem *fw_cfg_dev_data;
+
+/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
+static DEFINE_MUTEX(fw_cfg_dev_lock);
+
+/* pick apropriate endianness for selector key */
+static inline uint16_t fw_cfg_sel_endianness(uint16_t select)
+{
+ return fw_cfg_mode->is_mmio ? cpu_to_be16(select) : cpu_to_le16(select);
+}
+
+/* type for fw_cfg "directory scan" visitor/callback function */
+typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f);
+
+/* run a given callback on each fw_cfg directory entry */
+static int fw_cfg_scan_dir(fw_cfg_file_callback callback)
+{
+ int ret = 0;
+ uint32_t count, i;
+ struct fw_cfg_file f;
+
+ mutex_lock(&fw_cfg_dev_lock);
+ iowrite16(fw_cfg_sel_endianness(FW_CFG_FILE_DIR), fw_cfg_dev_ctrl);
+ ioread8_rep(fw_cfg_dev_data, &count, sizeof(count));
+ for (i = 0; i < be32_to_cpu(count); i++) {
+ ioread8_rep(fw_cfg_dev_data, &f, sizeof(f));
+ ret = callback(&f);
+ if (ret)
+ break;
+ }
+ mutex_unlock(&fw_cfg_dev_lock);
+ return ret;
+}
+
+/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
+static inline void fw_cfg_read_blob(uint16_t select,
+ void *buf, loff_t pos, size_t count)
+{
+ mutex_lock(&fw_cfg_dev_lock);
+ iowrite16(fw_cfg_sel_endianness(select), fw_cfg_dev_ctrl);
+ while (pos-- > 0)
+ ioread8(fw_cfg_dev_data);
+ ioread8_rep(fw_cfg_dev_data, buf, count);
+ mutex_unlock(&fw_cfg_dev_lock);
+}
+
+/* clean up fw_cfg device i/o setup */
+static void fw_cfg_io_cleanup(void)
+{
+ if (fw_cfg_mode->is_mmio) {
+ iounmap(fw_cfg_dev_base);
+ release_mem_region(fw_cfg_mode->start, fw_cfg_mode->size);
+ } else {
+ ioport_unmap(fw_cfg_dev_base);
+ release_region(fw_cfg_mode->start, fw_cfg_mode->size);
+ }
+}
+
+/* probe and map fw_cfg device */
+static int __init fw_cfg_io_probe(void)
+{
+ char sig[FW_CFG_SIG_SIZE];
+
+ for (fw_cfg_mode = &fw_cfg_modes[0];
+ fw_cfg_mode->start; fw_cfg_mode++) {
+
+ phys_addr_t start = fw_cfg_mode->start;
+ uint8_t size = fw_cfg_mode->size;
+
+ /* reserve and map mmio or ioport region */
+ if (fw_cfg_mode->is_mmio) {
+ if (!request_mem_region(start, size, fw_cfg_mode->name))
+ continue;
+ fw_cfg_dev_base = ioremap(start, size);
+ if (!fw_cfg_dev_base) {
+ release_mem_region(start, size);
+ continue;
+ }
+ } else {
+ if (!request_region(start, size, fw_cfg_mode->name))
+ continue;
+ fw_cfg_dev_base = ioport_map(start, size);
+ if (!fw_cfg_dev_base) {
+ release_region(start, size);
+ continue;
+ }
+ }
+
+ /* set control and data register addresses */
+ fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset;
+ fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset;
+
+ /* verify fw_cfg device signature */
+ fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
+ if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0)
+ /* success, we're done */
+ return 0;
+
+ /* clean up before probing next access mode */
+ fw_cfg_io_cleanup();
+ }
+
+ return -ENODEV;
+}
+
+/* fw_cfg revision attribute, placed in top-level /sys/fw_cfg directory */
+static uint32_t fw_cfg_rev;
+
+static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
+{
+ return sprintf(buf, "%u\n", fw_cfg_rev);
+}
+
+static const struct {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
+} fw_cfg_rev_attr = {
+ .attr = { .name = "rev", .mode = S_IRUSR },
+ .show = fw_cfg_showrev,
+};
+
+/* fw_cfg_sysfs_entry type */
+struct fw_cfg_sysfs_entry {
+ struct kobject kobj;
+ struct fw_cfg_file f;
+ struct list_head list;
+};
+
+/* get fw_cfg_sysfs_entry from kobject member */
+static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
+{
+ return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
+}
+
+/* fw_cfg_sysfs_attribute type */
+struct fw_cfg_sysfs_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
+};
+
+/* get fw_cfg_sysfs_attribute from attribute member */
+static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
+{
+ return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
+}
+
+/* global cache of fw_cfg_sysfs_entry objects */
+static LIST_HEAD(fw_cfg_entry_cache);
+
+/* kobjects removed lazily by kernel, mutual exclusion needed */
+static DEFINE_SPINLOCK(fw_cfg_cache_lock);
+
+static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
+{
+ spin_lock(&fw_cfg_cache_lock);
+ list_add_tail(&entry->list, &fw_cfg_entry_cache);
+ spin_unlock(&fw_cfg_cache_lock);
+}
+
+static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
+{
+ spin_lock(&fw_cfg_cache_lock);
+ list_del(&entry->list);
+ spin_unlock(&fw_cfg_cache_lock);
+}
+
+static void fw_cfg_sysfs_cache_cleanup(void)
+{
+ struct fw_cfg_sysfs_entry *entry, *next;
+
+ list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
+ /* will end up invoking fw_cfg_sysfs_cache_delist()
+ * via each object's release() method (i.e. destructor) */
+ kobject_put(&entry->kobj);
+ }
+}
+
+/* default_attrs: per-entry attributes and show methods */
+
+#define FW_CFG_SYSFS_ATTR(_attr) \
+struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
+ .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
+ .show = fw_cfg_sysfs_show_##_attr, \
+}
+
+static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%u\n", e->f.size);
+}
+
+static ssize_t fw_cfg_sysfs_show_select(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%u\n", e->f.select);
+}
+
+static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%s\n", e->f.name);
+}
+
+static FW_CFG_SYSFS_ATTR(size);
+static FW_CFG_SYSFS_ATTR(select);
+static FW_CFG_SYSFS_ATTR(name);
+
+static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
+ &fw_cfg_sysfs_attr_size.attr,
+ &fw_cfg_sysfs_attr_select.attr,
+ &fw_cfg_sysfs_attr_name.attr,
+ NULL,
+};
+
+/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
+static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
+ char *buf)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+ struct fw_cfg_sysfs_attribute *attr = to_attr(a);
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
+ return attr->show(entry, buf);
+}
+
+static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
+ .show = fw_cfg_sysfs_attr_show,
+};
+
+/* release: destructor, to be called via kobject_put() */
+static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+
+ fw_cfg_sysfs_cache_delist(entry);
+ kfree(entry);
+}
+
+/* kobj_type: ties together all properties required to register an entry */
+static struct kobj_type fw_cfg_sysfs_entry_ktype = {
+ .default_attrs = fw_cfg_sysfs_entry_attrs,
+ .sysfs_ops = &fw_cfg_sysfs_attr_ops,
+ .release = fw_cfg_sysfs_release_entry,
+};
+
+/* raw-read method and attribute */
+static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t pos, size_t count)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
+ if (pos > entry->f.size)
+ return -EINVAL;
+
+ if (count > entry->f.size - pos)
+ count = entry->f.size - pos;
+
+ fw_cfg_read_blob(entry->f.select, buf, pos, count);
+ return count;
+}
+
+static struct bin_attribute fw_cfg_sysfs_attr_raw = {
+ .attr = { .name = "raw", .mode = 0400 },
+ .read = fw_cfg_sysfs_read_raw,
+};
+
+/* kobjects & kset representing top-level, by_select, and by_name folders */
+static struct kobject *fw_cfg_top_ko;
+static struct kobject *fw_cfg_sel_ko;
+
+/* callback function to register an individual fw_cfg file */
+static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
+{
+ int err;
+ struct fw_cfg_sysfs_entry *entry;
+
+ /* allocate new entry */
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return -ENOMEM;
+
+ /* set file entry information */
+ entry->f.size = be32_to_cpu(f->size);
+ entry->f.select = be16_to_cpu(f->select);
+ strcpy(entry->f.name, f->name);
+
+ /* register entry under "/sys/firmware/fw_cfg/by_select/" */
+ err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
+ fw_cfg_sel_ko, "%d", entry->f.select);
+ if (err)
+ goto err_register;
+
+ /* add raw binary content access */
+ err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
+ if (err)
+ goto err_add_raw;
+
+ /* success, add entry to global cache */
+ fw_cfg_sysfs_cache_enlist(entry);
+ return 0;
+
+err_add_raw:
+ kobject_del(&entry->kobj);
+err_register:
+ kfree(entry);
+ return err;
+}
+
+/* unregister top-level or by_select folder */
+static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
+{
+ kobject_del(kobj);
+ kobject_put(kobj);
+}
+
+static int __init fw_cfg_sysfs_init(void)
+{
+ int err;
+
+ /* probe for the fw_cfg "hardware" */
+ err = fw_cfg_io_probe();
+ if (err)
+ return err;
+
+ /* create /sys/firmware/fwcfg/ and its subdirectories */
+ err = -ENOMEM;
+ fw_cfg_top_ko = kobject_create_and_add("fw_cfg", firmware_kobj);
+ if (!fw_cfg_top_ko)
+ goto err_top;
+ fw_cfg_sel_ko = kobject_create_and_add("by_select", fw_cfg_top_ko);
+ if (!fw_cfg_sel_ko)
+ goto err_sel;
+
+ /* get revision number, add matching top-level attribute */
+ fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
+ fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
+ err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
+ if (err)
+ goto err_rev;
+
+ /* process fw_cfg file directory entry, registering each file */
+ err = fw_cfg_scan_dir(fw_cfg_register_file);
+ if (err)
+ goto err_scan;
+
+ /* success */
+ pr_debug("fw_cfg: loaded.\n");
+ return 0;
+
+err_scan:
+ fw_cfg_sysfs_cache_cleanup();
+ sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
+err_rev:
+ fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
+err_sel:
+ fw_cfg_kobj_cleanup(fw_cfg_top_ko);
+err_top:
+ fw_cfg_io_cleanup();
+ return err;
+}
+
+static void __exit fw_cfg_sysfs_exit(void)
+{
+ pr_debug("fw_cfg: unloading.\n");
+ fw_cfg_sysfs_cache_cleanup();
+ fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
+ fw_cfg_kobj_cleanup(fw_cfg_top_ko);
+ fw_cfg_io_cleanup();
+}
+
+module_init(fw_cfg_sysfs_init);
+module_exit(fw_cfg_sysfs_exit);
+MODULE_AUTHOR("Gabriel L. Somlo <[email protected]>");
+MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
+MODULE_LICENSE("GPL");
--
2.4.3

2015-08-10 17:15:49

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH 2/3] kobject: export kset_find_obj() to be used from modules

From: "Gabriel Somlo" <[email protected]>

Signed-off-by: Gabriel Somlo <[email protected]>
---
lib/kobject.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/lib/kobject.c b/lib/kobject.c
index 3e3a5c3..f9754a0 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -1058,3 +1058,4 @@ EXPORT_SYMBOL(kobject_del);

EXPORT_SYMBOL(kset_register);
EXPORT_SYMBOL(kset_unregister);
+EXPORT_SYMBOL(kset_find_obj);
--
2.4.3

2015-08-10 17:15:56

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH 3/3] firmware: fw_cfg: create directory hierarchy for fw_cfg file names

From: "Gabriel Somlo" <[email protected]>

Each fw_cfg entry of type "file" has an associated 56-char,
nul-terminated ASCII string which represents its name. While
the fw_cfg device doesn't itself impose any specific naming
convention, QEMU developers have traditionally used path name
semantics (i.e. "etc/acpi/rsdp") to descriptively name the
various fw_cfg "blobs" passed into the guest.

This patch attempts, on a best effort basis, to create a
directory hierarchy representing the content of fw_cfg file
names, under /sys/firmware/fw_cfg/by_name.

Upon successful creation of all directories representing the
"dirname" portion of a fw_cfg file, a symlink will be created
to represent the "basename", pointing at the appropriate
/sys/firmware/fw_cfg/by_select entry. If a file name is not
suitable for this procedure (e.g., if its basename or dirname
components collide with an already existing dirname component
or basename, respectively) the corresponding fw_cfg blob is
skipped and will remain available in sysfs only by its selector
key value.

Signed-off-by: Gabriel Somlo <[email protected]>
---
Documentation/ABI/testing/sysfs-firmware-fw_cfg | 42 ++++++++++
drivers/firmware/fw_cfg.c | 104 ++++++++++++++++++++++++
2 files changed, 146 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-firmware-fw_cfg b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
index 3a7e7f2..1024776 100644
--- a/Documentation/ABI/testing/sysfs-firmware-fw_cfg
+++ b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
@@ -167,3 +167,45 @@ Description:
entry via the control register, and reading a number
of bytes equal to the blob size from the data
register.
+
+ --- Listing fw_cfg blobs by file name ---
+
+ While the fw_cfg device does not impose any specific naming
+ convention on the blobs registered in the file directory,
+ QEMU developers have traditionally used path name semantics
+ to give each blob a descriptive name. For example:
+
+ "bootorder"
+ "genroms/kvmvapic.bin"
+ "etc/e820"
+ "etc/boot-fail-wait"
+ "etc/system-states"
+ "etc/table-loader"
+ "etc/acpi/rsdp"
+ "etc/acpi/tables"
+ "etc/smbios/smbios-tables"
+ "etc/smbios/smbios-anchor"
+ ...
+
+ In addition to the listing by unique selector key described
+ above, the fw_cfg sysfs driver also attempts to build a tree
+ of directories matching the path name components of fw_cfg
+ blob names, ending in symlinks to the by_select entry for each
+ "basename", as illustrated below (assume current directory is
+ /sys/firmware):
+
+ fw_cfg/by_name/bootorder -> ../by_select/38
+ fw_cfg/by_name/etc/e820 -> ../../by_select/35
+ fw_cfg/by_name/etc/acpi/rsdp -> ../../../by_select/41
+ ...
+
+ Construction of the directory tree and symlinks is done on a
+ "best-effort" basis, as there is no guarantee that components
+ of fw_cfg blob names are always "well behaved". I.e., there is
+ the possibility that a symlink (basename) will conflict with
+ a dirname component of another fw_cfg blob, in which case the
+ creation of the offending /sys/firmware/fw_cfg/by_name entry
+ will be skipped.
+
+ The authoritative list of entries will continue to be found
+ under the /sys/firmware/fw_cfg/by_select directory.
diff --git a/drivers/firmware/fw_cfg.c b/drivers/firmware/fw_cfg.c
index be17411..94233a5 100644
--- a/drivers/firmware/fw_cfg.c
+++ b/drivers/firmware/fw_cfg.c
@@ -327,9 +327,104 @@ static struct bin_attribute fw_cfg_sysfs_attr_raw = {
.read = fw_cfg_sysfs_read_raw,
};

+/*
+ * Create a kset subdirectory matching each '/' delimited dirname token
+ * in 'name', starting with sysfs kset/folder 'dir'; At the end, create
+ * a symlink directed at the given 'target'.
+ * NOTE: We do this on a best-effort basis, since 'name' is not guaranteed
+ * to be a well-behaved path name. Whenever a symlink vs. kset directory
+ * name collision occurs, the kernel will issue big scary warnings while
+ * refusing to add the offending link or directory. We follow up with our
+ * own, slightly less scary error messages explaining the situation :)
+ */
+static int __init fw_cfg_build_symlink(struct kset *dir,
+ struct kobject *target,
+ const char *name)
+{
+ int ret;
+ struct kset *subdir;
+ struct kobject *ko;
+ char *name_copy, *p, *tok;
+
+ if (!dir || !target || !name || !*name)
+ return -EINVAL;
+
+ /* clone a copy of name for parsing */
+ name_copy = p = kstrdup(name, GFP_KERNEL);
+ if (!name_copy)
+ return -ENOMEM;
+
+ /* create folders for each dirname token, then symlink for basename */
+ while ((tok = strsep(&p, "/")) && *tok) {
+
+ /* last (basename) token? If so, add symlink here */
+ if (!p || !*p) {
+ ret = sysfs_create_link(&dir->kobj, target, tok);
+ break;
+ }
+
+ /* does the current dir contain an item named after tok ? */
+ ko = kset_find_obj(dir, tok);
+ if (ko) {
+ /* drop reference added by kset_find_obj */
+ kobject_put(ko);
+
+ /* ko MUST be a kset - we're about to use it as one ! */
+ if (ko->ktype != dir->kobj.ktype) {
+ ret = -EINVAL;
+ break;
+ }
+
+ /* descend into already existing subdirectory */
+ dir = to_kset(ko);
+ } else {
+ /* create new subdirectory kset */
+ subdir = kzalloc(sizeof(struct kset), GFP_KERNEL);
+ if (!subdir) {
+ ret = -ENOMEM;
+ break;
+ }
+ subdir->kobj.kset = dir;
+ subdir->kobj.ktype = dir->kobj.ktype;
+ ret = kobject_set_name(&subdir->kobj, "%s", tok);
+ if (ret) {
+ kfree(subdir);
+ break;
+ }
+ ret = kset_register(subdir);
+ if (ret) {
+ kfree(subdir);
+ break;
+ }
+
+ /* descend into newly created subdirectory */
+ dir = subdir;
+ }
+ }
+
+ /* we're done with cloned copy of name */
+ kfree(name_copy);
+ return ret;
+}
+
+/* recursively unregister fw_cfg/by_name/ kset directory tree */
+static void fw_cfg_kset_unregister_recursive(struct kset *kset)
+{
+ struct kobject *k, *next;
+
+ list_for_each_entry_safe(k, next, &kset->list, entry)
+ /* all set members are ksets too, but check just in case... */
+ if (k->ktype == kset->kobj.ktype)
+ fw_cfg_kset_unregister_recursive(to_kset(k));
+
+ /* symlinks are cleanly and automatically removed with the directory */
+ kset_unregister(kset);
+}
+
/* kobjects & kset representing top-level, by_select, and by_name folders */
static struct kobject *fw_cfg_top_ko;
static struct kobject *fw_cfg_sel_ko;
+static struct kset *fw_cfg_fname_kset;

/* callback function to register an individual fw_cfg file */
static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
@@ -358,6 +453,9 @@ static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
if (err)
goto err_add_raw;

+ /* try adding "/sys/firmware/fw_cfg/by_name/" symlink (best-effort) */
+ fw_cfg_build_symlink(fw_cfg_fname_kset, &entry->kobj, entry->f.name);
+
/* success, add entry to global cache */
fw_cfg_sysfs_cache_enlist(entry);
return 0;
@@ -393,6 +491,9 @@ static int __init fw_cfg_sysfs_init(void)
fw_cfg_sel_ko = kobject_create_and_add("by_select", fw_cfg_top_ko);
if (!fw_cfg_sel_ko)
goto err_sel;
+ fw_cfg_fname_kset = kset_create_and_add("by_name", NULL, fw_cfg_top_ko);
+ if (!fw_cfg_fname_kset)
+ goto err_name;

/* get revision number, add matching top-level attribute */
fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
@@ -414,6 +515,8 @@ err_scan:
fw_cfg_sysfs_cache_cleanup();
sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
err_rev:
+ fw_cfg_kset_unregister_recursive(fw_cfg_fname_kset);
+err_name:
fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
err_sel:
fw_cfg_kobj_cleanup(fw_cfg_top_ko);
@@ -426,6 +529,7 @@ static void __exit fw_cfg_sysfs_exit(void)
{
pr_debug("fw_cfg: unloading.\n");
fw_cfg_sysfs_cache_cleanup();
+ fw_cfg_kset_unregister_recursive(fw_cfg_fname_kset);
fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
fw_cfg_kobj_cleanup(fw_cfg_top_ko);
fw_cfg_io_cleanup();
--
2.4.3

2015-08-10 18:30:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/3] firmware: introduce sysfs driver for QEMU's fw_cfg device

On Mon, Aug 10, 2015 at 12:31:18PM -0400, Gabriel L. Somlo wrote:
> From: "Gabriel Somlo" <[email protected]>
>
> Make fw_cfg entries of type "file" available via sysfs. Entries
> are listed under /sys/firmware/fw_cfg/by_select, in folders named
> after each entry's selector key. Filename, selector value, and
> size read-only attributes are included for each entry. Also, a
> "raw" attribute allows retrieval of the full binary content of
> each entry.
>
> This patch also provides a documentation file outlining the
> guest-side "hardware" interface exposed by the QEMU fw_cfg device.
>
> Signed-off-by: Gabriel Somlo <[email protected]>
> ---
> Documentation/ABI/testing/sysfs-firmware-fw_cfg | 169 +++++++++
> drivers/firmware/Kconfig | 10 +
> drivers/firmware/Makefile | 1 +
> drivers/firmware/fw_cfg.c | 438 ++++++++++++++++++++++++
> 4 files changed, 618 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-firmware-fw_cfg
> create mode 100644 drivers/firmware/fw_cfg.c
>
> diff --git a/Documentation/ABI/testing/sysfs-firmware-fw_cfg b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
> new file mode 100644
> index 0000000..3a7e7f2
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
> @@ -0,0 +1,169 @@
> +What: /sys/firmware/fw_cfg/

/sys/firmware/qemu_fw/ ?

"fw_cfg" is very vague and not descriptive at all.


> +Date: August 2015
> +Contact: Gabriel Somlo <[email protected]>
> +Description:
> + Several different architectures supported by QEMU (x86, arm,
> + sun4*, ppc/mac) are provisioned with a firmware configuration
> + (fw_cfg) device, used by the host to provide configuration data
> + to the starting guest. While most of this data is meant for use
> + by the guest BIOS, starting with QEMU v2.4, guest VMs may be
> + started with arbitrary fw_cfg entries supplied directly on the
> + command line, which therefore may be of interest to userspace.
> +
> + === Guest-side Hardware Interface ===
> +
> + The fw_cfg device is available to guest VMs as a pair (control
> + and data) of registers, accessible as either a IO ports or as
> + MMIO addresses, depending on the architecture.
> +
> + --- Control Register ---
> +
> + Width: 16-bit
> + Access: Write-Only
> + Endianness: LE (if IOport) or BE (if MMIO)
> +
> + A write to the control register selects the index for one of
> + the firmware configuration items (or "blobs") available on the
> + fw_cfg device, which can subsequently be read from the data
> + register.
> +
> + Each time the control register is written, an data offset
> + internal to the fw_cfg device will be set to zero. This data
> + offset impacts which portion of the selected fw_cfg blob is
> + accessed by reading the data register, as explained below.
> +
> + --- Data Register ---
> +
> + Width: 8-bit (if IOport), or 8/16/32/64-bit (if MMIO)
> + Access: Read-Only
> + Endianness: string preserving
> +
> + The data register allows access to an array of bytes which
> + represent the fw_cfg blob last selected by a write to the
> + control register.
> +
> + Immediately following a write to the control register, the data
> + offset will be set to zero. Each successful read access to the
> + data register will increment the data offset by the appropriate
> + access width.
> +
> + Each fw_cfg blob has a maximum associated data length. Once the
> + data offset exceeds this maximum length, any subsequent reads
> + via the data register will return 0x00.
> +
> + An N-byte wide read of the data register will return the next
> + available N bytes of the selected fw_cfg blob, as a substring,
> + in increasing address order, similar to memcpy(), zero-padded
> + if necessary should the maximum data length of the selected
> + item be reached, as described above.
> +
> + --- Per-arch Register Details ---
> +
> + -------------------------------------------------------------
> + arch access base ctrl ctrl data data
> + mode address offset endian offset width
> + max.
> + -------------------------------------------------------------
> + x86 IOport 0x510 0 LE 1 8
> + x86_64 IOport 0x510 0 LE 1 8
> + arm MMIO 0x9020000 8 BE 0 64
> + sun4u IOport 0x510 0 LE 1 8
> + sun4m MMIO 0xd00000510 0 BE 2 8
> + ppc/mac MMIO 0xf0000510 0 BE 2 8
> + -------------------------------------------------------------
> +
> + NOTE: On platforms where the fw_cfg registers are exposed as
> + IO ports, the data port number will always be one greater than
> + the port number of the control register. I.e., the two ports
> + are overlapping, and can not be mapped separately.
> +
> + === Firmware Configuration Items of Interest ===
> +
> + Originally, the index key, size, and formatting of blobs in
> + fw_cfg was hard coded by mutual agreement between QEMU on the
> + host side, and the BIOS running on the guest. Later on, a file
> + transfer interface was added: by reading a special blob, the
> + fw_cfg consumer can retrieve a list of records containing the
> + name, selector key, and size of further fw_cfg blobs made
> + available by the host. Below we describe three fw_cfg blobs
> + of interest to the sysfs driver.
> +
> + --- Signature (Key 0x0000, FW_CFG_SIGNATURE) ---
> +
> + The presence of the fw_cfg device can be verified by selecting
> + the signature blob by writing 0x0000 to the control register,
> + and reading four bytes from the data register. If the fw_cfg
> + device is present, the four bytes read will match the ASCII
> + characters "QEMU".

Why is this a binary sysfs file? It really sounds like you want a char
device, so you can do ioctl commands on it, right?


> +
> + --- Revision (Key 0x0001, FW_CFG_ID) ---
> +
> + A 32-bit little-endian unsigned integer, this item is used as
> + an interface revision number.
> +
> + --- File Directory (Key 0x0019, FW_CFG_FILE_DIR) ---
> +
> + Any fw_cfg blobs stored at key 0x0020 FW_CFG_FILE_FIRST() or
> + higher will have an associated entry in this "directory" blob,
> + which facilitates the discovery of available items by software
> + (e.g. BIOS) running on the guest. The format of the directory
> + blob is shown below.
> +
> + NOTE: All integers are stored in big-endian format!
> +
> + /* the entire file directory "blob" */
> + struct FWCfgFiles {
> + uint32_t count; /* total number of entries */
> + struct FWCfgFile f[]; /* entry array, see below */
> + };
> +
> + /* an individual directory entry, 64 bytes total */
> + struct FWCfgFile {
> + uint32_t size; /* size of referenced blob */
> + uint16_t select; /* blob selector key */
> + uint16_t reserved;
> + char name[56]; /* blob name, nul-term. ascii */
> + };
> +
> + === SysFS fw_cfg Interface ===
> +
> + The fw_cfg sysfs interface described in this document is only
> + intended to display discoverable blobs (i.e., those registered
> + with the file directory), as there is no way to determine the
> + presence or size of "legacy" blobs (with selector keys between
> + 0x0002 and 0x0018) programmatically.
> +
> + All fw_cfg information is shown under:
> +
> + /sys/firmware/fw_cfg/
> +
> + The only legacy blob displayed is the fw_cfg device revision:
> +
> + /sys/firmware/fw_cfg/rev
> +
> + --- Discoverable fw_cfg blobs by selector key ---
> +
> + All discoverable blobs listed in the fw_cfg file directory are
> + displayed as entries named after their unique selector key
> + value, e.g.:
> +
> + /sys/firmware/fw_cfg/by_select/32
> + /sys/firmware/fw_cfg/by_select/33
> + /sys/firmware/fw_cfg/by_select/34
> + ...
> +
> + Each such fw_cfg sysfs entry has the following values exported
> + as attributes:
> +
> + name : The 56-byte nul-terminated ASCII string used as the
> + blob's 'file name' in the fw_cfg directory.
> + size : The length of the blob, as given in the fw_cfg
> + directory.
> + select : The value of the blob's selector key as given in the
> + fw_cfg directory. This value is the same as used in
> + the parent directory name.
> + how the rest of the entry should be interpreted.
> + raw : The raw bytes of the blob, obtained by selecting the
> + entry via the control register, and reading a number
> + of bytes equal to the blob size from the data
> + register.
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 99c69a3..f5cbe81 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -136,6 +136,16 @@ config QCOM_SCM
> bool
> depends on ARM || ARM64
>
> +config FW_CFG_SYSFS
> + tristate "QEMU fw_cfg device support in sysfs"
> + depends on SYSFS
> + default n
> + help
> + Say Y or M here to enable the exporting of the QEMU firmware
> + configuration (fw_cfg) file entries via sysfs. Entries are
> + found under /sys/firmware/fw_cfg when this option is enabled
> + and loaded.
> +
> source "drivers/firmware/broadcom/Kconfig"
> source "drivers/firmware/google/Kconfig"
> source "drivers/firmware/efi/Kconfig"
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 4a4b897..b81b46e 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> obj-$(CONFIG_QCOM_SCM) += qcom_scm-32.o
> +obj-$(CONFIG_FW_CFG_SYSFS) += fw_cfg.o
> CFLAGS_qcom_scm-32.o :=$(call as-instr,.arch_extension sec,-DREQUIRES_SEC=1)
>
> obj-y += broadcom/
> diff --git a/drivers/firmware/fw_cfg.c b/drivers/firmware/fw_cfg.c
> new file mode 100644
> index 0000000..be17411
> --- /dev/null
> +++ b/drivers/firmware/fw_cfg.c
> @@ -0,0 +1,438 @@
> +/*
> + * drivers/firmware/fw_cfg.c
> + *
> + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> + * sysfs (read-only, under "/sys/firmware/fw_cfg/...").
> + *
> + * Copyright 2015 Carnegie Mellon University
> + */
> +
> +#include <linux/module.h>
> +#include <linux/capability.h>
> +#include <linux/slab.h>
> +#include <linux/io.h>
> +#include <linux/ioport.h>
> +#include <linux/ctype.h>
> +
> +/* selector values for "well-known" fw_cfg entries */
> +#define FW_CFG_SIGNATURE 0x00
> +#define FW_CFG_ID 0x01
> +#define FW_CFG_FILE_DIR 0x19
> +
> +/* size in bytes of fw_cfg signature */
> +#define FW_CFG_SIG_SIZE 4
> +
> +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> +#define FW_CFG_MAX_FILE_PATH 56
> +
> +/* fw_cfg file directory entry type */
> +struct fw_cfg_file {
> + uint32_t size;
> + uint16_t select;
> + uint16_t reserved;

Those aren't valid kernel types, use u32, and u16 please.

> + char name[FW_CFG_MAX_FILE_PATH];
> +};
> +
> +/* fw_cfg device i/o access options type */
> +struct fw_cfg_access {
> + phys_addr_t start;
> + uint8_t size;
> + uint8_t ctrl_offset;
> + uint8_t data_offset;

u8.

> + bool is_mmio;
> + const char *name;
> +};
> +
> +/* fw_cfg device i/o access available options for known architectures */
> +static struct fw_cfg_access fw_cfg_modes[] = {
> + { 0x510, 2, 0, 1, false, "fw_cfg on i386, sun4u" },
> + { 0x9020000, 10, 8, 0, true, "fw_cfg on arm" },
> + { 0xd00000510, 3, 0, 2, true, "fw_cfg on sun4m" },
> + { 0xf0000510, 3, 0, 2, true, "fw_cfg on ppc/mac" },

named identifiers please.


> + { }
> +};
> +
> +/* fw_cfg device i/o currently selected option set */
> +static struct fw_cfg_access *fw_cfg_mode;
> +
> +/* fw_cfg device i/o register addresses */
> +static void __iomem *fw_cfg_dev_base;
> +static void __iomem *fw_cfg_dev_ctrl;
> +static void __iomem *fw_cfg_dev_data;
> +
> +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> +static DEFINE_MUTEX(fw_cfg_dev_lock);
> +
> +/* pick apropriate endianness for selector key */
> +static inline uint16_t fw_cfg_sel_endianness(uint16_t select)
> +{
> + return fw_cfg_mode->is_mmio ? cpu_to_be16(select) : cpu_to_le16(select);
> +}
> +
> +/* type for fw_cfg "directory scan" visitor/callback function */
> +typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f);
> +
> +/* run a given callback on each fw_cfg directory entry */
> +static int fw_cfg_scan_dir(fw_cfg_file_callback callback)
> +{
> + int ret = 0;
> + uint32_t count, i;

u32. Please remove all the *_t variable types. and i should be an int
here, right?

> + struct fw_cfg_file f;
> +
> + mutex_lock(&fw_cfg_dev_lock);
> + iowrite16(fw_cfg_sel_endianness(FW_CFG_FILE_DIR), fw_cfg_dev_ctrl);
> + ioread8_rep(fw_cfg_dev_data, &count, sizeof(count));
> + for (i = 0; i < be32_to_cpu(count); i++) {
> + ioread8_rep(fw_cfg_dev_data, &f, sizeof(f));
> + ret = callback(&f);
> + if (ret)
> + break;
> + }
> + mutex_unlock(&fw_cfg_dev_lock);
> + return ret;
> +}
> +
> +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> +static inline void fw_cfg_read_blob(uint16_t select,
> + void *buf, loff_t pos, size_t count)
> +{
> + mutex_lock(&fw_cfg_dev_lock);
> + iowrite16(fw_cfg_sel_endianness(select), fw_cfg_dev_ctrl);
> + while (pos-- > 0)
> + ioread8(fw_cfg_dev_data);
> + ioread8_rep(fw_cfg_dev_data, buf, count);
> + mutex_unlock(&fw_cfg_dev_lock);
> +}
> +
> +/* clean up fw_cfg device i/o setup */
> +static void fw_cfg_io_cleanup(void)
> +{
> + if (fw_cfg_mode->is_mmio) {
> + iounmap(fw_cfg_dev_base);
> + release_mem_region(fw_cfg_mode->start, fw_cfg_mode->size);
> + } else {
> + ioport_unmap(fw_cfg_dev_base);
> + release_region(fw_cfg_mode->start, fw_cfg_mode->size);
> + }
> +}
> +
> +/* probe and map fw_cfg device */
> +static int __init fw_cfg_io_probe(void)
> +{
> + char sig[FW_CFG_SIG_SIZE];
> +
> + for (fw_cfg_mode = &fw_cfg_modes[0];
> + fw_cfg_mode->start; fw_cfg_mode++) {
> +
> + phys_addr_t start = fw_cfg_mode->start;
> + uint8_t size = fw_cfg_mode->size;
> +
> + /* reserve and map mmio or ioport region */
> + if (fw_cfg_mode->is_mmio) {
> + if (!request_mem_region(start, size, fw_cfg_mode->name))
> + continue;
> + fw_cfg_dev_base = ioremap(start, size);
> + if (!fw_cfg_dev_base) {
> + release_mem_region(start, size);
> + continue;
> + }
> + } else {
> + if (!request_region(start, size, fw_cfg_mode->name))
> + continue;
> + fw_cfg_dev_base = ioport_map(start, size);
> + if (!fw_cfg_dev_base) {
> + release_region(start, size);
> + continue;
> + }
> + }
> +
> + /* set control and data register addresses */
> + fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset;
> + fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset;
> +
> + /* verify fw_cfg device signature */
> + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0)
> + /* success, we're done */
> + return 0;
> +
> + /* clean up before probing next access mode */
> + fw_cfg_io_cleanup();
> + }
> +
> + return -ENODEV;
> +}
> +
> +/* fw_cfg revision attribute, placed in top-level /sys/fw_cfg directory */
> +static uint32_t fw_cfg_rev;
> +
> +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> +{
> + return sprintf(buf, "%u\n", fw_cfg_rev);
> +}
> +
> +static const struct {
> + struct attribute attr;
> + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> +} fw_cfg_rev_attr = {
> + .attr = { .name = "rev", .mode = S_IRUSR },
> + .show = fw_cfg_showrev,
> +};
> +
> +/* fw_cfg_sysfs_entry type */
> +struct fw_cfg_sysfs_entry {
> + struct kobject kobj;
> + struct fw_cfg_file f;
> + struct list_head list;
> +};
> +
> +/* get fw_cfg_sysfs_entry from kobject member */
> +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> +{
> + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> +}
> +
> +/* fw_cfg_sysfs_attribute type */
> +struct fw_cfg_sysfs_attribute {
> + struct attribute attr;
> + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> +};
> +
> +/* get fw_cfg_sysfs_attribute from attribute member */
> +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> +{
> + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> +}
> +
> +/* global cache of fw_cfg_sysfs_entry objects */
> +static LIST_HEAD(fw_cfg_entry_cache);
> +
> +/* kobjects removed lazily by kernel, mutual exclusion needed */
> +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> +
> +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_del(&entry->list);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static void fw_cfg_sysfs_cache_cleanup(void)
> +{
> + struct fw_cfg_sysfs_entry *entry, *next;
> +
> + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> + /* will end up invoking fw_cfg_sysfs_cache_delist()
> + * via each object's release() method (i.e. destructor) */
> + kobject_put(&entry->kobj);
> + }
> +}
> +
> +/* default_attrs: per-entry attributes and show methods */
> +
> +#define FW_CFG_SYSFS_ATTR(_attr) \
> +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> + .show = fw_cfg_sysfs_show_##_attr, \
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.size);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_select(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.select);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%s\n", e->f.name);
> +}
> +
> +static FW_CFG_SYSFS_ATTR(size);
> +static FW_CFG_SYSFS_ATTR(select);
> +static FW_CFG_SYSFS_ATTR(name);
> +
> +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> + &fw_cfg_sysfs_attr_size.attr,
> + &fw_cfg_sysfs_attr_select.attr,
> + &fw_cfg_sysfs_attr_name.attr,
> + NULL,
> +};
> +
> +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> + char *buf)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EACCES;

Shouldn't the file permissions handle this properly for you?

> +
> + return attr->show(entry, buf);
> +}
> +
> +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> + .show = fw_cfg_sysfs_attr_show,
> +};
> +
> +/* release: destructor, to be called via kobject_put() */
> +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + fw_cfg_sysfs_cache_delist(entry);
> + kfree(entry);
> +}
> +
> +/* kobj_type: ties together all properties required to register an entry */
> +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> + .default_attrs = fw_cfg_sysfs_entry_attrs,
> + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> + .release = fw_cfg_sysfs_release_entry,
> +};
> +
> +/* raw-read method and attribute */
> +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr,
> + char *buf, loff_t pos, size_t count)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EACCES;

Again, file permissions?

> +
> + if (pos > entry->f.size)
> + return -EINVAL;
> +
> + if (count > entry->f.size - pos)
> + count = entry->f.size - pos;
> +
> + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> + return count;
> +}
> +
> +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> + .attr = { .name = "raw", .mode = 0400 },
> + .read = fw_cfg_sysfs_read_raw,
> +};
> +
> +/* kobjects & kset representing top-level, by_select, and by_name folders */
> +static struct kobject *fw_cfg_top_ko;
> +static struct kobject *fw_cfg_sel_ko;
> +
> +/* callback function to register an individual fw_cfg file */
> +static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
> +{
> + int err;
> + struct fw_cfg_sysfs_entry *entry;
> +
> + /* allocate new entry */
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry)
> + return -ENOMEM;
> +
> + /* set file entry information */
> + entry->f.size = be32_to_cpu(f->size);
> + entry->f.select = be16_to_cpu(f->select);
> + strcpy(entry->f.name, f->name);
> +
> + /* register entry under "/sys/firmware/fw_cfg/by_select/" */
> + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> + fw_cfg_sel_ko, "%d", entry->f.select);
> + if (err)
> + goto err_register;
> +
> + /* add raw binary content access */
> + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> + if (err)
> + goto err_add_raw;
> +
> + /* success, add entry to global cache */
> + fw_cfg_sysfs_cache_enlist(entry);
> + return 0;
> +
> +err_add_raw:
> + kobject_del(&entry->kobj);
> +err_register:
> + kfree(entry);
> + return err;
> +}
> +
> +/* unregister top-level or by_select folder */
> +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> +{
> + kobject_del(kobj);
> + kobject_put(kobj);
> +}
> +
> +static int __init fw_cfg_sysfs_init(void)
> +{
> + int err;
> +
> + /* probe for the fw_cfg "hardware" */
> + err = fw_cfg_io_probe();
> + if (err)
> + return err;
> +
> + /* create /sys/firmware/fwcfg/ and its subdirectories */
> + err = -ENOMEM;
> + fw_cfg_top_ko = kobject_create_and_add("fw_cfg", firmware_kobj);
> + if (!fw_cfg_top_ko)
> + goto err_top;
> + fw_cfg_sel_ko = kobject_create_and_add("by_select", fw_cfg_top_ko);
> + if (!fw_cfg_sel_ko)
> + goto err_sel;
> +
> + /* get revision number, add matching top-level attribute */
> + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> + if (err)
> + goto err_rev;
> +
> + /* process fw_cfg file directory entry, registering each file */
> + err = fw_cfg_scan_dir(fw_cfg_register_file);

Dealing with all of these "raw" kobjecs makes me nervous. Why can't you
use 'struct device' for this instead, on the system bus?

thanks,

greg k-h

2015-08-10 18:32:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3/3] firmware: fw_cfg: create directory hierarchy for fw_cfg file names

On Mon, Aug 10, 2015 at 12:31:20PM -0400, Gabriel L. Somlo wrote:
> From: "Gabriel Somlo" <[email protected]>
>
> Each fw_cfg entry of type "file" has an associated 56-char,
> nul-terminated ASCII string which represents its name. While
> the fw_cfg device doesn't itself impose any specific naming
> convention, QEMU developers have traditionally used path name
> semantics (i.e. "etc/acpi/rsdp") to descriptively name the
> various fw_cfg "blobs" passed into the guest.
>
> This patch attempts, on a best effort basis, to create a
> directory hierarchy representing the content of fw_cfg file
> names, under /sys/firmware/fw_cfg/by_name.
>
> Upon successful creation of all directories representing the
> "dirname" portion of a fw_cfg file, a symlink will be created
> to represent the "basename", pointing at the appropriate
> /sys/firmware/fw_cfg/by_select entry. If a file name is not
> suitable for this procedure (e.g., if its basename or dirname
> components collide with an already existing dirname component
> or basename, respectively) the corresponding fw_cfg blob is
> skipped and will remain available in sysfs only by its selector
> key value.

Shouldn't all of this be done in userspace with the symlinks and all?
It seems like you are trying to duplicate the /dev/block/by-name and
such. Policy decisions like symlinks and naming should be done there,
in userspace, and not directly in sysfs if at all possible.

Again, why can't this be a bunch of character device nodes? It seems
like you want to access them that way (read/write, ioctl, etc.)

thanks,

greg k-h

2015-08-10 18:33:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/3] kobject: export kset_find_obj() to be used from modules

On Mon, Aug 10, 2015 at 12:31:19PM -0400, Gabriel L. Somlo wrote:
> From: "Gabriel Somlo" <[email protected]>
>
> Signed-off-by: Gabriel Somlo <[email protected]>
> ---
> lib/kobject.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/lib/kobject.c b/lib/kobject.c
> index 3e3a5c3..f9754a0 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -1058,3 +1058,4 @@ EXPORT_SYMBOL(kobject_del);
>
> EXPORT_SYMBOL(kset_register);
> EXPORT_SYMBOL(kset_unregister);
> +EXPORT_SYMBOL(kset_find_obj);

checkpatch doesn't like this for obvious reasons :)

2015-08-10 19:09:15

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH 2/3] kobject: export kset_find_obj() to be used from modules

On Mon, Aug 10, 2015 at 11:33:04AM -0700, Greg KH wrote:
> On Mon, Aug 10, 2015 at 12:31:19PM -0400, Gabriel L. Somlo wrote:
> > From: "Gabriel Somlo" <[email protected]>
> >
> > Signed-off-by: Gabriel Somlo <[email protected]>
> > ---
> > lib/kobject.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index 3e3a5c3..f9754a0 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -1058,3 +1058,4 @@ EXPORT_SYMBOL(kobject_del);
> >
> > EXPORT_SYMBOL(kset_register);
> > EXPORT_SYMBOL(kset_unregister);
> > +EXPORT_SYMBOL(kset_find_obj);
>
> checkpatch doesn't like this for obvious reasons :)

When I run checkpatch on this one, it comes back fine. Could you
please elaborate ?

[staging]$ scripts/checkpatch.pl
0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch
total: 0 errors, 0 warnings, 4 lines checked

0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch has no
obvious style problems and is ready for submission.


Thanks much,
--Gabriel

2015-08-10 18:54:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/3] kobject: export kset_find_obj() to be used from modules

On Mon, Aug 10, 2015 at 02:43:10PM -0400, Gabriel L. Somlo wrote:
> On Mon, Aug 10, 2015 at 11:33:04AM -0700, Greg KH wrote:
> > On Mon, Aug 10, 2015 at 12:31:19PM -0400, Gabriel L. Somlo wrote:
> > > From: "Gabriel Somlo" <[email protected]>
> > >
> > > Signed-off-by: Gabriel Somlo <[email protected]>
> > > ---
> > > lib/kobject.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > index 3e3a5c3..f9754a0 100644
> > > --- a/lib/kobject.c
> > > +++ b/lib/kobject.c
> > > @@ -1058,3 +1058,4 @@ EXPORT_SYMBOL(kobject_del);
> > >
> > > EXPORT_SYMBOL(kset_register);
> > > EXPORT_SYMBOL(kset_unregister);
> > > +EXPORT_SYMBOL(kset_find_obj);
> >
> > checkpatch doesn't like this for obvious reasons :)
>
> When I run checkpatch on this one, it comes back fine. Could you
> please elaborate ?
>
> [staging]$ scripts/checkpatch.pl
> 0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch
> total: 0 errors, 0 warnings, 4 lines checked
>
> 0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch has no
> obvious style problems and is ready for submission.

EXPORT_SYMBOL() should be below the function definition itself, not at
the bottom of the file. Odd that checkpatch doesn't catch this when
done in this manner, it will if you run it on the whole file after your
patch is applied (along with the existing problems.)

But this is a minor issue, see my other review first...

greg k-h

2015-08-10 19:05:53

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH 1/3] firmware: introduce sysfs driver for QEMU's fw_cfg device

On Mon, Aug 10, 2015 at 11:30:54AM -0700, Greg KH wrote:
> On Mon, Aug 10, 2015 at 12:31:18PM -0400, Gabriel L. Somlo wrote:
> > From: "Gabriel Somlo" <[email protected]>
> >
> > Make fw_cfg entries of type "file" available via sysfs. Entries
> > are listed under /sys/firmware/fw_cfg/by_select, in folders named
> > after each entry's selector key. Filename, selector value, and
> > size read-only attributes are included for each entry. Also, a
> > "raw" attribute allows retrieval of the full binary content of
> > each entry.
> >
> > This patch also provides a documentation file outlining the
> > guest-side "hardware" interface exposed by the QEMU fw_cfg device.
> >
> > Signed-off-by: Gabriel Somlo <[email protected]>
> > ---
> > Documentation/ABI/testing/sysfs-firmware-fw_cfg | 169 +++++++++
> > drivers/firmware/Kconfig | 10 +
> > drivers/firmware/Makefile | 1 +
> > drivers/firmware/fw_cfg.c | 438 ++++++++++++++++++++++++
> > 4 files changed, 618 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-firmware-fw_cfg
> > create mode 100644 drivers/firmware/fw_cfg.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-firmware-fw_cfg b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
> > new file mode 100644
> > index 0000000..3a7e7f2
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-firmware-fw_cfg
> > @@ -0,0 +1,169 @@
> > +What: /sys/firmware/fw_cfg/
>
> /sys/firmware/qemu_fw/ ?
>
> "fw_cfg" is very vague and not descriptive at all.

How about /sys/firmware/qemu_fw_cfg/, then ? In the QEMU universe,
"fw_cfg" is familiar. OTOH "qemu_fw" would not immediately ring a bell...

>
>
> > +Date: August 2015
> > +Contact: Gabriel Somlo <[email protected]>
> > +Description:
> > + Several different architectures supported by QEMU (x86, arm,
> > + sun4*, ppc/mac) are provisioned with a firmware configuration
> > + (fw_cfg) device, used by the host to provide configuration data
> > + to the starting guest. While most of this data is meant for use
> > + by the guest BIOS, starting with QEMU v2.4, guest VMs may be
> > + started with arbitrary fw_cfg entries supplied directly on the
> > + command line, which therefore may be of interest to userspace.
> > +
> > + === Guest-side Hardware Interface ===
> > +
> > + The fw_cfg device is available to guest VMs as a pair (control
> > + and data) of registers, accessible as either a IO ports or as
> > + MMIO addresses, depending on the architecture.
> > +
> > + --- Control Register ---
> > +
> > + Width: 16-bit
> > + Access: Write-Only
> > + Endianness: LE (if IOport) or BE (if MMIO)
> > +
> > + A write to the control register selects the index for one of
> > + the firmware configuration items (or "blobs") available on the
> > + fw_cfg device, which can subsequently be read from the data
> > + register.
> > +
> > + Each time the control register is written, an data offset
> > + internal to the fw_cfg device will be set to zero. This data
> > + offset impacts which portion of the selected fw_cfg blob is
> > + accessed by reading the data register, as explained below.
> > +
> > + --- Data Register ---
> > +
> > + Width: 8-bit (if IOport), or 8/16/32/64-bit (if MMIO)
> > + Access: Read-Only
> > + Endianness: string preserving
> > +
> > + The data register allows access to an array of bytes which
> > + represent the fw_cfg blob last selected by a write to the
> > + control register.
> > +
> > + Immediately following a write to the control register, the data
> > + offset will be set to zero. Each successful read access to the
> > + data register will increment the data offset by the appropriate
> > + access width.
> > +
> > + Each fw_cfg blob has a maximum associated data length. Once the
> > + data offset exceeds this maximum length, any subsequent reads
> > + via the data register will return 0x00.
> > +
> > + An N-byte wide read of the data register will return the next
> > + available N bytes of the selected fw_cfg blob, as a substring,
> > + in increasing address order, similar to memcpy(), zero-padded
> > + if necessary should the maximum data length of the selected
> > + item be reached, as described above.
> > +
> > + --- Per-arch Register Details ---
> > +
> > + -------------------------------------------------------------
> > + arch access base ctrl ctrl data data
> > + mode address offset endian offset width
> > + max.
> > + -------------------------------------------------------------
> > + x86 IOport 0x510 0 LE 1 8
> > + x86_64 IOport 0x510 0 LE 1 8
> > + arm MMIO 0x9020000 8 BE 0 64
> > + sun4u IOport 0x510 0 LE 1 8
> > + sun4m MMIO 0xd00000510 0 BE 2 8
> > + ppc/mac MMIO 0xf0000510 0 BE 2 8
> > + -------------------------------------------------------------
> > +
> > + NOTE: On platforms where the fw_cfg registers are exposed as
> > + IO ports, the data port number will always be one greater than
> > + the port number of the control register. I.e., the two ports
> > + are overlapping, and can not be mapped separately.
> > +
> > + === Firmware Configuration Items of Interest ===
> > +
> > + Originally, the index key, size, and formatting of blobs in
> > + fw_cfg was hard coded by mutual agreement between QEMU on the
> > + host side, and the BIOS running on the guest. Later on, a file
> > + transfer interface was added: by reading a special blob, the
> > + fw_cfg consumer can retrieve a list of records containing the
> > + name, selector key, and size of further fw_cfg blobs made
> > + available by the host. Below we describe three fw_cfg blobs
> > + of interest to the sysfs driver.
> > +
> > + --- Signature (Key 0x0000, FW_CFG_SIGNATURE) ---
> > +
> > + The presence of the fw_cfg device can be verified by selecting
> > + the signature blob by writing 0x0000 to the control register,
> > + and reading four bytes from the data register. If the fw_cfg
> > + device is present, the four bytes read will match the ASCII
> > + characters "QEMU".
>
> Why is this a binary sysfs file? It really sounds like you want a char
> device, so you can do ioctl commands on it, right?

I felt it necessary to explain the most important fw_cfg blobs before
getting into the sysfs layout proper. This blob isn't displayed in
sysfs, just used by the module_init function to confirm we have, in
fact, a valid fw_cfg device present on the system.

>
>
> > +
> > + --- Revision (Key 0x0001, FW_CFG_ID) ---
> > +
> > + A 32-bit little-endian unsigned integer, this item is used as
> > + an interface revision number.
> > +
> > + --- File Directory (Key 0x0019, FW_CFG_FILE_DIR) ---
> > +
> > + Any fw_cfg blobs stored at key 0x0020 FW_CFG_FILE_FIRST() or
> > + higher will have an associated entry in this "directory" blob,
> > + which facilitates the discovery of available items by software
> > + (e.g. BIOS) running on the guest. The format of the directory
> > + blob is shown below.
> > +
> > + NOTE: All integers are stored in big-endian format!
> > +
> > + /* the entire file directory "blob" */
> > + struct FWCfgFiles {
> > + uint32_t count; /* total number of entries */
> > + struct FWCfgFile f[]; /* entry array, see below */
> > + };
> > +
> > + /* an individual directory entry, 64 bytes total */
> > + struct FWCfgFile {
> > + uint32_t size; /* size of referenced blob */
> > + uint16_t select; /* blob selector key */
> > + uint16_t reserved;
> > + char name[56]; /* blob name, nul-term. ascii */
> > + };
> > +
> > + === SysFS fw_cfg Interface ===
> > +
> > + The fw_cfg sysfs interface described in this document is only
> > + intended to display discoverable blobs (i.e., those registered
> > + with the file directory), as there is no way to determine the
> > + presence or size of "legacy" blobs (with selector keys between
> > + 0x0002 and 0x0018) programmatically.
> > +
> > + All fw_cfg information is shown under:
> > +
> > + /sys/firmware/fw_cfg/
> > +
> > + The only legacy blob displayed is the fw_cfg device revision:
> > +
> > + /sys/firmware/fw_cfg/rev
> > +
> > + --- Discoverable fw_cfg blobs by selector key ---
> > +
> > + All discoverable blobs listed in the fw_cfg file directory are
> > + displayed as entries named after their unique selector key
> > + value, e.g.:
> > +
> > + /sys/firmware/fw_cfg/by_select/32
> > + /sys/firmware/fw_cfg/by_select/33
> > + /sys/firmware/fw_cfg/by_select/34
> > + ...
> > +
> > + Each such fw_cfg sysfs entry has the following values exported
> > + as attributes:
> > +
> > + name : The 56-byte nul-terminated ASCII string used as the
> > + blob's 'file name' in the fw_cfg directory.
> > + size : The length of the blob, as given in the fw_cfg
> > + directory.
> > + select : The value of the blob's selector key as given in the
> > + fw_cfg directory. This value is the same as used in
> > + the parent directory name.
> > + how the rest of the entry should be interpreted.
> > + raw : The raw bytes of the blob, obtained by selecting the
> > + entry via the control register, and reading a number
> > + of bytes equal to the blob size from the data
> > + register.
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 99c69a3..f5cbe81 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -136,6 +136,16 @@ config QCOM_SCM
> > bool
> > depends on ARM || ARM64
> >
> > +config FW_CFG_SYSFS
> > + tristate "QEMU fw_cfg device support in sysfs"
> > + depends on SYSFS
> > + default n
> > + help
> > + Say Y or M here to enable the exporting of the QEMU firmware
> > + configuration (fw_cfg) file entries via sysfs. Entries are
> > + found under /sys/firmware/fw_cfg when this option is enabled
> > + and loaded.
> > +
> > source "drivers/firmware/broadcom/Kconfig"
> > source "drivers/firmware/google/Kconfig"
> > source "drivers/firmware/efi/Kconfig"
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 4a4b897..b81b46e 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -13,6 +13,7 @@ obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> > obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> > obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> > obj-$(CONFIG_QCOM_SCM) += qcom_scm-32.o
> > +obj-$(CONFIG_FW_CFG_SYSFS) += fw_cfg.o
> > CFLAGS_qcom_scm-32.o :=$(call as-instr,.arch_extension sec,-DREQUIRES_SEC=1)
> >
> > obj-y += broadcom/
> > diff --git a/drivers/firmware/fw_cfg.c b/drivers/firmware/fw_cfg.c
> > new file mode 100644
> > index 0000000..be17411
> > --- /dev/null
> > +++ b/drivers/firmware/fw_cfg.c
> > @@ -0,0 +1,438 @@
> > +/*
> > + * drivers/firmware/fw_cfg.c
> > + *
> > + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> > + * sysfs (read-only, under "/sys/firmware/fw_cfg/...").
> > + *
> > + * Copyright 2015 Carnegie Mellon University
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/capability.h>
> > +#include <linux/slab.h>
> > +#include <linux/io.h>
> > +#include <linux/ioport.h>
> > +#include <linux/ctype.h>
> > +
> > +/* selector values for "well-known" fw_cfg entries */
> > +#define FW_CFG_SIGNATURE 0x00
> > +#define FW_CFG_ID 0x01
> > +#define FW_CFG_FILE_DIR 0x19
> > +
> > +/* size in bytes of fw_cfg signature */
> > +#define FW_CFG_SIG_SIZE 4
> > +
> > +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> > +#define FW_CFG_MAX_FILE_PATH 56
> > +
> > +/* fw_cfg file directory entry type */
> > +struct fw_cfg_file {
> > + uint32_t size;
> > + uint16_t select;
> > + uint16_t reserved;
>
> Those aren't valid kernel types, use u32, and u16 please.
>
> > + char name[FW_CFG_MAX_FILE_PATH];
> > +};
> > +
> > +/* fw_cfg device i/o access options type */
> > +struct fw_cfg_access {
> > + phys_addr_t start;
> > + uint8_t size;
> > + uint8_t ctrl_offset;
> > + uint8_t data_offset;
>
> u8.

Got it, s/uintXX_t/uXX/g in v2 :)

>
> > + bool is_mmio;
> > + const char *name;
> > +};
> > +
> > +/* fw_cfg device i/o access available options for known architectures */
> > +static struct fw_cfg_access fw_cfg_modes[] = {
> > + { 0x510, 2, 0, 1, false, "fw_cfg on i386, sun4u" },
> > + { 0x9020000, 10, 8, 0, true, "fw_cfg on arm" },
> > + { 0xd00000510, 3, 0, 2, true, "fw_cfg on sun4m" },
> > + { 0xf0000510, 3, 0, 2, true, "fw_cfg on ppc/mac" },
>
> named identifiers please.

OK, will do.

>
>
> > + { }
> > +};
> > +
> > +/* fw_cfg device i/o currently selected option set */
> > +static struct fw_cfg_access *fw_cfg_mode;
> > +
> > +/* fw_cfg device i/o register addresses */
> > +static void __iomem *fw_cfg_dev_base;
> > +static void __iomem *fw_cfg_dev_ctrl;
> > +static void __iomem *fw_cfg_dev_data;
> > +
> > +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> > +static DEFINE_MUTEX(fw_cfg_dev_lock);
> > +
> > +/* pick apropriate endianness for selector key */
> > +static inline uint16_t fw_cfg_sel_endianness(uint16_t select)
> > +{
> > + return fw_cfg_mode->is_mmio ? cpu_to_be16(select) : cpu_to_le16(select);
> > +}
> > +
> > +/* type for fw_cfg "directory scan" visitor/callback function */
> > +typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f);
> > +
> > +/* run a given callback on each fw_cfg directory entry */
> > +static int fw_cfg_scan_dir(fw_cfg_file_callback callback)
> > +{
> > + int ret = 0;
> > + uint32_t count, i;
>
> u32. Please remove all the *_t variable types. and i should be an int
> here, right?

i goes from 0 up to count, where count is u32. So I figured i should
be u32 as well, but in practice int would work just as well (count is
highly unlikely to be large enough to go past INT_MAX). Let me know if
you still want me to make it an int.

>
> > + struct fw_cfg_file f;
> > +
> > + mutex_lock(&fw_cfg_dev_lock);
> > + iowrite16(fw_cfg_sel_endianness(FW_CFG_FILE_DIR), fw_cfg_dev_ctrl);
> > + ioread8_rep(fw_cfg_dev_data, &count, sizeof(count));
> > + for (i = 0; i < be32_to_cpu(count); i++) {
> > + ioread8_rep(fw_cfg_dev_data, &f, sizeof(f));
> > + ret = callback(&f);
> > + if (ret)
> > + break;
> > + }
> > + mutex_unlock(&fw_cfg_dev_lock);
> > + return ret;
> > +}
> > +
> > +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> > +static inline void fw_cfg_read_blob(uint16_t select,
> > + void *buf, loff_t pos, size_t count)
> > +{
> > + mutex_lock(&fw_cfg_dev_lock);
> > + iowrite16(fw_cfg_sel_endianness(select), fw_cfg_dev_ctrl);
> > + while (pos-- > 0)
> > + ioread8(fw_cfg_dev_data);
> > + ioread8_rep(fw_cfg_dev_data, buf, count);
> > + mutex_unlock(&fw_cfg_dev_lock);
> > +}
> > +
> > +/* clean up fw_cfg device i/o setup */
> > +static void fw_cfg_io_cleanup(void)
> > +{
> > + if (fw_cfg_mode->is_mmio) {
> > + iounmap(fw_cfg_dev_base);
> > + release_mem_region(fw_cfg_mode->start, fw_cfg_mode->size);
> > + } else {
> > + ioport_unmap(fw_cfg_dev_base);
> > + release_region(fw_cfg_mode->start, fw_cfg_mode->size);
> > + }
> > +}
> > +
> > +/* probe and map fw_cfg device */
> > +static int __init fw_cfg_io_probe(void)
> > +{
> > + char sig[FW_CFG_SIG_SIZE];
> > +
> > + for (fw_cfg_mode = &fw_cfg_modes[0];
> > + fw_cfg_mode->start; fw_cfg_mode++) {
> > +
> > + phys_addr_t start = fw_cfg_mode->start;
> > + uint8_t size = fw_cfg_mode->size;
> > +
> > + /* reserve and map mmio or ioport region */
> > + if (fw_cfg_mode->is_mmio) {
> > + if (!request_mem_region(start, size, fw_cfg_mode->name))
> > + continue;
> > + fw_cfg_dev_base = ioremap(start, size);
> > + if (!fw_cfg_dev_base) {
> > + release_mem_region(start, size);
> > + continue;
> > + }
> > + } else {
> > + if (!request_region(start, size, fw_cfg_mode->name))
> > + continue;
> > + fw_cfg_dev_base = ioport_map(start, size);
> > + if (!fw_cfg_dev_base) {
> > + release_region(start, size);
> > + continue;
> > + }
> > + }
> > +
> > + /* set control and data register addresses */
> > + fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset;
> > + fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset;
> > +
> > + /* verify fw_cfg device signature */
> > + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> > + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0)
> > + /* success, we're done */
> > + return 0;
> > +
> > + /* clean up before probing next access mode */
> > + fw_cfg_io_cleanup();
> > + }
> > +
> > + return -ENODEV;
> > +}
> > +
> > +/* fw_cfg revision attribute, placed in top-level /sys/fw_cfg directory */
> > +static uint32_t fw_cfg_rev;
> > +
> > +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", fw_cfg_rev);
> > +}
> > +
> > +static const struct {
> > + struct attribute attr;
> > + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> > +} fw_cfg_rev_attr = {
> > + .attr = { .name = "rev", .mode = S_IRUSR },
> > + .show = fw_cfg_showrev,
> > +};
> > +
> > +/* fw_cfg_sysfs_entry type */
> > +struct fw_cfg_sysfs_entry {
> > + struct kobject kobj;
> > + struct fw_cfg_file f;
> > + struct list_head list;
> > +};
> > +
> > +/* get fw_cfg_sysfs_entry from kobject member */
> > +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> > +{
> > + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> > +}
> > +
> > +/* fw_cfg_sysfs_attribute type */
> > +struct fw_cfg_sysfs_attribute {
> > + struct attribute attr;
> > + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> > +};
> > +
> > +/* get fw_cfg_sysfs_attribute from attribute member */
> > +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> > +{
> > + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> > +}
> > +
> > +/* global cache of fw_cfg_sysfs_entry objects */
> > +static LIST_HEAD(fw_cfg_entry_cache);
> > +
> > +/* kobjects removed lazily by kernel, mutual exclusion needed */
> > +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> > +
> > +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_del(&entry->list);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static void fw_cfg_sysfs_cache_cleanup(void)
> > +{
> > + struct fw_cfg_sysfs_entry *entry, *next;
> > +
> > + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> > + /* will end up invoking fw_cfg_sysfs_cache_delist()
> > + * via each object's release() method (i.e. destructor) */
> > + kobject_put(&entry->kobj);
> > + }
> > +}
> > +
> > +/* default_attrs: per-entry attributes and show methods */
> > +
> > +#define FW_CFG_SYSFS_ATTR(_attr) \
> > +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> > + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> > + .show = fw_cfg_sysfs_show_##_attr, \
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.size);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_select(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.select);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%s\n", e->f.name);
> > +}
> > +
> > +static FW_CFG_SYSFS_ATTR(size);
> > +static FW_CFG_SYSFS_ATTR(select);
> > +static FW_CFG_SYSFS_ATTR(name);
> > +
> > +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> > + &fw_cfg_sysfs_attr_size.attr,
> > + &fw_cfg_sysfs_attr_select.attr,
> > + &fw_cfg_sysfs_attr_name.attr,
> > + NULL,
> > +};
> > +
> > +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> > +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> > + char *buf)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> > +
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EACCES;
>
> Shouldn't the file permissions handle this properly for you?
>
> > +
> > + return attr->show(entry, buf);
> > +}
> > +
> > +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> > + .show = fw_cfg_sysfs_attr_show,
> > +};
> > +
> > +/* release: destructor, to be called via kobject_put() */
> > +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + fw_cfg_sysfs_cache_delist(entry);
> > + kfree(entry);
> > +}
> > +
> > +/* kobj_type: ties together all properties required to register an entry */
> > +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> > + .default_attrs = fw_cfg_sysfs_entry_attrs,
> > + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> > + .release = fw_cfg_sysfs_release_entry,
> > +};
> > +
> > +/* raw-read method and attribute */
> > +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> > + struct bin_attribute *bin_attr,
> > + char *buf, loff_t pos, size_t count)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EACCES;
>
> Again, file permissions?

In retrospect, you're right. I used another, existing sysfs driver as
inspiration, and this particular redundancy slipped through the cracks
during cleanup. I'll get rid of it for v2.

>
> > +
> > + if (pos > entry->f.size)
> > + return -EINVAL;
> > +
> > + if (count > entry->f.size - pos)
> > + count = entry->f.size - pos;
> > +
> > + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> > + return count;
> > +}
> > +
> > +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> > + .attr = { .name = "raw", .mode = 0400 },
> > + .read = fw_cfg_sysfs_read_raw,
> > +};
> > +
> > +/* kobjects & kset representing top-level, by_select, and by_name folders */
> > +static struct kobject *fw_cfg_top_ko;
> > +static struct kobject *fw_cfg_sel_ko;
> > +
> > +/* callback function to register an individual fw_cfg file */
> > +static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
> > +{
> > + int err;
> > + struct fw_cfg_sysfs_entry *entry;
> > +
> > + /* allocate new entry */
> > + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> > + if (!entry)
> > + return -ENOMEM;
> > +
> > + /* set file entry information */
> > + entry->f.size = be32_to_cpu(f->size);
> > + entry->f.select = be16_to_cpu(f->select);
> > + strcpy(entry->f.name, f->name);
> > +
> > + /* register entry under "/sys/firmware/fw_cfg/by_select/" */
> > + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> > + fw_cfg_sel_ko, "%d", entry->f.select);
> > + if (err)
> > + goto err_register;
> > +
> > + /* add raw binary content access */
> > + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> > + if (err)
> > + goto err_add_raw;
> > +
> > + /* success, add entry to global cache */
> > + fw_cfg_sysfs_cache_enlist(entry);
> > + return 0;
> > +
> > +err_add_raw:
> > + kobject_del(&entry->kobj);
> > +err_register:
> > + kfree(entry);
> > + return err;
> > +}
> > +
> > +/* unregister top-level or by_select folder */
> > +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> > +{
> > + kobject_del(kobj);
> > + kobject_put(kobj);
> > +}
> > +
> > +static int __init fw_cfg_sysfs_init(void)
> > +{
> > + int err;
> > +
> > + /* probe for the fw_cfg "hardware" */
> > + err = fw_cfg_io_probe();
> > + if (err)
> > + return err;
> > +
> > + /* create /sys/firmware/fwcfg/ and its subdirectories */
> > + err = -ENOMEM;
> > + fw_cfg_top_ko = kobject_create_and_add("fw_cfg", firmware_kobj);
> > + if (!fw_cfg_top_ko)
> > + goto err_top;
> > + fw_cfg_sel_ko = kobject_create_and_add("by_select", fw_cfg_top_ko);
> > + if (!fw_cfg_sel_ko)
> > + goto err_sel;
> > +
> > + /* get revision number, add matching top-level attribute */
> > + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> > + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> > + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > + if (err)
> > + goto err_rev;
> > +
> > + /* process fw_cfg file directory entry, registering each file */
> > + err = fw_cfg_scan_dir(fw_cfg_register_file);
>
> Dealing with all of these "raw" kobjecs makes me nervous. Why can't you
> use 'struct device' for this instead, on the system bus?

I'll have to read up on 'struct device' to figure out why I would or
wouldn't want to do that instead -- sorry, still a n00b trying to
drink from the firehose :)

Mainly, qemu's fw_cfg is a read-only "device", so one wouldn't ever
care to try writing anything to it. /sys/firmware/... feels like a fit
because fw_cfg contains binary blobs originally meant to be used by
the bios (it's how SeaBIOS and OVMF pull smbios and acpi tables out of
the host and set them up in guest memory before booting the guest kernel,
as one of the many examples).

The most similar example (and the existing driver I used as an example
during implementation) is dmi-sysfs.c, which also exposes all the
metadata for each smbios table as numeric (or string) read-only attributes,
and has a "raw" attribute which allows dumping each table in its
entirety. Same thing here -- I want to expose the name, size, and
select key for each blob, but also allow access to the "payload", i.e.
the blob itself.

Thanks much for the feedback !
--Gabriel

2015-08-10 19:08:09

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH 2/3] kobject: export kset_find_obj() to be used from modules

On Mon, Aug 10, 2015 at 11:54:00AM -0700, Greg KH wrote:
> On Mon, Aug 10, 2015 at 02:43:10PM -0400, Gabriel L. Somlo wrote:
> > On Mon, Aug 10, 2015 at 11:33:04AM -0700, Greg KH wrote:
> > > On Mon, Aug 10, 2015 at 12:31:19PM -0400, Gabriel L. Somlo wrote:
> > > > From: "Gabriel Somlo" <[email protected]>
> > > >
> > > > Signed-off-by: Gabriel Somlo <[email protected]>
> > > > ---
> > > > lib/kobject.c | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > index 3e3a5c3..f9754a0 100644
> > > > --- a/lib/kobject.c
> > > > +++ b/lib/kobject.c
> > > > @@ -1058,3 +1058,4 @@ EXPORT_SYMBOL(kobject_del);
> > > >
> > > > EXPORT_SYMBOL(kset_register);
> > > > EXPORT_SYMBOL(kset_unregister);
> > > > +EXPORT_SYMBOL(kset_find_obj);
> > >
> > > checkpatch doesn't like this for obvious reasons :)
> >
> > When I run checkpatch on this one, it comes back fine. Could you
> > please elaborate ?
> >
> > [staging]$ scripts/checkpatch.pl
> > 0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch
> > total: 0 errors, 0 warnings, 4 lines checked
> >
> > 0002-kobject-export-kset_find_obj-to-be-used-from-modules.patch has no
> > obvious style problems and is ready for submission.
>
> EXPORT_SYMBOL() should be below the function definition itself, not at
> the bottom of the file. Odd that checkpatch doesn't catch this when
> done in this manner, it will if you run it on the whole file after your
> patch is applied (along with the existing problems.)

Got it, fix lined up for v2

> But this is a minor issue, see my other review first...

Yeah, but it was the low hanging fruit, so I replied to it first ;)

Thanks much,
--Gabriel

2015-08-10 19:15:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/3] firmware: introduce sysfs driver for QEMU's fw_cfg device

On Mon, Aug 10, 2015 at 03:02:22PM -0400, Gabriel L. Somlo wrote:
> Mainly, qemu's fw_cfg is a read-only "device", so one wouldn't ever
> care to try writing anything to it. /sys/firmware/... feels like a fit
> because fw_cfg contains binary blobs originally meant to be used by
> the bios (it's how SeaBIOS and OVMF pull smbios and acpi tables out of
> the host and set them up in guest memory before booting the guest kernel,
> as one of the many examples).
>
> The most similar example (and the existing driver I used as an example
> during implementation) is dmi-sysfs.c, which also exposes all the
> metadata for each smbios table as numeric (or string) read-only attributes,
> and has a "raw" attribute which allows dumping each table in its
> entirety. Same thing here -- I want to expose the name, size, and
> select key for each blob, but also allow access to the "payload", i.e.
> the blob itself.

That's great, and you can have these "blobs" be an attribute for a
struct device. Just using a "raw" kobject as you are is hard, as you
have seen, and messy. Making it a "real" device makes this all much
easier and simpler.

Now if you want to keep things in /sys/firmware/ that's another issue,
and would have to stay as a kobject. so maybe it does need to remain,
need to think about that...

thanks,

greg k-h