2015-08-11 18:48:23

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

From: "Gabriel Somlo" <[email protected]>

This patch set makes QEMU fw_cfg blobs available for viewing (read-only)
via SysFS.

New since v1:

1/3:

- renamed sysfs path components:

s/fw_cfg/qemu_fw_cfg/g, at Greg's suggestion

s/by_select/by_key/g since it feels a bit more intuitive

- attribute "select" renamed to "key", along the same train of
thought.

- renamed actual source file from "fw_cfg.c" to "qemu_fw_cfg.c"
and doc file in Documentation/ABI/testing to
"sysfs-firmware-qemu_fw_cfg"

- s/uintXX_t/uXX/g, at Greg's suggestion

- using named identifiers to initialize access mode table
(static struct fw_cfg_access fw_cfg_modes[]) per Greg's
suggestion

- removed redundant "capable(CAP_SYS_ADMIN)" checks (per Greg's
feedback)

2/3:

- EXPORT_SYMBOL(kset_find_obj) statement moved immediately
after kset_find_obj() definition.

Re. 'struct device' vs. straight-up kobject:

I dug around a bit, and found a bunch of stuff under /sys/devices
(such as e.g. /sys/devices/virtual/dmi and /sys/devices/platform/applesmc)

The first one is particularly interesting, as it presents a
somewhat different view of the same internal information displayed by
/sys/firmware/dmi/, only /sys/devices/virtual/dmi looks more "cooked",
in that it exposes individual fields of the several different DMI tables.

None of the subfolders of /sys/devices/... appears to facilitate access
to the "raw blobs" contained by the respective devices represented there.

Even applesmc (the other device with which I'm somewhat familiar) does
a lot of "cooking" of the values held by the device.

So, at the end of the day, I want the information I expose about fw_cfg
to look more like /sys/firmware/dmi than like /sys/devices/virtual/dmi
in that I have a bunch of cached metadata attributes for each of the
blobs, and want to offer access to the raw blob data, and there's no
rhyme or reason to the actual internals of any of the blobs, so offering
a "cooked" view like in /sys/devices/... doesn't feel like the best fit...

Obviously, I may be totally wrong... :)

Lastly, re. Greg's comment on patch 3/3:
> Shouldn't all of this be done in userspace with the symlinks and all?
> It seems like you are trying to duplicate the /dev/block/by-name and
> such. Policy decisions like symlinks and naming should be done there,
> in userspace, and not directly in sysfs if at all possible.

OK, so here's the "pseudo-bash" equivalent of what this patch does:

#upon successful qemu_fw_cfg.ko insertion

BY_KEY="/sys/firmware/qemu_fw_cfg/by_key"
BY_NAME="/sys/firmware/qemu_fw_cfg/by_name"

mkdir -p $BY_NAME

for KEY in $(ls $BY_KEY); do
FILE=$(cat $BY_KEY/$KEY/name)
[ mkdir -p $BY_NAME/$(dirname $FILE) ] || continue
pushd $BY_NAME/$(dirname $FILE)
ln -s $BY_KEY/$KEY $(basename $FILE)
popd
done

Note how it MAY fail to mkdir, in which case it skips to the next key,
or it MAY fail to create the symlink, in which case it just moves on
to the next key as well.

Again, the fw_cfg device doesn't have any notion of nested directories,
it just parrots back some random string someone choose as a name for a
given blob, but these strings typically contain slash-separated ASCII
which "looks like" it "could" be a path name :)

If there's a reasonable way to do this dynamically, immediately upon
insertion of the qemu_fw_cfg.ko module, patches 2/3 and 3/3 could be
dropped.

But before that, I'd appreciate a pointer to how/where this can be done
in userspace. Would it be available e.g. during early boot, early enough
to e.g. raw-dump an ASCII blob containing shell variable definitions (e.g.
to set hostname on the guest, etc ?)

Doing it as part of the qemu_fw_cfg.ko module itself has the advantage
of it being available immediately, without anything having to "watch"
for the module being inserted.

But if userspace is the standard, "canonical" place to do something like
this, I'd be happy to give it a try, once I find the right tree to bark
up on... :)

Thanks much,
Gabriel


Old "cover-letter" blurb for v1:
> Several different architectures supported by QEMU are set up with a
> "firmware configuration" (fw_cfg) device, used to pass configuration
> "blobs" into the guest by the host running QEMU.
>
> Historically, these config blobs were mostly of interest to the guest
> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
> the command line, which makes them potentially interesting to userspace
> (e.g. for passing early boot environment variables, etc.).
>
> In addition to cc-ing the people and lists indicated by get-maintainer.pl,
> I've added a few extra lists suggested by Matt Fleming on the qemu-devel
> list, as well as the qemu-devel list itself.
>
> Also cc-ing kernelnewbies, as this is my very first kenel contribution,
> so please go easy on me for whatever silly n00b mistakes I might have still
> missed, in spite of trying hard to do all my homework properly... :)
>
> The series consists of three patches:
>
> 1/3 - probes for the qemu fw_cfg device in locations known to work on
> the supported architectures, in decreasing order of "likelihood".
>
> While it *may* be possible to detect the presence of fw_cfg via
> acpi or dtb (on x86 and arm, respectively), there's no way I know
> of attempting that on sun4 and ppc/mac, so I've stuck with simply
> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
> in fw_cfg.c. I could use some advice on how else that could be
> done more elegantly, if needed.
>
> Upon successfully detecting a present fw_cfg device, we set up
> /sys/firmware/fw_cfg/by_select entries for each blob available
> on the fw_cfg device.
>
> 2/3 - export kset_find_obj() (in lib/kobject.c) for use with modules,
> which will come in handy in patch #3, below.
>
> 3/3 - add a "user friendly" way of listing fw_cfg blobs by name rather
> than by unique selector key.
>
> Since fw_cfg blob names are traditionally set up to look like
> path names, it would be nice to mirror that fact when displaying
> them under /sys/firmware/fw_cfg in a "by_name" subdirectory.
>
> I'm using ksets to reflect subdirectories matching "dirname"
> tokens separated by '/' within each fw_cfg blob "filename",
> and symlinks into the "by_select" subdirectory for each "basename".
>
> Since the fw_cfg device doesn't enforce that blob names have
> well-behaved and non-conflicting names, it is possible (though
> unlikely) that there will be blobs named:
>
> "etc/foo/bar"
> and
> "etc/foo"
>
> where "foo" will try to be both a subdirectory AND a symlink under
> /sys/firmware/fw_cfg/by_name/etc/. In such an event, the latter
> fw_cfg blob will simply be skipped from having an entry created
> under the user-friendly "by_name" subdirectory, remaining listed
> only as an entry under the "by_select" subdirectory.
>
> I have tested this on x86_64 and armv7hl+lpae kernels. Builds and installs
> OK as both a module and linked directly into the kernel.

Gabriel Somlo (3):
firmware: introduce sysfs driver for QEMU's fw_cfg device
kobject: export kset_find_obj() to be used from modules
firmware: fw_cfg: create directory hierarchy for fw_cfg file names

.../ABI/testing/sysfs-firmware-qemu_fw_cfg | 210 ++++++++
drivers/firmware/Kconfig | 10 +
drivers/firmware/Makefile | 1 +
drivers/firmware/qemu_fw_cfg.c | 560 +++++++++++++++++++++
lib/kobject.c | 1 +
5 files changed, 782 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
create mode 100644 drivers/firmware/qemu_fw_cfg.c

--
2.4.3


2015-08-11 18:49:03

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH v2 1/3] firmware: introduce sysfs driver for QEMU's fw_cfg device

From: "Gabriel Somlo" <[email protected]>

Make fw_cfg entries of type "file" available via sysfs. Entries
are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
named after each entry's selector key. Filename, selector value,
and size read-only attributes are included for each entry. Also,
a "raw" attribute allows retrieval of the full binary content of
each entry.

This patch also provides a documentation file outlining the
guest-side "hardware" interface exposed by the QEMU fw_cfg device.

Signed-off-by: Gabriel Somlo <[email protected]>
---
.../ABI/testing/sysfs-firmware-qemu_fw_cfg | 168 ++++++++
drivers/firmware/Kconfig | 10 +
drivers/firmware/Makefile | 1 +
drivers/firmware/qemu_fw_cfg.c | 456 +++++++++++++++++++++
4 files changed, 635 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
create mode 100644 drivers/firmware/qemu_fw_cfg.c

diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
new file mode 100644
index 0000000..5bc2e3f
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
@@ -0,0 +1,168 @@
+What: /sys/firmware/qemu_fw_cfg/
+Date: August 2015
+Contact: Gabriel Somlo <[email protected]>
+Description:
+ Several different architectures supported by QEMU (x86, arm,
+ sun4*, ppc/mac) are provisioned with a firmware configuration
+ (fw_cfg) device, used by the host to provide configuration data
+ to the starting guest. While most of this data is meant for use
+ by the guest BIOS, starting with QEMU v2.4, guest VMs may be
+ started with arbitrary fw_cfg entries supplied directly on the
+ command line, which therefore may be of interest to userspace.
+
+ === Guest-side Hardware Interface ===
+
+ The fw_cfg device is available to guest VMs as a pair (control
+ and data) of registers, accessible as either a IO ports or as
+ MMIO addresses, depending on the architecture.
+
+ --- Control Register ---
+
+ Width: 16-bit
+ Access: Write-Only
+ Endianness: LE (if IOport) or BE (if MMIO)
+
+ A write to the control register selects the index for one of
+ the firmware configuration items (or "blobs") available on the
+ fw_cfg device, which can subsequently be read from the data
+ register.
+
+ Each time the control register is written, an data offset
+ internal to the fw_cfg device will be set to zero. This data
+ offset impacts which portion of the selected fw_cfg blob is
+ accessed by reading the data register, as explained below.
+
+ --- Data Register ---
+
+ Width: 8-bit (if IOport), or 8/16/32/64-bit (if MMIO)
+ Access: Read-Only
+ Endianness: string preserving
+
+ The data register allows access to an array of bytes which
+ represent the fw_cfg blob last selected by a write to the
+ control register.
+
+ Immediately following a write to the control register, the data
+ offset will be set to zero. Each successful read access to the
+ data register will increment the data offset by the appropriate
+ access width.
+
+ Each fw_cfg blob has a maximum associated data length. Once the
+ data offset exceeds this maximum length, any subsequent reads
+ via the data register will return 0x00.
+
+ An N-byte wide read of the data register will return the next
+ available N bytes of the selected fw_cfg blob, as a substring,
+ in increasing address order, similar to memcpy(), zero-padded
+ if necessary should the maximum data length of the selected
+ item be reached, as described above.
+
+ --- Per-arch Register Details ---
+
+ -------------------------------------------------------------
+ arch access base ctrl ctrl data data
+ mode address offset endian offset width
+ max.
+ -------------------------------------------------------------
+ x86 IOport 0x510 0 LE 1 8
+ x86_64 IOport 0x510 0 LE 1 8
+ arm MMIO 0x9020000 8 BE 0 64
+ sun4u IOport 0x510 0 LE 1 8
+ sun4m MMIO 0xd00000510 0 BE 2 8
+ ppc/mac MMIO 0xf0000510 0 BE 2 8
+ -------------------------------------------------------------
+
+ NOTE: On platforms where the fw_cfg registers are exposed as
+ IO ports, the data port number will always be one greater than
+ the port number of the control register. I.e., the two ports
+ are overlapping, and can not be mapped separately.
+
+ === Firmware Configuration Items of Interest ===
+
+ Originally, the index key, size, and formatting of blobs in
+ fw_cfg was hard coded by mutual agreement between QEMU on the
+ host side, and the BIOS running on the guest. Later on, a file
+ transfer interface was added: by reading a special blob, the
+ fw_cfg consumer can retrieve a list of records containing the
+ name, selector key, and size of further fw_cfg blobs made
+ available by the host. Below we describe three fw_cfg blobs
+ of interest to the sysfs driver.
+
+ --- Signature (Key 0x0000, FW_CFG_SIGNATURE) ---
+
+ The presence of the fw_cfg device can be verified by selecting
+ the signature blob by writing 0x0000 to the control register,
+ and reading four bytes from the data register. If the fw_cfg
+ device is present, the four bytes read will match the ASCII
+ characters "QEMU".
+
+ --- Revision (Key 0x0001, FW_CFG_ID) ---
+
+ A 32-bit little-endian unsigned integer, this item is used as
+ an interface revision number.
+
+ --- File Directory (Key 0x0019, FW_CFG_FILE_DIR) ---
+
+ Any fw_cfg blobs stored at key 0x0020 FW_CFG_FILE_FIRST() or
+ higher will have an associated entry in this "directory" blob,
+ which facilitates the discovery of available items by software
+ (e.g. BIOS) running on the guest. The format of the directory
+ blob is shown below.
+
+ NOTE: All integers are stored in big-endian format!
+
+ /* the entire file directory "blob" */
+ struct FWCfgFiles {
+ u32 count; /* total number of entries */
+ struct FWCfgFile f[]; /* entry array, see below */
+ };
+
+ /* an individual directory entry, 64 bytes total */
+ struct FWCfgFile {
+ u32 size; /* size of referenced blob */
+ u16 select; /* selector key for referenced blob */
+ u16 reserved;
+ char name[56]; /* blob name, nul-terminated ASCII */
+ };
+
+ === SysFS fw_cfg Interface ===
+
+ The fw_cfg sysfs interface described in this document is only
+ intended to display discoverable blobs (i.e., those registered
+ with the file directory), as there is no way to determine the
+ presence or size of "legacy" blobs (with selector keys between
+ 0x0002 and 0x0018) programmatically.
+
+ All fw_cfg information is shown under:
+
+ /sys/firmware/qemu_fw_cfg/
+
+ The only legacy blob displayed is the fw_cfg device revision:
+
+ /sys/firmware/qemu_fw_cfg/rev
+
+ --- Discoverable fw_cfg blobs by selector key ---
+
+ All discoverable blobs listed in the fw_cfg file directory are
+ displayed as entries named after their unique selector key
+ value, e.g.:
+
+ /sys/firmware/qemu_fw_cfg/by_key/32
+ /sys/firmware/qemu_fw_cfg/by_key/33
+ /sys/firmware/qemu_fw_cfg/by_key/34
+ ...
+
+ Each such fw_cfg sysfs entry has the following values exported
+ as attributes:
+
+ name : The 56-byte nul-terminated ASCII string used as the
+ blob's 'file name' in the fw_cfg directory.
+ size : The length of the blob, as given in the fw_cfg
+ directory.
+ key : The value of the blob's selector key as given in the
+ fw_cfg directory. This value is the same as used in
+ the parent directory name.
+ raw : The raw bytes of the blob, obtained by selecting the
+ entry via the control register, and reading a number
+ of bytes equal to the blob size from the data
+ register.
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 99c69a3..f5cbe81 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -136,6 +136,16 @@ config QCOM_SCM
bool
depends on ARM || ARM64

+config FW_CFG_SYSFS
+ tristate "QEMU fw_cfg device support in sysfs"
+ depends on SYSFS
+ default n
+ help
+ Say Y or M here to enable the exporting of the QEMU firmware
+ configuration (fw_cfg) file entries via sysfs. Entries are
+ found under /sys/firmware/fw_cfg when this option is enabled
+ and loaded.
+
source "drivers/firmware/broadcom/Kconfig"
source "drivers/firmware/google/Kconfig"
source "drivers/firmware/efi/Kconfig"
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index 4a4b897..706fdc5 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
obj-$(CONFIG_QCOM_SCM) += qcom_scm-32.o
+obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
CFLAGS_qcom_scm-32.o :=$(call as-instr,.arch_extension sec,-DREQUIRES_SEC=1)

obj-y += broadcom/
diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
new file mode 100644
index 0000000..cd26eee
--- /dev/null
+++ b/drivers/firmware/qemu_fw_cfg.c
@@ -0,0 +1,456 @@
+/*
+ * drivers/firmware/qemu_fw_cfg.c
+ *
+ * Expose entries from QEMU's firmware configuration (fw_cfg) device in
+ * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
+ *
+ * Copyright 2015 Carnegie Mellon University
+ */
+
+#include <linux/module.h>
+#include <linux/capability.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include <linux/ioport.h>
+#include <linux/ctype.h>
+
+/* selector key values for "well-known" fw_cfg entries */
+#define FW_CFG_SIGNATURE 0x00
+#define FW_CFG_ID 0x01
+#define FW_CFG_FILE_DIR 0x19
+
+/* size in bytes of fw_cfg signature */
+#define FW_CFG_SIG_SIZE 4
+
+/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
+#define FW_CFG_MAX_FILE_PATH 56
+
+/* fw_cfg file directory entry type */
+struct fw_cfg_file {
+ u32 size;
+ u16 select;
+ u16 reserved;
+ char name[FW_CFG_MAX_FILE_PATH];
+};
+
+/* fw_cfg device i/o access options type */
+struct fw_cfg_access {
+ const char *name;
+ phys_addr_t base;
+ u8 size;
+ u8 ctrl_offset;
+ u8 data_offset;
+ bool is_mmio;
+};
+
+/* table of fw_cfg device i/o access options for known architectures */
+static struct fw_cfg_access fw_cfg_modes[] = {
+ {
+ .name = "fw_cfg IOport on i386, sun4u",
+ .base = 0x510,
+ .size = 0x02,
+ .ctrl_offset = 0x00,
+ .data_offset = 0x01,
+ .is_mmio = false,
+ }, {
+ .name = "fw_cfg MMIO on arm",
+ .base = 0x9020000,
+ .size = 0x0a,
+ .ctrl_offset = 0x08,
+ .data_offset = 0x00,
+ .is_mmio = true,
+ }, {
+ .name = "fw_cfg MMIO on sun4m",
+ .base = 0xd00000510,
+ .size = 0x03,
+ .ctrl_offset = 0x00,
+ .data_offset = 0x02,
+ .is_mmio = true,
+ }, {
+ .name = "fw_cfg MMIO on ppc/mac",
+ .base = 0xf0000510,
+ .size = 0x03,
+ .ctrl_offset = 0x00,
+ .data_offset = 0x02,
+ .is_mmio = true,
+ }, { } /* END */
+};
+
+/* fw_cfg device i/o currently selected option set */
+static struct fw_cfg_access *fw_cfg_mode;
+
+/* fw_cfg device i/o register addresses */
+static void __iomem *fw_cfg_dev_base;
+static void __iomem *fw_cfg_dev_ctrl;
+static void __iomem *fw_cfg_dev_data;
+
+/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
+static DEFINE_MUTEX(fw_cfg_dev_lock);
+
+/* pick apropriate endianness for selector key */
+static inline u16 fw_cfg_sel_endianness(u16 key)
+{
+ return fw_cfg_mode->is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
+}
+
+/* type for fw_cfg "directory scan" visitor/callback function */
+typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f);
+
+/* run a given callback on each fw_cfg directory entry */
+static int fw_cfg_scan_dir(fw_cfg_file_callback callback)
+{
+ int ret = 0;
+ u32 count, i;
+ struct fw_cfg_file f;
+
+ mutex_lock(&fw_cfg_dev_lock);
+ iowrite16(fw_cfg_sel_endianness(FW_CFG_FILE_DIR), fw_cfg_dev_ctrl);
+ ioread8_rep(fw_cfg_dev_data, &count, sizeof(count));
+ for (i = 0; i < be32_to_cpu(count); i++) {
+ ioread8_rep(fw_cfg_dev_data, &f, sizeof(f));
+ ret = callback(&f);
+ if (ret)
+ break;
+ }
+ mutex_unlock(&fw_cfg_dev_lock);
+ return ret;
+}
+
+/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
+static inline void fw_cfg_read_blob(u16 key,
+ void *buf, loff_t pos, size_t count)
+{
+ mutex_lock(&fw_cfg_dev_lock);
+ iowrite16(fw_cfg_sel_endianness(key), fw_cfg_dev_ctrl);
+ while (pos-- > 0)
+ ioread8(fw_cfg_dev_data);
+ ioread8_rep(fw_cfg_dev_data, buf, count);
+ mutex_unlock(&fw_cfg_dev_lock);
+}
+
+/* clean up fw_cfg device i/o setup */
+static void fw_cfg_io_cleanup(void)
+{
+ if (fw_cfg_mode->is_mmio) {
+ iounmap(fw_cfg_dev_base);
+ release_mem_region(fw_cfg_mode->base, fw_cfg_mode->size);
+ } else {
+ ioport_unmap(fw_cfg_dev_base);
+ release_region(fw_cfg_mode->base, fw_cfg_mode->size);
+ }
+}
+
+/* probe and map fw_cfg device */
+static int __init fw_cfg_io_probe(void)
+{
+ char sig[FW_CFG_SIG_SIZE];
+
+ for (fw_cfg_mode = &fw_cfg_modes[0];
+ fw_cfg_mode->base; fw_cfg_mode++) {
+
+ phys_addr_t base = fw_cfg_mode->base;
+ u8 size = fw_cfg_mode->size;
+
+ /* reserve and map mmio or ioport region */
+ if (fw_cfg_mode->is_mmio) {
+ if (!request_mem_region(base, size, fw_cfg_mode->name))
+ continue;
+ fw_cfg_dev_base = ioremap(base, size);
+ if (!fw_cfg_dev_base) {
+ release_mem_region(base, size);
+ continue;
+ }
+ } else {
+ if (!request_region(base, size, fw_cfg_mode->name))
+ continue;
+ fw_cfg_dev_base = ioport_map(base, size);
+ if (!fw_cfg_dev_base) {
+ release_region(base, size);
+ continue;
+ }
+ }
+
+ /* set control and data register addresses */
+ fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset;
+ fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset;
+
+ /* verify fw_cfg device signature */
+ fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
+ if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0)
+ /* success, we're done */
+ return 0;
+
+ /* clean up before probing next access mode */
+ fw_cfg_io_cleanup();
+ }
+
+ return -ENODEV;
+}
+
+/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
+static u32 fw_cfg_rev;
+
+static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
+{
+ return sprintf(buf, "%u\n", fw_cfg_rev);
+}
+
+static const struct {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
+} fw_cfg_rev_attr = {
+ .attr = { .name = "rev", .mode = S_IRUSR },
+ .show = fw_cfg_showrev,
+};
+
+/* fw_cfg_sysfs_entry type */
+struct fw_cfg_sysfs_entry {
+ struct kobject kobj;
+ struct fw_cfg_file f;
+ struct list_head list;
+};
+
+/* get fw_cfg_sysfs_entry from kobject member */
+static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
+{
+ return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
+}
+
+/* fw_cfg_sysfs_attribute type */
+struct fw_cfg_sysfs_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
+};
+
+/* get fw_cfg_sysfs_attribute from attribute member */
+static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
+{
+ return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
+}
+
+/* global cache of fw_cfg_sysfs_entry objects */
+static LIST_HEAD(fw_cfg_entry_cache);
+
+/* kobjects removed lazily by kernel, mutual exclusion needed */
+static DEFINE_SPINLOCK(fw_cfg_cache_lock);
+
+static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
+{
+ spin_lock(&fw_cfg_cache_lock);
+ list_add_tail(&entry->list, &fw_cfg_entry_cache);
+ spin_unlock(&fw_cfg_cache_lock);
+}
+
+static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
+{
+ spin_lock(&fw_cfg_cache_lock);
+ list_del(&entry->list);
+ spin_unlock(&fw_cfg_cache_lock);
+}
+
+static void fw_cfg_sysfs_cache_cleanup(void)
+{
+ struct fw_cfg_sysfs_entry *entry, *next;
+
+ list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
+ /* will end up invoking fw_cfg_sysfs_cache_delist()
+ * via each object's release() method (i.e. destructor) */
+ kobject_put(&entry->kobj);
+ }
+}
+
+/* default_attrs: per-entry attributes and show methods */
+
+#define FW_CFG_SYSFS_ATTR(_attr) \
+struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
+ .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
+ .show = fw_cfg_sysfs_show_##_attr, \
+}
+
+static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%u\n", e->f.size);
+}
+
+static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%u\n", e->f.select);
+}
+
+static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
+{
+ return sprintf(buf, "%s\n", e->f.name);
+}
+
+static FW_CFG_SYSFS_ATTR(size);
+static FW_CFG_SYSFS_ATTR(key);
+static FW_CFG_SYSFS_ATTR(name);
+
+static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
+ &fw_cfg_sysfs_attr_size.attr,
+ &fw_cfg_sysfs_attr_key.attr,
+ &fw_cfg_sysfs_attr_name.attr,
+ NULL,
+};
+
+/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
+static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
+ char *buf)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+ struct fw_cfg_sysfs_attribute *attr = to_attr(a);
+
+ return attr->show(entry, buf);
+}
+
+static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
+ .show = fw_cfg_sysfs_attr_show,
+};
+
+/* release: destructor, to be called via kobject_put() */
+static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+
+ fw_cfg_sysfs_cache_delist(entry);
+ kfree(entry);
+}
+
+/* kobj_type: ties together all properties required to register an entry */
+static struct kobj_type fw_cfg_sysfs_entry_ktype = {
+ .default_attrs = fw_cfg_sysfs_entry_attrs,
+ .sysfs_ops = &fw_cfg_sysfs_attr_ops,
+ .release = fw_cfg_sysfs_release_entry,
+};
+
+/* raw-read method and attribute */
+static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t pos, size_t count)
+{
+ struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
+
+ if (pos > entry->f.size)
+ return -EINVAL;
+
+ if (count > entry->f.size - pos)
+ count = entry->f.size - pos;
+
+ fw_cfg_read_blob(entry->f.select, buf, pos, count);
+ return count;
+}
+
+static struct bin_attribute fw_cfg_sysfs_attr_raw = {
+ .attr = { .name = "raw", .mode = 0400 },
+ .read = fw_cfg_sysfs_read_raw,
+};
+
+/* kobjects & kset representing top-level, by_key, and by_name folders */
+static struct kobject *fw_cfg_top_ko;
+static struct kobject *fw_cfg_sel_ko;
+
+/* callback function to register an individual fw_cfg file */
+static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
+{
+ int err;
+ struct fw_cfg_sysfs_entry *entry;
+
+ /* allocate new entry */
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return -ENOMEM;
+
+ /* set file entry information */
+ entry->f.size = be32_to_cpu(f->size);
+ entry->f.select = be16_to_cpu(f->select);
+ strcpy(entry->f.name, f->name);
+
+ /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
+ err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
+ fw_cfg_sel_ko, "%d", entry->f.select);
+ if (err)
+ goto err_register;
+
+ /* add raw binary content access */
+ err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
+ if (err)
+ goto err_add_raw;
+
+ /* success, add entry to global cache */
+ fw_cfg_sysfs_cache_enlist(entry);
+ return 0;
+
+err_add_raw:
+ kobject_del(&entry->kobj);
+err_register:
+ kfree(entry);
+ return err;
+}
+
+/* unregister top-level or by_key folder */
+static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
+{
+ kobject_del(kobj);
+ kobject_put(kobj);
+}
+
+static int __init fw_cfg_sysfs_init(void)
+{
+ int err;
+
+ /* probe for the fw_cfg "hardware" */
+ err = fw_cfg_io_probe();
+ if (err)
+ return err;
+
+ /* create /sys/firmware/qemu_fw_cfg/ and its subdirectories */
+ err = -ENOMEM;
+ fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
+ if (!fw_cfg_top_ko)
+ goto err_top;
+ fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
+ if (!fw_cfg_sel_ko)
+ goto err_sel;
+
+ /* get revision number, add matching top-level attribute */
+ fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
+ fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
+ err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
+ if (err)
+ goto err_rev;
+
+ /* process fw_cfg file directory entry, registering each file */
+ err = fw_cfg_scan_dir(fw_cfg_register_file);
+ if (err)
+ goto err_scan;
+
+ /* success */
+ pr_debug("fw_cfg: loaded.\n");
+ return 0;
+
+err_scan:
+ fw_cfg_sysfs_cache_cleanup();
+ sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
+err_rev:
+ fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
+err_sel:
+ fw_cfg_kobj_cleanup(fw_cfg_top_ko);
+err_top:
+ fw_cfg_io_cleanup();
+ return err;
+}
+
+static void __exit fw_cfg_sysfs_exit(void)
+{
+ pr_debug("fw_cfg: unloading.\n");
+ fw_cfg_sysfs_cache_cleanup();
+ fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
+ fw_cfg_kobj_cleanup(fw_cfg_top_ko);
+ fw_cfg_io_cleanup();
+}
+
+module_init(fw_cfg_sysfs_init);
+module_exit(fw_cfg_sysfs_exit);
+MODULE_AUTHOR("Gabriel L. Somlo <[email protected]>");
+MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
+MODULE_LICENSE("GPL");
--
2.4.3

2015-08-11 18:48:25

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH v2 2/3] kobject: export kset_find_obj() to be used from modules

From: "Gabriel Somlo" <[email protected]>

Signed-off-by: Gabriel Somlo <[email protected]>
---
lib/kobject.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/lib/kobject.c b/lib/kobject.c
index 0554077..8f07202 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -847,6 +847,7 @@ struct kobject *kset_find_obj(struct kset *kset, const char *name)
spin_unlock(&kset->list_lock);
return ret;
}
+EXPORT_SYMBOL(kset_find_obj);

static void kset_release(struct kobject *kobj)
{
--
2.4.3

2015-08-11 18:48:21

by Gabriel L. Somlo

[permalink] [raw]
Subject: [PATCH v2 3/3] firmware: fw_cfg: create directory hierarchy for fw_cfg file names

From: "Gabriel Somlo" <[email protected]>

Each fw_cfg entry of type "file" has an associated 56-char,
nul-terminated ASCII string which represents its name. While
the fw_cfg device doesn't itself impose any specific naming
convention, QEMU developers have traditionally used path name
semantics (i.e. "etc/acpi/rsdp") to descriptively name the
various fw_cfg "blobs" passed into the guest.

This patch attempts, on a best effort basis, to create a
directory hierarchy representing the content of fw_cfg file
names, under /sys/firmware/qemu_fw_cfg/by_name.

Upon successful creation of all directories representing the
"dirname" portion of a fw_cfg file, a symlink will be created
to represent the "basename", pointing at the appropriate
/sys/firmware/qemu_fw_cfg/by_key entry. If a file name is not
suitable for this procedure (e.g., if its basename or dirname
components collide with an already existing dirname component
or basename, respectively) the corresponding fw_cfg blob is
skipped and will remain available in sysfs only by its selector
key value.

Signed-off-by: Gabriel Somlo <[email protected]>
---
.../ABI/testing/sysfs-firmware-qemu_fw_cfg | 42 +++++++++
drivers/firmware/qemu_fw_cfg.c | 104 +++++++++++++++++++++
2 files changed, 146 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
index 5bc2e3f..db0b10d 100644
--- a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
+++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
@@ -166,3 +166,45 @@ Description:
entry via the control register, and reading a number
of bytes equal to the blob size from the data
register.
+
+ --- Listing fw_cfg blobs by file name ---
+
+ While the fw_cfg device does not impose any specific naming
+ convention on the blobs registered in the file directory,
+ QEMU developers have traditionally used path name semantics
+ to give each blob a descriptive name. For example:
+
+ "bootorder"
+ "genroms/kvmvapic.bin"
+ "etc/e820"
+ "etc/boot-fail-wait"
+ "etc/system-states"
+ "etc/table-loader"
+ "etc/acpi/rsdp"
+ "etc/acpi/tables"
+ "etc/smbios/smbios-tables"
+ "etc/smbios/smbios-anchor"
+ ...
+
+ In addition to the listing by unique selector key described
+ above, the fw_cfg sysfs driver also attempts to build a tree
+ of directories matching the path name components of fw_cfg
+ blob names, ending in symlinks to the by_key entry for each
+ "basename", as illustrated below (assume current directory is
+ /sys/firmware):
+
+ qemu_fw_cfg/by_name/bootorder -> ../by_key/38
+ qemu_fw_cfg/by_name/etc/e820 -> ../../by_key/35
+ qemu_fw_cfg/by_name/etc/acpi/rsdp -> ../../../by_key/41
+ ...
+
+ Construction of the directory tree and symlinks is done on a
+ "best-effort" basis, as there is no guarantee that components
+ of fw_cfg blob names are always "well behaved". I.e., there is
+ the possibility that a symlink (basename) will conflict with
+ a dirname component of another fw_cfg blob, in which case the
+ creation of the offending /sys/firmware/qemu_fw_cfg/by_name
+ entry will be skipped.
+
+ The authoritative list of entries will continue to be found
+ under the /sys/firmware/qemu_fw_cfg/by_key directory.
diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
index cd26eee..b2fd222 100644
--- a/drivers/firmware/qemu_fw_cfg.c
+++ b/drivers/firmware/qemu_fw_cfg.c
@@ -345,9 +345,104 @@ static struct bin_attribute fw_cfg_sysfs_attr_raw = {
.read = fw_cfg_sysfs_read_raw,
};

+/*
+ * Create a kset subdirectory matching each '/' delimited dirname token
+ * in 'name', starting with sysfs kset/folder 'dir'; At the end, create
+ * a symlink directed at the given 'target'.
+ * NOTE: We do this on a best-effort basis, since 'name' is not guaranteed
+ * to be a well-behaved path name. Whenever a symlink vs. kset directory
+ * name collision occurs, the kernel will issue big scary warnings while
+ * refusing to add the offending link or directory. We follow up with our
+ * own, slightly less scary error messages explaining the situation :)
+ */
+static int __init fw_cfg_build_symlink(struct kset *dir,
+ struct kobject *target,
+ const char *name)
+{
+ int ret;
+ struct kset *subdir;
+ struct kobject *ko;
+ char *name_copy, *p, *tok;
+
+ if (!dir || !target || !name || !*name)
+ return -EINVAL;
+
+ /* clone a copy of name for parsing */
+ name_copy = p = kstrdup(name, GFP_KERNEL);
+ if (!name_copy)
+ return -ENOMEM;
+
+ /* create folders for each dirname token, then symlink for basename */
+ while ((tok = strsep(&p, "/")) && *tok) {
+
+ /* last (basename) token? If so, add symlink here */
+ if (!p || !*p) {
+ ret = sysfs_create_link(&dir->kobj, target, tok);
+ break;
+ }
+
+ /* does the current dir contain an item named after tok ? */
+ ko = kset_find_obj(dir, tok);
+ if (ko) {
+ /* drop reference added by kset_find_obj */
+ kobject_put(ko);
+
+ /* ko MUST be a kset - we're about to use it as one ! */
+ if (ko->ktype != dir->kobj.ktype) {
+ ret = -EINVAL;
+ break;
+ }
+
+ /* descend into already existing subdirectory */
+ dir = to_kset(ko);
+ } else {
+ /* create new subdirectory kset */
+ subdir = kzalloc(sizeof(struct kset), GFP_KERNEL);
+ if (!subdir) {
+ ret = -ENOMEM;
+ break;
+ }
+ subdir->kobj.kset = dir;
+ subdir->kobj.ktype = dir->kobj.ktype;
+ ret = kobject_set_name(&subdir->kobj, "%s", tok);
+ if (ret) {
+ kfree(subdir);
+ break;
+ }
+ ret = kset_register(subdir);
+ if (ret) {
+ kfree(subdir);
+ break;
+ }
+
+ /* descend into newly created subdirectory */
+ dir = subdir;
+ }
+ }
+
+ /* we're done with cloned copy of name */
+ kfree(name_copy);
+ return ret;
+}
+
+/* recursively unregister fw_cfg/by_name/ kset directory tree */
+static void fw_cfg_kset_unregister_recursive(struct kset *kset)
+{
+ struct kobject *k, *next;
+
+ list_for_each_entry_safe(k, next, &kset->list, entry)
+ /* all set members are ksets too, but check just in case... */
+ if (k->ktype == kset->kobj.ktype)
+ fw_cfg_kset_unregister_recursive(to_kset(k));
+
+ /* symlinks are cleanly and automatically removed with the directory */
+ kset_unregister(kset);
+}
+
/* kobjects & kset representing top-level, by_key, and by_name folders */
static struct kobject *fw_cfg_top_ko;
static struct kobject *fw_cfg_sel_ko;
+static struct kset *fw_cfg_fname_kset;

/* callback function to register an individual fw_cfg file */
static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
@@ -376,6 +471,9 @@ static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
if (err)
goto err_add_raw;

+ /* try adding "/sys/firmware/qemu_fw_cfg/by_name/" symlink */
+ fw_cfg_build_symlink(fw_cfg_fname_kset, &entry->kobj, entry->f.name);
+
/* success, add entry to global cache */
fw_cfg_sysfs_cache_enlist(entry);
return 0;
@@ -411,6 +509,9 @@ static int __init fw_cfg_sysfs_init(void)
fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
if (!fw_cfg_sel_ko)
goto err_sel;
+ fw_cfg_fname_kset = kset_create_and_add("by_name", NULL, fw_cfg_top_ko);
+ if (!fw_cfg_fname_kset)
+ goto err_name;

/* get revision number, add matching top-level attribute */
fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
@@ -432,6 +533,8 @@ err_scan:
fw_cfg_sysfs_cache_cleanup();
sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
err_rev:
+ fw_cfg_kset_unregister_recursive(fw_cfg_fname_kset);
+err_name:
fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
err_sel:
fw_cfg_kobj_cleanup(fw_cfg_top_ko);
@@ -444,6 +547,7 @@ static void __exit fw_cfg_sysfs_exit(void)
{
pr_debug("fw_cfg: unloading.\n");
fw_cfg_sysfs_cache_cleanup();
+ fw_cfg_kset_unregister_recursive(fw_cfg_fname_kset);
fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
fw_cfg_kobj_cleanup(fw_cfg_top_ko);
fw_cfg_io_cleanup();
--
2.4.3

2015-08-19 09:39:05

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

From: "Gabriel L. Somlo" <[email protected]>

Hi Gabriel,

> Several different architectures supported by QEMU are set up with a
> "firmware configuration" (fw_cfg) device, used to pass configuration
> "blobs" into the guest by the host running QEMU.
>
> Historically, these config blobs were mostly of interest to the guest
> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
> the command line, which makes them potentially interesting to userspace
> (e.g. for passing early boot environment variables, etc.).
>

Does 'potentially interesting' mean you have a use case? Could you elaborate?

> In addition to cc-ing the people and lists indicated by get-maintainer.pl,
> I've added a few extra lists suggested by Matt Fleming on the qemu-devel
> list, as well as the qemu-devel list itself.
>
> Also cc-ing kernelnewbies, as this is my very first kenel contribution,
> so please go easy on me for whatever silly n00b mistakes I might have still
> missed, in spite of trying hard to do all my homework properly... :)
>
> The series consists of three patches:
>
> 1/3 - probes for the qemu fw_cfg device in locations known to work on
> the supported architectures, in decreasing order of "likelihood".
>
> While it *may* be possible to detect the presence of fw_cfg via
> acpi or dtb (on x86 and arm, respectively), there's no way I know
> of attempting that on sun4 and ppc/mac, so I've stuck with simply
> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
> in fw_cfg.c. I could use some advice on how else that could be
> done more elegantly, if needed.
>

Sorry, but this is really out of the question, at least on ARM, but surely on
other architectures as well. You can't just go around and probe random memory
addresses. Perhaps QEMU tolerates it, but on anything that resembles a real
system, this will immediately blow up. Also, what happens if the QEMU memory
map changes? Add more probes addresses?

It is not /that/ difficult to simply wire it up to the DT and ACPI
infrastructures, there are plenty of examples in the kernel tree how to
accomplish that. As a bonus, it removes all the arch specific knowledge
from your code, which means that if QEMU grows support for another DT or
ACPI based architecture, it will just work.

I am not sure how relevant sun4 and ppc/mac are for what you are trying to
accomplish, but perhaps it would be best to focus on x86 and ARM for now
and do it correctly. If the probing is actually needed, you can always add
it later.

--
Ard.

2015-08-19 09:42:09

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

(missed some cc's)

On 19 August 2015 at 11:38, Ard Biesheuvel <[email protected]> wrote:
> From: "Gabriel L. Somlo" <[email protected]>
>
> Hi Gabriel,
>
>> Several different architectures supported by QEMU are set up with a
>> "firmware configuration" (fw_cfg) device, used to pass configuration
>> "blobs" into the guest by the host running QEMU.
>>
>> Historically, these config blobs were mostly of interest to the guest
>> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
>> the command line, which makes them potentially interesting to userspace
>> (e.g. for passing early boot environment variables, etc.).
>>
>
> Does 'potentially interesting' mean you have a use case? Could you elaborate?
>
>> In addition to cc-ing the people and lists indicated by get-maintainer.pl,
>> I've added a few extra lists suggested by Matt Fleming on the qemu-devel
>> list, as well as the qemu-devel list itself.
>>
>> Also cc-ing kernelnewbies, as this is my very first kenel contribution,
>> so please go easy on me for whatever silly n00b mistakes I might have still
>> missed, in spite of trying hard to do all my homework properly... :)
>>
>> The series consists of three patches:
>>
>> 1/3 - probes for the qemu fw_cfg device in locations known to work on
>> the supported architectures, in decreasing order of "likelihood".
>>
>> While it *may* be possible to detect the presence of fw_cfg via
>> acpi or dtb (on x86 and arm, respectively), there's no way I know
>> of attempting that on sun4 and ppc/mac, so I've stuck with simply
>> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
>> in fw_cfg.c. I could use some advice on how else that could be
>> done more elegantly, if needed.
>>
>
> Sorry, but this is really out of the question, at least on ARM, but surely on
> other architectures as well. You can't just go around and probe random memory
> addresses. Perhaps QEMU tolerates it, but on anything that resembles a real
> system, this will immediately blow up. Also, what happens if the QEMU memory
> map changes? Add more probes addresses?
>
> It is not /that/ difficult to simply wire it up to the DT and ACPI
> infrastructures, there are plenty of examples in the kernel tree how to
> accomplish that. As a bonus, it removes all the arch specific knowledge
> from your code, which means that if QEMU grows support for another DT or
> ACPI based architecture, it will just work.
>
> I am not sure how relevant sun4 and ppc/mac are for what you are trying to
> accomplish, but perhaps it would be best to focus on x86 and ARM for now
> and do it correctly. If the probing is actually needed, you can always add
> it later.
>
> --
> Ard.
>

2015-08-19 20:53:00

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

Hi Ard,

On Wed, Aug 19, 2015 at 11:42:02AM +0200, Ard Biesheuvel wrote:
> (missed some cc's)
>
> On 19 August 2015 at 11:38, Ard Biesheuvel <[email protected]> wrote:
> > From: "Gabriel L. Somlo" <[email protected]>
> >> Several different architectures supported by QEMU are set up with a
> >> "firmware configuration" (fw_cfg) device, used to pass configuration
> >> "blobs" into the guest by the host running QEMU.
> >>
> >> Historically, these config blobs were mostly of interest to the guest
> >> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
> >> the command line, which makes them potentially interesting to userspace
> >> (e.g. for passing early boot environment variables, etc.).
> >>
> >
> > Does 'potentially interesting' mean you have a use case? Could you elaborate?

My personal one would be something like:

cat > guestinfo.txt << EOT
KEY1="val1"
KEY2="val2"
...
EOT

qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...

Then, from inside the guest:

. /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw

do_something_with $KEY1 $KEY2
...

But I'm thinking this is only one of the many positive things one
could do with the ability to access random host-supplied blobs from
guest userspace :)

> >> 1/3 - probes for the qemu fw_cfg device in locations known to work on
> >> the supported architectures, in decreasing order of "likelihood".
> >>
> >> While it *may* be possible to detect the presence of fw_cfg via
> >> acpi or dtb (on x86 and arm, respectively), there's no way I know
> >> of attempting that on sun4 and ppc/mac, so I've stuck with simply
> >> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
> >> in fw_cfg.c. I could use some advice on how else that could be
> >> done more elegantly, if needed.
> >>
> >
> > Sorry, but this is really out of the question, at least on ARM, but surely on
> > other architectures as well. You can't just go around and probe random memory
> > addresses. Perhaps QEMU tolerates it, but on anything that resembles a real
> > system, this will immediately blow up. Also, what happens if the QEMU memory
> > map changes? Add more probes addresses?
> >
> > It is not /that/ difficult to simply wire it up to the DT and ACPI
> > infrastructures, there are plenty of examples in the kernel tree how to
> > accomplish that. As a bonus, it removes all the arch specific knowledge
> > from your code, which means that if QEMU grows support for another DT or
> > ACPI based architecture, it will just work.

I was *hoping* a successful call to request_[mem_]region() will be
enough in the way of asking for permission before probing for the
fw_cfg registers, but I realize that might still not be polite enough :)

DT on ARM is fine, and I'm certainly happy to learn how to do it (even
though my main focus is, for now, x86). The unfortunate thing though
is that on x86, fw_cfg is *not* AFAICT in ACPI, so I'd have to detour into
first adding it in on the host side, before I can rewrite the guest side
driver to look it up in there :)

> > I am not sure how relevant sun4 and ppc/mac are for what you are trying to
> > accomplish, but perhaps it would be best to focus on x86 and ARM for now
> > and do it correctly. If the probing is actually needed, you can always add
> > it later.

I guess that's the direction things seem to be headed, although it would
make me a bit sad to leave out sun and ppc right from the very beginning :)


Thanks,
--Gabriel

PS. If you have one .c file in the kernel which does any of the DT-on-arm
boilerplate I'm supposed to immitate, I'd appreciate the shortcut :)

PS2. Do you happen to be in Seattle right now ? :)

2015-08-19 23:04:39

by Leif Lindholm

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

On Wed, Aug 19, 2015 at 04:49:15PM -0400, Gabriel L. Somlo wrote:
> Hi Ard,
>
> On Wed, Aug 19, 2015 at 11:42:02AM +0200, Ard Biesheuvel wrote:
> > (missed some cc's)
> >
> > On 19 August 2015 at 11:38, Ard Biesheuvel <[email protected]> wrote:
> > > From: "Gabriel L. Somlo" <[email protected]>
> > >> Several different architectures supported by QEMU are set up with a
> > >> "firmware configuration" (fw_cfg) device, used to pass configuration
> > >> "blobs" into the guest by the host running QEMU.
> > >>
> > >> Historically, these config blobs were mostly of interest to the guest
> > >> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
> > >> the command line, which makes them potentially interesting to userspace
> > >> (e.g. for passing early boot environment variables, etc.).
> > >>
> > >
> > > Does 'potentially interesting' mean you have a use case? Could you elaborate?
>
> My personal one would be something like:
>
> cat > guestinfo.txt << EOT
> KEY1="val1"
> KEY2="val2"
> ...
> EOT
>
> qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...
>
> Then, from inside the guest:
>
> . /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw
>
> do_something_with $KEY1 $KEY2
> ...
>
> But I'm thinking this is only one of the many positive things one
> could do with the ability to access random host-supplied blobs from
> guest userspace :)

> > >> 1/3 - probes for the qemu fw_cfg device in locations known to work on
> > >> the supported architectures, in decreasing order of "likelihood".
> > >>
> > >> While it *may* be possible to detect the presence of fw_cfg via
> > >> acpi or dtb (on x86 and arm, respectively), there's no way I know
> > >> of attempting that on sun4 and ppc/mac, so I've stuck with simply
> > >> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
> > >> in fw_cfg.c. I could use some advice on how else that could be
> > >> done more elegantly, if needed.
> > >>
> > >
> > > Sorry, but this is really out of the question, at least on ARM, but surely on
> > > other architectures as well. You can't just go around and probe random memory
> > > addresses. Perhaps QEMU tolerates it, but on anything that resembles a real
> > > system, this will immediately blow up. Also, what happens if the QEMU memory
> > > map changes? Add more probes addresses?
> > >
> > > It is not /that/ difficult to simply wire it up to the DT and ACPI
> > > infrastructures, there are plenty of examples in the kernel tree how to
> > > accomplish that. As a bonus, it removes all the arch specific knowledge
> > > from your code, which means that if QEMU grows support for another DT or
> > > ACPI based architecture, it will just work.
>
> I was *hoping* a successful call to request_[mem_]region() will be
> enough in the way of asking for permission before probing for the
> fw_cfg registers, but I realize that might still not be polite enough :)

Either way, it would make sense to not probe in locations that
couldn't possibly work on the current platform. The cleanest way would
probably be a per-architecture probe function (or structure). But even
then, it needs to only probe when it is safe to do so.

> DT on ARM is fine, and I'm certainly happy to learn how to do it (even
> though my main focus is, for now, x86). The unfortunate thing though
> is that on x86, fw_cfg is *not* AFAICT in ACPI, so I'd have to detour into
> first adding it in on the host side, before I can rewrite the guest side
> driver to look it up in there :)

It is probaly the only non-hackish way to do it for arm*.

> > > I am not sure how relevant sun4 and ppc/mac are for what you are trying to
> > > accomplish, but perhaps it would be best to focus on x86 and ARM for now
> > > and do it correctly. If the probing is actually needed, you can always add
> > > it later.
>
> I guess that's the direction things seem to be headed, although it would
> make me a bit sad to leave out sun and ppc right from the very beginning :)
>
> PS. If you have one .c file in the kernel which does any of the DT-on-arm
> boilerplate I'm supposed to immitate, I'd appreciate the shortcut :)
>
> PS2. Do you happen to be in Seattle right now ? :)

Unfortunately, neither Ard nor myself is there. But Mark Rutland
should be around and someone useful to talk to about this.

/
Leif

2015-08-20 05:21:52

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

On 19 August 2015 at 22:49, Gabriel L. Somlo <[email protected]> wrote:
> Hi Ard,
>
> On Wed, Aug 19, 2015 at 11:42:02AM +0200, Ard Biesheuvel wrote:
>> (missed some cc's)
>>
>> On 19 August 2015 at 11:38, Ard Biesheuvel <[email protected]> wrote:
>> > From: "Gabriel L. Somlo" <[email protected]>
>> >> Several different architectures supported by QEMU are set up with a
>> >> "firmware configuration" (fw_cfg) device, used to pass configuration
>> >> "blobs" into the guest by the host running QEMU.
>> >>
>> >> Historically, these config blobs were mostly of interest to the guest
>> >> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
>> >> the command line, which makes them potentially interesting to userspace
>> >> (e.g. for passing early boot environment variables, etc.).
>> >>
>> >
>> > Does 'potentially interesting' mean you have a use case? Could you elaborate?
>
> My personal one would be something like:
>
> cat > guestinfo.txt << EOT
> KEY1="val1"
> KEY2="val2"
> ...
> EOT
>
> qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...
>
> Then, from inside the guest:
>
> . /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw
>
> do_something_with $KEY1 $KEY2
> ...
>
> But I'm thinking this is only one of the many positive things one
> could do with the ability to access random host-supplied blobs from
> guest userspace :)
>

'random host-supplied blobs' sounds awfully like files in a file
system to me, and that is already supported by QEMU and works with any
guest OS unmodified. If you are in control of the command line, surely
you can add a -drive xxx,fat:path/to/blobs -device xxx pair that
simply turns up as a volume.

>> >> 1/3 - probes for the qemu fw_cfg device in locations known to work on
>> >> the supported architectures, in decreasing order of "likelihood".
>> >>
>> >> While it *may* be possible to detect the presence of fw_cfg via
>> >> acpi or dtb (on x86 and arm, respectively), there's no way I know
>> >> of attempting that on sun4 and ppc/mac, so I've stuck with simply
>> >> probing (the fw_cfg_modes[] structure and fw_cfg_io_probe() function)
>> >> in fw_cfg.c. I could use some advice on how else that could be
>> >> done more elegantly, if needed.
>> >>
>> >
>> > Sorry, but this is really out of the question, at least on ARM, but surely on
>> > other architectures as well. You can't just go around and probe random memory
>> > addresses. Perhaps QEMU tolerates it, but on anything that resembles a real
>> > system, this will immediately blow up. Also, what happens if the QEMU memory
>> > map changes? Add more probes addresses?
>> >
>> > It is not /that/ difficult to simply wire it up to the DT and ACPI
>> > infrastructures, there are plenty of examples in the kernel tree how to
>> > accomplish that. As a bonus, it removes all the arch specific knowledge
>> > from your code, which means that if QEMU grows support for another DT or
>> > ACPI based architecture, it will just work.
>
> I was *hoping* a successful call to request_[mem_]region() will be
> enough in the way of asking for permission before probing for the
> fw_cfg registers, but I realize that might still not be polite enough :)
>

No, all request_mem_region() does is check whether the region in
question is not occupied yet by another driver. So your probing could
access unpopulated memory space, or MMIO space owned by a peripheral
whose driver is not loaded. Neither are allowable, I'm afraid.

> DT on ARM is fine, and I'm certainly happy to learn how to do it (even
> though my main focus is, for now, x86). The unfortunate thing though
> is that on x86, fw_cfg is *not* AFAICT in ACPI, so I'd have to detour into
> first adding it in on the host side, before I can rewrite the guest side
> driver to look it up in there :)
>
>> > I am not sure how relevant sun4 and ppc/mac are for what you are trying to
>> > accomplish, but perhaps it would be best to focus on x86 and ARM for now
>> > and do it correctly. If the probing is actually needed, you can always add
>> > it later.
>
> I guess that's the direction things seem to be headed, although it would
> make me a bit sad to leave out sun and ppc right from the very beginning :)
>

Sorry to be blunt, but I am not convinced there is a need for this
driver anyway.

> PS. If you have one .c file in the kernel which does any of the DT-on-arm
> boilerplate I'm supposed to immitate, I'd appreciate the shortcut :)
>

Check out drivers/tty/serial/amba-pl011.c

> PS2. Do you happen to be in Seattle right now ? :)

Nope :-)

2015-08-21 03:51:41

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

On Thu, Aug 20, 2015 at 07:21:48AM +0200, Ard Biesheuvel wrote:
> On 19 August 2015 at 22:49, Gabriel L. Somlo <[email protected]> wrote:
> >> > From: "Gabriel L. Somlo" <[email protected]>
> >> >> Several different architectures supported by QEMU are set up with a
> >> >> "firmware configuration" (fw_cfg) device, used to pass configuration
> >> >> "blobs" into the guest by the host running QEMU.
> >> >>
> >> >> Historically, these config blobs were mostly of interest to the guest
> >> >> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
> >> >> the command line, which makes them potentially interesting to userspace
> >> >> (e.g. for passing early boot environment variables, etc.).
> >> >>
> >> >
> >> > Does 'potentially interesting' mean you have a use case? Could you elaborate?
> >
> > My personal one would be something like:
> >
> > cat > guestinfo.txt << EOT
> > KEY1="val1"
> > KEY2="val2"
> > ...
> > EOT
> >
> > qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...
> >
> > Then, from inside the guest:
> >
> > . /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw
> >
> > do_something_with $KEY1 $KEY2
> > ...
> >
> > But I'm thinking this is only one of the many positive things one
> > could do with the ability to access random host-supplied blobs from
> > guest userspace :)
> >
>
> 'random host-supplied blobs' sounds awfully like files in a file
> system to me, and that is already supported by QEMU and works with any
> guest OS unmodified. If you are in control of the command line, surely
> you can add a -drive xxx,fat:path/to/blobs -device xxx pair that
> simply turns up as a volume.

That did come up, here's the start of original thread on the qemu mailing
list from a while back:

https://lists.gnu.org/archive/html/qemu-devel/2015-02/msg00371.html

To recap, the main advantages to transfering data this way are:

1. Asynchronous

The host can simply pass data via the qemu command line, and
not have to care if/when the guest is ready to accept the
data (i.e. has made it far enough to e.g. start a guest agent)

2. Out-of-band

I don't have to take over a user-visible element such as a
disk drive. Same reason VSOCK (or VMWare VMCI for that matter)
exist and are NOT actual Ethernet/TCP-IP network interfaces :)

> > DT on ARM is fine, and I'm certainly happy to learn how to do it (even
> > though my main focus is, for now, x86). The unfortunate thing though
> > is that on x86, fw_cfg is *not* AFAICT in ACPI, so I'd have to detour into
> > first adding it in on the host side, before I can rewrite the guest side
> > driver to look it up in there :)
> >
> >> > I am not sure how relevant sun4 and ppc/mac are for what you are trying to
> >> > accomplish, but perhaps it would be best to focus on x86 and ARM for now
> >> > and do it correctly. If the probing is actually needed, you can always add
> >> > it later.
> >
> > I guess that's the direction things seem to be headed, although it would
> > make me a bit sad to leave out sun and ppc right from the very beginning :)
> >
>
> Sorry to be blunt, but I am not convinced there is a need for this
> driver anyway.

See above (hopefully I'm being sufficiently persuasive :) )

In VMWare one would fetch similar "guestinfo" variables via something like

FOO=$(vmware-tools --getinfo "FOO")

but I thought exposing fw_cfg in /sys/firmware/... would make access to
*any* blobs (including, but not limited to my particular use case) even
easier and more generic than that.

> > PS. If you have one .c file in the kernel which does any of the DT-on-arm
> > boilerplate I'm supposed to immitate, I'd appreciate the shortcut :)
> >
>
> Check out drivers/tty/serial/amba-pl011.c

I'll check it out.

Thanks much!
--Gabriel

2015-08-24 07:56:32

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

On 21 August 2015 at 05:47, Gabriel L. Somlo <[email protected]> wrote:
> On Thu, Aug 20, 2015 at 07:21:48AM +0200, Ard Biesheuvel wrote:
>> On 19 August 2015 at 22:49, Gabriel L. Somlo <[email protected]> wrote:
>> >> > From: "Gabriel L. Somlo" <[email protected]>
>> >> >> Several different architectures supported by QEMU are set up with a
>> >> >> "firmware configuration" (fw_cfg) device, used to pass configuration
>> >> >> "blobs" into the guest by the host running QEMU.
>> >> >>
>> >> >> Historically, these config blobs were mostly of interest to the guest
>> >> >> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
>> >> >> the command line, which makes them potentially interesting to userspace
>> >> >> (e.g. for passing early boot environment variables, etc.).
>> >> >>
>> >> >
>> >> > Does 'potentially interesting' mean you have a use case? Could you elaborate?
>> >
>> > My personal one would be something like:
>> >
>> > cat > guestinfo.txt << EOT
>> > KEY1="val1"
>> > KEY2="val2"
>> > ...
>> > EOT
>> >
>> > qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...
>> >
>> > Then, from inside the guest:
>> >
>> > . /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw
>> >
>> > do_something_with $KEY1 $KEY2
>> > ...
>> >
>> > But I'm thinking this is only one of the many positive things one
>> > could do with the ability to access random host-supplied blobs from
>> > guest userspace :)
>> >
>>
>> 'random host-supplied blobs' sounds awfully like files in a file
>> system to me, and that is already supported by QEMU and works with any
>> guest OS unmodified. If you are in control of the command line, surely
>> you can add a -drive xxx,fat:path/to/blobs -device xxx pair that
>> simply turns up as a volume.
>
> That did come up, here's the start of original thread on the qemu mailing
> list from a while back:
>
> https://lists.gnu.org/archive/html/qemu-devel/2015-02/msg00371.html
>
> To recap, the main advantages to transfering data this way are:
>
> 1. Asynchronous
>
> The host can simply pass data via the qemu command line, and
> not have to care if/when the guest is ready to accept the
> data (i.e. has made it far enough to e.g. start a guest agent)
>

How does that not apply to a file system?

> 2. Out-of-band
>
> I don't have to take over a user-visible element such as a
> disk drive. Same reason VSOCK (or VMWare VMCI for that matter)
> exist and are NOT actual Ethernet/TCP-IP network interfaces :)
>

OK that makes sense. Note that I am not the one you need to convince
that this is a good idea, but I would still like to understand better
why your use case requires this. Could you explain?

--
Ard.

2015-08-26 18:15:31

by Christopher Covington

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] SysFS driver for QEMU fw_cfg device

Hi Gabriel,

On 08/19/2015 04:49 PM, Gabriel L. Somlo wrote:
> Hi Ard,
>
> On Wed, Aug 19, 2015 at 11:42:02AM +0200, Ard Biesheuvel wrote:
>> (missed some cc's)
>>
>> On 19 August 2015 at 11:38, Ard Biesheuvel <[email protected]> wrote:
>>> From: "Gabriel L. Somlo" <[email protected]>
>>>> Several different architectures supported by QEMU are set up with a
>>>> "firmware configuration" (fw_cfg) device, used to pass configuration
>>>> "blobs" into the guest by the host running QEMU.
>>>>
>>>> Historically, these config blobs were mostly of interest to the guest
>>>> BIOS, but since QEMU v2.4 it is possible to insert arbitrary blobs via
>>>> the command line, which makes them potentially interesting to userspace
>>>> (e.g. for passing early boot environment variables, etc.).
>>>>
>>>
>>> Does 'potentially interesting' mean you have a use case? Could you elaborate?
>
> My personal one would be something like:
>
> cat > guestinfo.txt << EOT
> KEY1="val1"
> KEY2="val2"
> ...
> EOT
>
> qemu-system-x86_64 ... -fw-cfg name="opt/guestinfo",file=./guestinfo.txt ...
>
> Then, from inside the guest:
>
> . /sys/firmware/qemu_fw_cfg/by_name/opt/guestinfo/raw
>
> do_something_with $KEY1 $KEY2
> ...
>
> But I'm thinking this is only one of the many positive things one
> could do with the ability to access random host-supplied blobs from
> guest userspace :)

I do this with kernel parameters:

host:
qemu-system-aarch64 -append="KEY1=val1 KEY2=val2"

guest:
KEY1=`sed -nr s/.*KEY1=([^ ]+).*/\1/ /proc/cmdline`
KEY2=`sed -nr s/.*KEY2=([^ ]+).*/\1/ /proc/cmdline`

do_something_with $KEY1 $KEY2

In practice it's just script=hostfile, where hostfile is available to the
guest via a 9P passthrough filesystem mount.

While quite architecture specific, I've also previously used an
"angel-cmdline" tool for similar purposes. Peter's recent semihosting patches
support such a tool for AArch64. (On AArch32 upstream QEMU disallows
semihosting from userspace.)

Before I had 9P on all the simulators I regularly ran, I used a semihosting
based "angel-load" tool.

Regards,
Christopher Covington

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project