2020-01-10 16:05:31

by Masami Hiramatsu

Subject: [PATCH v6 00/22] tracing: bootconfig: Boot-time tracing and Extra boot config

Hello,

This is the 6th version of the series for the boot-time tracing.

Previous version is here.

https://lkml.kernel.org/r/157736902773.11126.2531161235817081873.stgit@devnote2

Thanks to Steve for the review. I fixed the following issues in this version.

- [1/22] Remove "!!" from xbc_node_is_value().
Redefine xbc_node_is_key() as "!xbc_node_is_value()".
Fix a memory leak and a bug in __xbc_parse_value().
Add xbc_destroy_all() to clean up the parsed data.
Fix to treat comment right after value as a newline.

- [3/22] Fix memory leaks.
Fix to cleanup old bootconfig on memory before load new one.
Show applying message.
Suppress parse error with wrong data in initrd for delete_xbc().

- [4/22] Add some testcases for value parser
Add a test case for checking delete old bootconfig

- [9/22] Add a note about comment after value.

- [21/22] Fix to depend on CONFIG_DYNAMIC_FTRACE instead
of CONFIG_FUNCTION_TRACER.

This series can be applied on v5.5-rc5 and is also directly available at:

https://github.com/mhiramat/linux.git ftrace-boottrace-v6


Extra Boot Config
=================

Extra boot config allows the admin to pass a tree-structured key-value
list when booting the kernel. This extends the kernel command
line in an efficient way.

Each key is written as dot-joined words, and keys can also be
written in a tree style. (In this version, the trailing ';'
is optional. See Documentation/admin-guide/bootconfig.rst.)

For example,

feature.option.foo = 1
feature.option.bar = 2

can also be written as

feature.option {
foo = 1
bar = 2
}

or, more compactly,

feature.option{foo=1;bar=2}

(Note that in both styles, identical words are merged automatically
into a single tree.)
All values are treated as a string or an array of strings, e.g.

feature.options = "foo", "bar"

The user can see the loaded key-value list via /proc/bootconfig.
The size is limited to 32KB, and the total number of key-words and
values is limited to 1024.
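
The 32KB limit appears to follow from how each tree node stores its
text offset. In include/linux/bootconfig.h from this series, the "data"
field is a u16 whose top bit (XBC_VALUE) marks a value node, leaving 15
bits for the offset. A minimal userspace sketch of that encoding is
below; pack_data(), data_is_value() and data_offset() are hypothetical
helper names used only for illustration, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the constants in include/linux/bootconfig.h. */
#define XBC_VALUE    (1 << 15)
#define XBC_DATA_MAX (XBC_VALUE - 1)

/* Pack a text offset and a key/value flag into one 16-bit field. */
static uint16_t pack_data(uint16_t offset, bool is_value)
{
	/* The offset must fit in the low 15 bits. */
	assert(offset <= XBC_DATA_MAX);
	return (uint16_t)(offset | (is_value ? XBC_VALUE : 0));
}

/* True if the packed field describes a value node. */
static bool data_is_value(uint16_t data)
{
	return (data & XBC_VALUE) != 0;
}

/* Recover the text offset by masking off the flag bit. */
static uint16_t data_offset(uint16_t data)
{
	return (uint16_t)(data & ~XBC_VALUE);
}
```

With this layout the largest representable offset is 32767, which
matches the "32KB - 1" comment next to XBC_DATA_MAX.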

Boot with a Boot Config
=======================

This version no longer requires modifying boot loaders.
The boot config is loaded along with the initrd, and there is a new
"bootconfig" command under tools/bootconfig.
To add (append) a bootconfig file to an initrd, use the
bootconfig command like:

# tools/bootconfig/bootconfig -a your-config /boot/initrd.img-X.Y.Z

This verifies the configuration file too.


Boot-time Tracing
=================

Boot-time tracing supports the following boot configs. Please read
Documentation/trace/boottime-trace.rst for details.

- kernel.dump_on_oops [= MODE]
- kernel.traceoff_on_warning
- kernel.tp_printk
- kernel.fgraph_filters = FILTER[, FILTER2...]
- kernel.fgraph_notraces = FILTER[, FILTER2...]
- kernel.fgraph_max_depth = MAX_DEPTH
- ftrace.[instance.INSTANCE.]options = OPT1[,OPT2...]
- ftrace.[instance.INSTANCE.]trace_clock = CLOCK
- ftrace.[instance.INSTANCE.]buffer_size = SIZE
- ftrace.[instance.INSTANCE.]alloc_snapshot
- ftrace.[instance.INSTANCE.]cpumask = CPUMASK
- ftrace.[instance.INSTANCE.]events = EVENT[, EVENT2...]
- ftrace.[instance.INSTANCE.]tracer = TRACER
- ftrace.[instance.INSTANCE.]ftrace.filters
- ftrace.[instance.INSTANCE.]ftrace.notraces
- ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
- ftrace.[instance.INSTANCE.]event.GROUP.EVENT.actions = ACTION[, ACTION2...]
- ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
- ftrace.[instance.INSTANCE.]event.kprobes.EVENT.probes = PROBE[, PROBE2...]
- ftrace.[instance.INSTANCE.]event.synthetic.EVENT.fields = FIELD[, FIELD2...]

Kernel and Init Command Line
============================

Boot config also supports kernel and init command line parameters
except for early kernel parameters.

In boot config, all key-values whose keys start with "kernel." are
automatically merged into the user-passed boot command line, and
key-values whose keys start with "init." are passed to init. These
options are visible in /proc/cmdline.

For example,

kernel {
audit = on
audit_backlog_limit = 256
}
init.systemd.unified_cgroup_hierarchy = 1
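
How one "kernel." or "init." entry might turn into a command-line word
can be sketched in userspace C. This is only an illustration under
assumptions: render_param() is a hypothetical helper, not a function
from this series, it takes an already flattened dotted key, and it
leaves out any quoting of values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if key starts with prefix (e.g. "kernel."),
 * write "rest=value" (or just "rest" for a key-only entry) into out.
 * Returns the number of characters written, 0 if the key does not
 * match the prefix, or -1 if out is too small.
 */
static int render_param(const char *prefix, const char *key,
			const char *value, char *out, size_t size)
{
	size_t plen = strlen(prefix);
	int ret;

	assert(size > 0);

	/* Skip keys outside the prefix, and the bare prefix itself. */
	if (strncmp(key, prefix, plen) != 0 || key[plen] == '\0')
		return 0;

	if (value[0] != '\0')
		ret = snprintf(out, size, "%s=%s", key + plen, value);
	else
		ret = snprintf(out, size, "%s", key + plen);

	/* snprintf() reports truncation by returning >= size. */
	if (ret < 0 || (size_t)ret >= size)
		return -1;

	return ret;
}
```

Under these assumptions, "kernel.audit" with value "on" renders as
"audit=on", and "init.systemd.unified_cgroup_hierarchy" with value "1"
renders as "systemd.unified_cgroup_hierarchy=1".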


Usage
=====

With this series, we can set up new kprobe and synthetic events, more
complicated event filters, and trigger actions including histograms
via the boot config.

We can add a filter and actions for each event, and define kprobe
events and synthetic events with histograms, like below.

ftrace.event {
task.task_newtask {
filter = "pid < 128"
enable
}
kprobes.vfs_read {
probes = "vfs_read $arg1 $arg2"
filter = "common_pid < 200"
enable
}
synthetic.initcall_latency {
fields = "unsigned long func", "u64 lat"
actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
}
initcall.initcall_start {
actions = "hist:keys=func:ts0=common_timestamp.usecs"
}
initcall.initcall_finish {
actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
}
}

This also supports an "instance" node, which allows us to run several
tracers for different purposes at once. For example, if one tracer
traces functions starting with "user_" and another traces functions
starting with "kernel_", you can write the boot config as:

ftrace.instance {
foo {
tracer = "function"
ftrace.filters = "user_*"
}
bar {
tracer = "function"
ftrace.filters = "kernel_*"
}
}

The instance node also accepts event nodes so that each instance
can customize its event tracing.

Boot-time tracing also covers the ftrace kernel parameters.
For example, the following kernel parameters

trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"

can be written in boot config like below.

ftrace {
options = sym-addr
events = "initcall:*"
buffer_size = 1MB
tracer = function
ftrace.filters = "vfs*"
}
kernel.tp_printk

However, since the initialization timing is different, if you need
to trace very early boot, please use normal kernel parameters.

Some Notes
==========

- To align with the legacy command line rules, quotes (double
quotes or single quotes) cannot be escaped.
Also, this rejects non-printable characters (except for space). The
legacy cmdline actually accepts any of them, but it might confuse
users if they put in a control code by mistake. Imagine that they
put a "\b" in it...

- Since it is not easy to write a bug-free boot-time tracing setup
in bootconfig, a user-helper command will be needed.
That command will generate a boot config file from the current ftrace
settings, or try to apply a given boot config to ftrace.
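
The character restriction in the first note can be sketched as a small
check. This is a userspace illustration only; value_chars_ok() is a
hypothetical name, and the exact character set the parser accepts is
defined by lib/bootconfig.c, not by this sketch.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical check: accept a value only if every byte is printable
 * or whitespace, so control codes such as '\b' are rejected.
 */
static bool value_chars_ok(const char *s)
{
	for (; *s != '\0'; s++) {
		/* Cast first: passing a negative char to isprint() is undefined. */
		unsigned char c = (unsigned char)*s;

		if (!isprint(c) && !isspace(c))
			return false;
	}
	return true;
}
```

For example, value_chars_ok("pid < 128") is true, while
value_chars_ok("foo\bbar") is false because of the backspace.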

Thank you,

---

Masami Hiramatsu (22):
bootconfig: Add Extra Boot Config support
bootconfig: Load boot config from the tail of initrd
tools: bootconfig: Add bootconfig command
tools: bootconfig: Add bootconfig test script
proc: bootconfig: Add /proc/bootconfig to show boot config list
init/main.c: Alloc initcall_command_line in do_initcall() and free it
bootconfig: init: Allow admin to use bootconfig for kernel command line
bootconfig: init: Allow admin to use bootconfig for init command line
Documentation: bootconfig: Add a doc for extended boot config
tracing: Apply soft-disabled and filter to tracepoints printk
tracing: kprobes: Output kprobe event to printk buffer
tracing: kprobes: Register to dynevent earlier stage
tracing: Accept different type for synthetic event fields
tracing: Add NULL trace-array check in print_synth_event()
tracing/boot: Add boot-time tracing
tracing/boot: Add per-event settings
tracing/boot Add kprobe event support
tracing/boot: Add synthetic event support
tracing/boot: Add instance node support
tracing/boot: Add cpu_mask option support
tracing/boot: Add function tracer filter options
Documentation: tracing: Add boot-time tracing document


Documentation/admin-guide/bootconfig.rst | 186 +++++
Documentation/admin-guide/index.rst | 1
Documentation/trace/boottime-trace.rst | 184 +++++
Documentation/trace/index.rst | 1
MAINTAINERS | 9
fs/proc/Makefile | 1
fs/proc/bootconfig.c | 89 ++
include/linux/bootconfig.h | 224 ++++++
include/linux/trace_events.h | 1
init/Kconfig | 12
init/main.c | 213 +++++
kernel/trace/Kconfig | 9
kernel/trace/Makefile | 1
kernel/trace/trace.c | 63 +-
kernel/trace/trace_boot.c | 353 +++++++++
kernel/trace/trace_events.c | 1
kernel/trace/trace_events_hist.c | 14
kernel/trace/trace_events_trigger.c | 2
kernel/trace/trace_kprobe.c | 81 +-
lib/Kconfig | 3
lib/Makefile | 2
lib/bootconfig.c | 803 ++++++++++++++++++++
tools/Makefile | 11
tools/bootconfig/.gitignore | 1
tools/bootconfig/Makefile | 23 +
tools/bootconfig/include/linux/bootconfig.h | 7
tools/bootconfig/include/linux/bug.h | 12
tools/bootconfig/include/linux/ctype.h | 7
tools/bootconfig/include/linux/errno.h | 7
tools/bootconfig/include/linux/kernel.h | 18
tools/bootconfig/include/linux/printk.h | 17
tools/bootconfig/include/linux/string.h | 32 +
tools/bootconfig/main.c | 354 +++++++++
.../samples/bad-array-space-comment.bconf | 5
tools/bootconfig/samples/bad-array.bconf | 2
tools/bootconfig/samples/bad-dotword.bconf | 4
tools/bootconfig/samples/bad-empty.bconf | 1
tools/bootconfig/samples/bad-keyerror.bconf | 2
tools/bootconfig/samples/bad-longkey.bconf | 1
tools/bootconfig/samples/bad-manywords.bconf | 1
tools/bootconfig/samples/bad-no-keyword.bconf | 2
tools/bootconfig/samples/bad-nonprintable.bconf | 2
tools/bootconfig/samples/bad-spaceword.bconf | 2
tools/bootconfig/samples/bad-tree.bconf | 5
tools/bootconfig/samples/bad-value.bconf | 3
tools/bootconfig/samples/escaped.bconf | 3
.../samples/good-array-space-comment.bconf | 4
.../samples/good-comment-after-value.bconf | 1
tools/bootconfig/samples/good-printables.bconf | 2
tools/bootconfig/samples/good-simple.bconf | 11
tools/bootconfig/samples/good-single.bconf | 4
.../samples/good-space-after-value.bconf | 1
tools/bootconfig/samples/good-tree.bconf | 12
tools/bootconfig/test-bootconfig.sh | 105 +++
54 files changed, 2836 insertions(+), 79 deletions(-)
create mode 100644 Documentation/admin-guide/bootconfig.rst
create mode 100644 Documentation/trace/boottime-trace.rst
create mode 100644 fs/proc/bootconfig.c
create mode 100644 include/linux/bootconfig.h
create mode 100644 kernel/trace/trace_boot.c
create mode 100644 lib/bootconfig.c
create mode 100644 tools/bootconfig/.gitignore
create mode 100644 tools/bootconfig/Makefile
create mode 100644 tools/bootconfig/include/linux/bootconfig.h
create mode 100644 tools/bootconfig/include/linux/bug.h
create mode 100644 tools/bootconfig/include/linux/ctype.h
create mode 100644 tools/bootconfig/include/linux/errno.h
create mode 100644 tools/bootconfig/include/linux/kernel.h
create mode 100644 tools/bootconfig/include/linux/printk.h
create mode 100644 tools/bootconfig/include/linux/string.h
create mode 100644 tools/bootconfig/main.c
create mode 100644 tools/bootconfig/samples/bad-array-space-comment.bconf
create mode 100644 tools/bootconfig/samples/bad-array.bconf
create mode 100644 tools/bootconfig/samples/bad-dotword.bconf
create mode 100644 tools/bootconfig/samples/bad-empty.bconf
create mode 100644 tools/bootconfig/samples/bad-keyerror.bconf
create mode 100644 tools/bootconfig/samples/bad-longkey.bconf
create mode 100644 tools/bootconfig/samples/bad-manywords.bconf
create mode 100644 tools/bootconfig/samples/bad-no-keyword.bconf
create mode 100644 tools/bootconfig/samples/bad-nonprintable.bconf
create mode 100644 tools/bootconfig/samples/bad-spaceword.bconf
create mode 100644 tools/bootconfig/samples/bad-tree.bconf
create mode 100644 tools/bootconfig/samples/bad-value.bconf
create mode 100644 tools/bootconfig/samples/escaped.bconf
create mode 100644 tools/bootconfig/samples/good-array-space-comment.bconf
create mode 100644 tools/bootconfig/samples/good-comment-after-value.bconf
create mode 100644 tools/bootconfig/samples/good-printables.bconf
create mode 100644 tools/bootconfig/samples/good-simple.bconf
create mode 100644 tools/bootconfig/samples/good-single.bconf
create mode 100644 tools/bootconfig/samples/good-space-after-value.bconf
create mode 100644 tools/bootconfig/samples/good-tree.bconf
create mode 100755 tools/bootconfig/test-bootconfig.sh

--
Masami Hiramatsu (Linaro) <[email protected]>


2020-01-10 16:05:55

by Masami Hiramatsu

Subject: [PATCH v6 02/22] bootconfig: Load boot config from the tail of initrd

Load the extended boot config data from the tail of the initrd
image. If boot config data is present, it has a
[(u32)size][(u32)checksum] header (in reality, this is a
footer) at the end of the initrd. If the checksum (a simple sum
of bytes) matches, parsing starts from there.
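
The footer layout described above can be tried out in userspace. The
sketch below assumes the layout [image][config text][u32 size][u32
checksum] in native byte order, as read by setup_boot_config() in this
patch. append_bconf() and find_bconf() are hypothetical names for
illustration, and whether the real tool counts a trailing NUL in the
size is not stated here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XBC_DATA_MAX ((1 << 15) - 1)

/* Byte-wise sum, same idea as boot_config_checksum() in this patch. */
static uint32_t bconf_checksum(const unsigned char *p, uint32_t size)
{
	uint32_t sum = 0;

	while (size--)
		sum += *p++;
	return sum;
}

/*
 * Append conf plus the 8-byte footer after img_len bytes of img.
 * The caller must provide strlen(conf) + 8 bytes of spare room.
 * Returns the new image length.
 */
static size_t append_bconf(unsigned char *img, size_t img_len,
			   const char *conf)
{
	uint32_t size = (uint32_t)strlen(conf);
	uint32_t csum = bconf_checksum((const unsigned char *)conf, size);

	memcpy(img + img_len, conf, size);
	memcpy(img + img_len + size, &size, sizeof(size));
	memcpy(img + img_len + size + 4, &csum, sizeof(csum));
	return img_len + size + 8;
}

/*
 * Find the config text at the tail of img. Returns a pointer to it and
 * stores its length in *len, or returns NULL if the footer is invalid.
 */
static const unsigned char *find_bconf(const unsigned char *img,
				       size_t img_len, uint32_t *len)
{
	const unsigned char *data;
	uint32_t size, csum;

	assert(len != NULL);
	if (img_len < 8)
		return NULL;

	memcpy(&size, img + img_len - 8, sizeof(size));
	memcpy(&csum, img + img_len - 4, sizeof(csum));

	/* Reject oversized data and data that would start before the image. */
	if (size >= XBC_DATA_MAX || size > img_len - 8)
		return NULL;

	data = img + img_len - 8 - size;
	if (bconf_checksum(data, size) != csum)
		return NULL;

	*len = size;
	return data;
}
```

Because the checksum is a plain sum of bytes, it only guards against
an initrd that has no boot config appended or a truncated one; it is
not an integrity check in any stronger sense.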

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v4:
- Rename skc to bootconfig.
---
init/Kconfig | 1 +
init/main.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 55 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 63450d3bbf12..ffd240fb88c3 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1217,6 +1217,7 @@ endif

config BOOT_CONFIG
bool "Boot config support"
+ depends on BLK_DEV_INITRD
select LIBXBC
default y
help
diff --git a/init/main.c b/init/main.c
index 2cd736059416..59c418a57f92 100644
--- a/init/main.c
+++ b/init/main.c
@@ -28,6 +28,7 @@
#include <linux/initrd.h>
#include <linux/memblock.h>
#include <linux/acpi.h>
+#include <linux/bootconfig.h>
#include <linux/console.h>
#include <linux/nmi.h>
#include <linux/percpu.h>
@@ -245,6 +246,58 @@ static int __init loglevel(char *str)

early_param("loglevel", loglevel);

+#ifdef CONFIG_BOOT_CONFIG
+u32 boot_config_checksum(unsigned char *p, u32 size)
+{
+ u32 ret = 0;
+
+ while (size--)
+ ret += *p++;
+
+ return ret;
+}
+
+static void __init setup_boot_config(void)
+{
+ u32 size, csum;
+ char *data, *copy;
+ u32 *hdr;
+
+ if (!initrd_end)
+ return;
+
+ hdr = (u32 *)(initrd_end - 8);
+ size = hdr[0];
+ csum = hdr[1];
+
+ if (size >= XBC_DATA_MAX)
+ return;
+
+ data = ((void *)hdr) - size;
+ if ((unsigned long)data < initrd_start)
+ return;
+
+ if (boot_config_checksum((unsigned char *)data, size) != csum)
+ return;
+
+ copy = memblock_alloc(size + 1, SMP_CACHE_BYTES);
+ if (!copy) {
+ pr_err("Failed to allocate memory for boot config\n");
+ return;
+ }
+
+ memcpy(copy, data, size);
+ copy[size] = '\0';
+
+ if (xbc_init(copy) < 0)
+ pr_err("Failed to parse boot config\n");
+ else
+ pr_info("Load boot config: %d bytes\n", size);
+}
+#else
+#define setup_boot_config() do { } while (0)
+#endif
+
/* Change NUL term back to "=", to make "param" the whole string. */
static int __init repair_env_string(char *param, char *val,
const char *unused, void *arg)
@@ -595,6 +648,7 @@ asmlinkage __visible void __init start_kernel(void)
pr_notice("%s", linux_banner);
early_security_init();
setup_arch(&command_line);
+ setup_boot_config();
setup_command_line(command_line);
setup_nr_cpu_ids();
setup_per_cpu_areas();

2020-01-10 16:05:59

by Masami Hiramatsu

Subject: [PATCH v6 01/22] bootconfig: Add Extra Boot Config support

Extra Boot Config (XBC) allows the admin to pass a tree-structured
boot configuration file when booting the kernel. This extends
the kernel command line in an efficient way.

Boot config will contain some key-value commands, e.g.

key.word = value1
another.key.word = value2

It can fold the same keys with braces, and you can also write array
data. For example,

key {
word1 {
setting1 = data
setting2
}
word2.array = "val1", "val2"
}

Users can access these key-value pairs and the tree structure via
the XBC APIs.
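
To make the tree structure concrete, here is a userspace sketch of the
example above as an index-linked node array, with a dotted-key lookup
modelled on xbc_node_find_child() and xbc_node_find_value() from this
patch. It is a simplified illustration, not the kernel code: struct
node, find_key() and find_value() are hypothetical names, and each node
holds a string pointer instead of an offset into the config text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct node {
	uint16_t next;   /* index of next sibling, 0 if none */
	uint16_t child;  /* index of first child, 0 if none */
	bool is_value;   /* value node or key node */
	const char *word;
};

/* key { word1 { setting1 = data; setting2 } word2.array = "val1", "val2" } */
static const struct node nodes[] = {
	[0] = { 0, 1, false, "key" },
	[1] = { 5, 2, false, "word1" },
	[2] = { 4, 3, false, "setting1" },
	[3] = { 0, 0, true,  "data" },
	[4] = { 0, 0, false, "setting2" },
	[5] = { 0, 6, false, "word2" },
	[6] = { 0, 7, false, "array" },
	[7] = { 8, 0, true,  "val1" },
	[8] = { 0, 0, true,  "val2" },
};

static const struct node *get_child(const struct node *n)
{
	return n->child ? &nodes[n->child] : NULL;
}

static const struct node *get_next(const struct node *n)
{
	return n->next ? &nodes[n->next] : NULL;
}

/* Consume one leading word of *key if it equals the node's word. */
static bool match_prefix(const struct node *n, const char **key)
{
	size_t len = strlen(n->word);
	const char *p;

	if (strncmp(*key, n->word, len) != 0)
		return false;
	p = *key + len;
	if (*p == '.')
		p++;
	else if (*p != '\0')
		return false;
	*key = p;
	return true;
}

/* Walk siblings and children until the whole dotted key is consumed. */
static const struct node *find_key(const char *key)
{
	const struct node *n = &nodes[0];

	assert(key != NULL);
	while (n && !n->is_value) {
		if (!match_prefix(n, &key))
			n = get_next(n);
		else if (*key != '\0')
			n = get_child(n);
		else
			break;
	}
	return (n && !n->is_value) ? n : NULL;
}

/* First value of key, "" for a key-only entry, NULL if not a leaf key. */
static const char *find_value(const char *key)
{
	const struct node *n = find_key(key);

	if (!n)
		return NULL;
	n = get_child(n);
	if (n && !n->is_value)
		return NULL;
	return n ? n->word : "";
}
```

Index 0 doubles as "no link" because the root can never be anyone's
child or next sibling, which mirrors the checks in
xbc_node_get_child() and xbc_node_get_next().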

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v6:
- Remove "!!" from xbc_node_is_value().
- Redefine xbc_node_is_key() as "!xbc_node_is_value()".
- Fix a memory leak and a bug in __xbc_parse_value() (Thanks Steve!).
- Add xbc_destroy_all() to clean up the parsed data.
- Fix to treat comment right after value as a newline.
Changes in v5:
- Fix help comment and indent (Thanks Randy!)
- Restrict available characters in values (only printables and spaces.)
- Drop "escape" backslash support.
Changes in v4:
- Rename supplemental kernel command line to extended boot
config so that it is easy to understand what it is.
- Clean up given data if failed to parse it.
- Add comment support (start with #)
- Return -ENOENT if given data has no node.
- Ensure the key max depth and keylen are under limitation.
- Add single quotes support.
- Allow newline and closing brace to terminate key-value.
- Add xbc_node_compose_key_after().
- Move kconfig to generic setup.
- Expand the max number of node to 1024.
---
MAINTAINERS | 6
include/linux/bootconfig.h | 224 ++++++++++++
init/Kconfig | 11 +
lib/Kconfig | 3
lib/Makefile | 2
lib/bootconfig.c | 803 ++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 1049 insertions(+)
create mode 100644 include/linux/bootconfig.h
create mode 100644 lib/bootconfig.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 8982c6e013b3..1ef065234cff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15773,6 +15773,12 @@ W: http://www.stlinux.com
S: Supported
F: drivers/net/ethernet/stmicro/stmmac/

+EXTRA BOOT CONFIG
+M: Masami Hiramatsu <[email protected]>
+S: Maintained
+F: lib/bootconfig.c
+F: include/linux/bootconfig.h
+
SUN3/3X
M: Sam Creasey <[email protected]>
W: http://sammy.net/sun3/
diff --git a/include/linux/bootconfig.h b/include/linux/bootconfig.h
new file mode 100644
index 000000000000..7e18c939663e
--- /dev/null
+++ b/include/linux/bootconfig.h
@@ -0,0 +1,224 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_XBC_H
+#define _LINUX_XBC_H
+/*
+ * Extra Boot Config
+ * Copyright (C) 2019 Linaro Ltd.
+ * Author: Masami Hiramatsu <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+
+/* XBC tree node */
+struct xbc_node {
+ u16 next;
+ u16 child;
+ u16 parent;
+ u16 data;
+} __attribute__ ((__packed__));
+
+#define XBC_KEY 0
+#define XBC_VALUE (1 << 15)
+/* Maximum size of boot config is 32KB - 1 */
+#define XBC_DATA_MAX (XBC_VALUE - 1)
+
+#define XBC_NODE_MAX 1024
+#define XBC_KEYLEN_MAX 256
+#define XBC_DEPTH_MAX 16
+
+/* Node tree access raw APIs */
+struct xbc_node * __init xbc_root_node(void);
+int __init xbc_node_index(struct xbc_node *node);
+struct xbc_node * __init xbc_node_get_parent(struct xbc_node *node);
+struct xbc_node * __init xbc_node_get_child(struct xbc_node *node);
+struct xbc_node * __init xbc_node_get_next(struct xbc_node *node);
+const char * __init xbc_node_get_data(struct xbc_node *node);
+
+/**
+ * xbc_node_is_value() - Test the node is a value node
+ * @node: An XBC node.
+ *
+ * Test the @node is a value node and return true if a value node, false if not.
+ */
+static inline __init bool xbc_node_is_value(struct xbc_node *node)
+{
+ return node->data & XBC_VALUE;
+}
+
+/**
+ * xbc_node_is_key() - Test the node is a key node
+ * @node: An XBC node.
+ *
+ * Test the @node is a key node and return true if a key node, false if not.
+ */
+static inline __init bool xbc_node_is_key(struct xbc_node *node)
+{
+ return !xbc_node_is_value(node);
+}
+
+/**
+ * xbc_node_is_array() - Test the node is an array value node
+ * @node: An XBC node.
+ *
+ * Test the @node is an array value node.
+ */
+static inline __init bool xbc_node_is_array(struct xbc_node *node)
+{
+ return xbc_node_is_value(node) && node->next != 0;
+}
+
+/**
+ * xbc_node_is_leaf() - Test the node is a leaf key node
+ * @node: An XBC node.
+ *
+ * Test the @node is a leaf key node which is a key node and has a value node
+ * or no child. Returns true if it is a leaf node, or false if not.
+ */
+static inline __init bool xbc_node_is_leaf(struct xbc_node *node)
+{
+ return xbc_node_is_key(node) &&
+ (!node->child || xbc_node_is_value(xbc_node_get_child(node)));
+}
+
+/* Tree-based key-value access APIs */
+struct xbc_node * __init xbc_node_find_child(struct xbc_node *parent,
+ const char *key);
+
+const char * __init xbc_node_find_value(struct xbc_node *parent,
+ const char *key,
+ struct xbc_node **vnode);
+
+struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
+ struct xbc_node *leaf);
+
+const char * __init xbc_node_find_next_key_value(struct xbc_node *root,
+ struct xbc_node **leaf);
+
+/**
+ * xbc_find_value() - Find a value which matches the key
+ * @key: Search key
+ * @vnode: A container pointer of XBC value node.
+ *
+ * Search a value whose key matches @key from whole of XBC tree and return
+ * the value if found. Found value node is stored in *@vnode.
+ * Note that this can return 0-length string and store NULL in *@vnode for
+ * key-only (non-value) entry.
+ */
+static inline const char * __init
+xbc_find_value(const char *key, struct xbc_node **vnode)
+{
+ return xbc_node_find_value(NULL, key, vnode);
+}
+
+/**
+ * xbc_find_node() - Find a node which matches the key
+ * @key: Search key
+ *
+ * Search a (key) node whose key matches @key from whole of XBC tree and
+ * return the node if found. If not found, returns NULL.
+ */
+static inline struct xbc_node * __init xbc_find_node(const char *key)
+{
+ return xbc_node_find_child(NULL, key);
+}
+
+/**
+ * xbc_array_for_each_value() - Iterate value nodes on an array
+ * @anode: An XBC array value node
+ * @value: A value
+ *
+ * Iterate array value nodes and values starts from @anode. This is expected to
+ * be used with xbc_find_value() and xbc_node_find_value(), so that user can
+ * process each array entry node.
+ */
+#define xbc_array_for_each_value(anode, value) \
+ for (value = xbc_node_get_data(anode); anode != NULL ; \
+ anode = xbc_node_get_next(anode), \
+ value = anode ? xbc_node_get_data(anode) : NULL)
+
+/**
+ * xbc_node_for_each_child() - Iterate child nodes
+ * @parent: An XBC node.
+ * @child: Iterated XBC node.
+ *
+ * Iterate child nodes of @parent. Each child nodes are stored to @child.
+ */
+#define xbc_node_for_each_child(parent, child) \
+ for (child = xbc_node_get_child(parent); child != NULL ; \
+ child = xbc_node_get_next(child))
+
+/**
+ * xbc_node_for_each_array_value() - Iterate array entries of given key
+ * @node: An XBC node.
+ * @key: A key string searched under @node
+ * @anode: Iterated XBC node of array entry.
+ * @value: Iterated value of array entry.
+ *
+ * Iterate array entries of given @key under @node. Each array entry node
+ * is stored to @anode and @value. If the @node doesn't have @key node,
+ * it does nothing.
+ * Note that even if the found key node has only one value (not array)
+ * this executes block once. However, if the found key node has no value
+ * (key-only node), this does nothing. So don't use this for testing the
+ * key-value pair existence.
+ */
+#define xbc_node_for_each_array_value(node, key, anode, value) \
+ for (value = xbc_node_find_value(node, key, &anode); value != NULL; \
+ anode = xbc_node_get_next(anode), \
+ value = anode ? xbc_node_get_data(anode) : NULL)
+
+/**
+ * xbc_node_for_each_key_value() - Iterate key-value pairs under a node
+ * @node: An XBC node.
+ * @knode: Iterated key node
+ * @value: Iterated value string
+ *
+ * Iterate key-value pairs under @node. Each key node and value string are
+ * stored in @knode and @value respectively.
+ */
+#define xbc_node_for_each_key_value(node, knode, value) \
+ for (knode = NULL, value = xbc_node_find_next_key_value(node, &knode);\
+ knode != NULL; value = xbc_node_find_next_key_value(node, &knode))
+
+/**
+ * xbc_for_each_key_value() - Iterate key-value pairs
+ * @knode: Iterated key node
+ * @value: Iterated value string
+ *
+ * Iterate key-value pairs in whole XBC tree. Each key node and value string
+ * are stored in @knode and @value respectively.
+ */
+#define xbc_for_each_key_value(knode, value) \
+ xbc_node_for_each_key_value(NULL, knode, value)
+
+/* Compose partial key */
+int __init xbc_node_compose_key_after(struct xbc_node *root,
+ struct xbc_node *node, char *buf, size_t size);
+
+/**
+ * xbc_node_compose_key() - Compose full key string of the XBC node
+ * @node: An XBC node.
+ * @buf: A buffer to store the key.
+ * @size: The size of the @buf.
+ *
+ * Compose the full-length key of the @node into @buf. Returns the total
+ * length of the key stored in @buf. Or returns -EINVAL if @node is NULL,
+ * and -ERANGE if the key depth is deeper than max depth.
+ */
+static inline int __init xbc_node_compose_key(struct xbc_node *node,
+ char *buf, size_t size)
+{
+ return xbc_node_compose_key_after(NULL, node, buf, size);
+}
+
+/* XBC node initializer */
+int __init xbc_init(char *buf);
+
+/* XBC cleanup data structures */
+void __init xbc_destroy_all(void);
+
+/* Debug dump functions */
+void __init xbc_debug_dump(void);
+
+#endif
diff --git a/init/Kconfig b/init/Kconfig
index a34064a031a5..63450d3bbf12 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1215,6 +1215,17 @@ source "usr/Kconfig"

endif

+config BOOT_CONFIG
+ bool "Boot config support"
+ select LIBXBC
+ default y
+ help
+ Extra boot config allows system admin to pass a config file as
+ complemental extension of kernel cmdline when booting.
+ The boot config file is usually attached at the end of initramfs.
+
+ If unsure, say Y.
+
choice
prompt "Compiler optimization level"
default CC_OPTIMIZE_FOR_PERFORMANCE
diff --git a/lib/Kconfig b/lib/Kconfig
index 6e790dc55c5b..10012b646009 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -566,6 +566,9 @@ config DIMLIB
config LIBFDT
bool

+config LIBXBC
+ bool
+
config OID_REGISTRY
tristate
help
diff --git a/lib/Makefile b/lib/Makefile
index 93217d44237f..75a64d2552a2 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -228,6 +228,8 @@ $(foreach file, $(libfdt_files), \
$(eval CFLAGS_$(file) = -I $(srctree)/scripts/dtc/libfdt))
lib-$(CONFIG_LIBFDT) += $(libfdt_files)

+lib-$(CONFIG_LIBXBC) += bootconfig.o
+
obj-$(CONFIG_RBTREE_TEST) += rbtree_test.o
obj-$(CONFIG_INTERVAL_TREE_TEST) += interval_tree_test.o

diff --git a/lib/bootconfig.c b/lib/bootconfig.c
new file mode 100644
index 000000000000..055014e233a5
--- /dev/null
+++ b/lib/bootconfig.c
@@ -0,0 +1,803 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Extra Boot Config
+ * Masami Hiramatsu <[email protected]>
+ */
+
+#define pr_fmt(fmt) "bootconfig: " fmt
+
+#include <linux/bug.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/printk.h>
+#include <linux/bootconfig.h>
+#include <linux/string.h>
+
+/*
+ * Extra Boot Config (XBC) is given as tree-structured ascii text of
+ * key-value pairs on memory.
+ * xbc_parse() parses the text to build a simple tree. Each tree node is
+ * simply a key word or a value. A key node may have a next key node or/and
+ * a child node (both key and value). A value node may have a next value
+ * node (for array).
+ */
+
+static struct xbc_node xbc_nodes[XBC_NODE_MAX] __initdata;
+static int xbc_node_num __initdata;
+static char *xbc_data __initdata;
+static size_t xbc_data_size __initdata;
+static struct xbc_node *last_parent __initdata;
+
+static int __init xbc_parse_error(const char *msg, const char *p)
+{
+ int pos = p - xbc_data;
+
+ pr_err("Parse error at pos %d: %s\n", pos, msg);
+ return -EINVAL;
+}
+
+/**
+ * xbc_root_node() - Get the root node of extended boot config
+ *
+ * Return the address of root node of extended boot config. If the
+ * extended boot config is not initialized, return NULL.
+ */
+struct xbc_node * __init xbc_root_node(void)
+{
+ if (unlikely(!xbc_data))
+ return NULL;
+
+ return xbc_nodes;
+}
+
+/**
+ * xbc_node_index() - Get the index of XBC node
+ * @node: A target node of getting index.
+ *
+ * Return the index number of @node in XBC node list.
+ */
+int __init xbc_node_index(struct xbc_node *node)
+{
+ return node - &xbc_nodes[0];
+}
+
+/**
+ * xbc_node_get_parent() - Get the parent XBC node
+ * @node: An XBC node.
+ *
+ * Return the parent node of @node. If the node is top node of the tree,
+ * return NULL.
+ */
+struct xbc_node * __init xbc_node_get_parent(struct xbc_node *node)
+{
+ return node->parent == XBC_NODE_MAX ? NULL : &xbc_nodes[node->parent];
+}
+
+/**
+ * xbc_node_get_child() - Get the child XBC node
+ * @node: An XBC node.
+ *
+ * Return the first child node of @node. If the node has no child, return
+ * NULL.
+ */
+struct xbc_node * __init xbc_node_get_child(struct xbc_node *node)
+{
+ return node->child ? &xbc_nodes[node->child] : NULL;
+}
+
+/**
+ * xbc_node_get_next() - Get the next sibling XBC node
+ * @node: An XBC node.
+ *
+ * Return the NEXT sibling node of @node. If the node has no next sibling,
+ * return NULL. Note that even if this returns NULL, it doesn't mean @node
+ * has no siblings. (You also have to check whether the parent's child node
+ * is @node or not.)
+ */
+struct xbc_node * __init xbc_node_get_next(struct xbc_node *node)
+{
+ return node->next ? &xbc_nodes[node->next] : NULL;
+}
+
+/**
+ * xbc_node_get_data() - Get the data of XBC node
+ * @node: An XBC node.
+ *
+ * Return the data (which is always a null terminated string) of @node.
+ * If the node has invalid data, warn and return NULL.
+ */
+const char * __init xbc_node_get_data(struct xbc_node *node)
+{
+ int offset = node->data & ~XBC_VALUE;
+
+ if (WARN_ON(offset >= xbc_data_size))
+ return NULL;
+
+ return xbc_data + offset;
+}
+
+static bool __init
+xbc_node_match_prefix(struct xbc_node *node, const char **prefix)
+{
+ const char *p = xbc_node_get_data(node);
+ int len = strlen(p);
+
+ if (strncmp(*prefix, p, len))
+ return false;
+
+ p = *prefix + len;
+ if (*p == '.')
+ p++;
+ else if (*p != '\0')
+ return false;
+ *prefix = p;
+
+ return true;
+}
+
+/**
+ * xbc_node_find_child() - Find a child node which matches given key
+ * @parent: An XBC node.
+ * @key: A key string.
+ *
+ * Search a node under @parent which matches @key. The @key can contain
+ * several words jointed with '.'. If @parent is NULL, this searches the
+ * node from whole tree. Return NULL if no node is matched.
+ */
+struct xbc_node * __init
+xbc_node_find_child(struct xbc_node *parent, const char *key)
+{
+ struct xbc_node *node;
+
+ if (parent)
+ node = xbc_node_get_child(parent);
+ else
+ node = xbc_root_node();
+
+ while (node && xbc_node_is_key(node)) {
+ if (!xbc_node_match_prefix(node, &key))
+ node = xbc_node_get_next(node);
+ else if (*key != '\0')
+ node = xbc_node_get_child(node);
+ else
+ break;
+ }
+
+ return node;
+}
+
+/**
+ * xbc_node_find_value() - Find a value node which matches given key
+ * @parent: An XBC node.
+ * @key: A key string.
+ * @vnode: A container pointer of found XBC node.
+ *
+ * Search a value node under @parent whose (parent) key node matches @key,
+ * store it in *@vnode, and returns the value string.
+ * The @key can contain several words jointed with '.'. If @parent is NULL,
+ * this searches the node from whole tree. Return the value string if a
+ * matched key found, return NULL if no node is matched.
+ * Note that this returns 0-length string and stores NULL in *@vnode if the
+ * key has no value. And also it will return the value of the first entry if
+ * the value is an array.
+ */
+const char * __init
+xbc_node_find_value(struct xbc_node *parent, const char *key,
+ struct xbc_node **vnode)
+{
+ struct xbc_node *node = xbc_node_find_child(parent, key);
+
+ if (!node || !xbc_node_is_key(node))
+ return NULL;
+
+ node = xbc_node_get_child(node);
+ if (node && !xbc_node_is_value(node))
+ return NULL;
+
+ if (vnode)
+ *vnode = node;
+
+ return node ? xbc_node_get_data(node) : "";
+}
+
+/**
+ * xbc_node_compose_key_after() - Compose partial key string of the XBC node
+ * @root: Root XBC node
+ * @node: Target XBC node.
+ * @buf: A buffer to store the key.
+ * @size: The size of the @buf.
+ *
+ * Compose the partial key of the @node into @buf, which is starting right
+ * after @root (@root is not included.) If @root is NULL, this returns full
+ * key words of @node.
+ * Returns the total length of the key stored in @buf. Returns -EINVAL
+ * if @node is NULL or @root is not the ancestor of @node or @root is @node,
+ * or returns -ERANGE if the key depth is deeper than max depth.
+ * This is expected to be used with xbc_find_node() to list up all (child)
+ * keys under given key.
+ */
+int __init xbc_node_compose_key_after(struct xbc_node *root,
+ struct xbc_node *node,
+ char *buf, size_t size)
+{
+ u16 keys[XBC_DEPTH_MAX];
+ int depth = 0, ret = 0, total = 0;
+
+ if (!node || node == root)
+ return -EINVAL;
+
+ if (xbc_node_is_value(node))
+ node = xbc_node_get_parent(node);
+
+ while (node && node != root) {
+ keys[depth++] = xbc_node_index(node);
+ if (depth == XBC_DEPTH_MAX)
+ return -ERANGE;
+ node = xbc_node_get_parent(node);
+ }
+ if (!node && root)
+ return -EINVAL;
+
+ while (--depth >= 0) {
+ node = xbc_nodes + keys[depth];
+ ret = snprintf(buf, size, "%s%s", xbc_node_get_data(node),
+ depth ? "." : "");
+ if (ret < 0)
+ return ret;
+ if (ret > size) {
+ size = 0;
+ } else {
+ size -= ret;
+ buf += ret;
+ }
+ total += ret;
+ }
+
+ return total;
+}
+
+/**
+ * xbc_node_find_next_leaf() - Find the next leaf node under given node
+ * @root: An XBC root node
+ * @node: An XBC node to start the search from.
+ *
+ * Search the next leaf node (which means the terminal key node) of @node
+ * under @root node (including @root node itself).
+ * Return the next node or NULL if next leaf node is not found.
+ */
+struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
+ struct xbc_node *node)
+{
+ if (unlikely(!xbc_data))
+ return NULL;
+
+ if (!node) { /* First try */
+ node = root;
+ if (!node)
+ node = xbc_nodes;
+ } else {
+ if (node == root) /* @root was a leaf, no child node. */
+ return NULL;
+
+ while (!node->next) {
+ node = xbc_node_get_parent(node);
+ if (node == root)
+ return NULL;
+ /* User passed a node which is not under parent */
+ if (WARN_ON(!node))
+ return NULL;
+ }
+ node = xbc_node_get_next(node);
+ }
+
+ while (node && !xbc_node_is_leaf(node))
+ node = xbc_node_get_child(node);
+
+ return node;
+}
+
+/**
+ * xbc_node_find_next_key_value() - Find the next key-value pair nodes
+ * @root: An XBC root node
+ * @leaf: A container pointer of the XBC node to start the search from.
+ *
+ * Search the next leaf node (which means the terminal key node) of *@leaf
+ * under @root node. Return the value and update *@leaf if the next leaf
+ * node is found, or return NULL if no next leaf node is found.
+ * Note that this returns a 0-length string if the key has no value, or
+ * the value of the first entry if the value is an array.
+ */
+const char * __init xbc_node_find_next_key_value(struct xbc_node *root,
+ struct xbc_node **leaf)
+{
+ /* @leaf must be passed */
+ if (WARN_ON(!leaf))
+ return NULL;
+
+ *leaf = xbc_node_find_next_leaf(root, *leaf);
+ if (!*leaf)
+ return NULL;
+ if ((*leaf)->child)
+ return xbc_node_get_data(xbc_node_get_child(*leaf));
+ else
+ return ""; /* No value key */
+}
+
+/* XBC parse and tree build */
+
+static struct xbc_node * __init xbc_add_node(char *data, u32 flag)
+{
+ struct xbc_node *node;
+ unsigned long offset;
+
+ if (xbc_node_num == XBC_NODE_MAX)
+ return NULL;
+
+ node = &xbc_nodes[xbc_node_num++];
+ offset = data - xbc_data;
+ node->data = (u16)offset;
+ if (WARN_ON(offset >= XBC_DATA_MAX))
+ return NULL;
+ node->data |= flag;
+ node->child = 0;
+ node->next = 0;
+
+ return node;
+}
+
+static inline __init struct xbc_node *xbc_last_sibling(struct xbc_node *node)
+{
+ while (node->next)
+ node = xbc_node_get_next(node);
+
+ return node;
+}
+
+static struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
+{
+ struct xbc_node *sib, *node = xbc_add_node(data, flag);
+
+ if (node) {
+ if (!last_parent) {
+ node->parent = XBC_NODE_MAX;
+ sib = xbc_last_sibling(xbc_nodes);
+ sib->next = xbc_node_index(node);
+ } else {
+ node->parent = xbc_node_index(last_parent);
+ if (!last_parent->child) {
+ last_parent->child = xbc_node_index(node);
+ } else {
+ sib = xbc_node_get_child(last_parent);
+ sib = xbc_last_sibling(sib);
+ sib->next = xbc_node_index(node);
+ }
+ }
+ }
+
+ return node;
+}
+
+static inline __init struct xbc_node *xbc_add_child(char *data, u32 flag)
+{
+ struct xbc_node *node = xbc_add_sibling(data, flag);
+
+ if (node)
+ last_parent = node;
+
+ return node;
+}
+
+static inline __init bool xbc_valid_keyword(char *key)
+{
+ if (key[0] == '\0')
+ return false;
+
+ while (isalnum(*key) || *key == '-' || *key == '_')
+ key++;
+
+ return *key == '\0';
+}
+
+static char *skip_comment(char *p)
+{
+ char *ret;
+
+ ret = strchr(p, '\n');
+ if (!ret)
+ ret = p + strlen(p);
+ else
+ ret++;
+
+ return ret;
+}
+
+static char *skip_spaces_until_newline(char *p)
+{
+ while (isspace(*p) && *p != '\n')
+ p++;
+ return p;
+}
+
+static int __init __xbc_open_brace(void)
+{
+ /* Mark the last key as open brace */
+ last_parent->next = XBC_NODE_MAX;
+
+ return 0;
+}
+
+static int __init __xbc_close_brace(char *p)
+{
+ struct xbc_node *node;
+
+ if (!last_parent || last_parent->next != XBC_NODE_MAX)
+ return xbc_parse_error("Unexpected closing brace", p);
+
+ node = last_parent;
+ node->next = 0;
+ do {
+ node = xbc_node_get_parent(node);
+ } while (node && node->next != XBC_NODE_MAX);
+ last_parent = node;
+
+ return 0;
+}
+
+/*
+ * Return delimiter or error, no node added. As in lib/cmdline.c,
+ * you can use " around spaces, but can't escape " for value.
+ */
+static int __init __xbc_parse_value(char **__v, char **__n)
+{
+ char *p, *v = *__v;
+ int c, quotes = 0;
+
+ v = skip_spaces(v);
+ while (*v == '#') {
+ v = skip_comment(v);
+ v = skip_spaces(v);
+ }
+ if (*v == '"' || *v == '\'') {
+ quotes = *v;
+ v++;
+ }
+ p = v - 1;
+ while ((c = *++p)) {
+ if (!isprint(c) && !isspace(c))
+ return xbc_parse_error("Non printable value", p);
+ if (quotes) {
+ if (c != quotes)
+ continue;
+ quotes = 0;
+ *p++ = '\0';
+ p = skip_spaces_until_newline(p);
+ c = *p;
+ if (c && !strchr(",;\n#}", c))
+ return xbc_parse_error("No value delimiter", p);
+ if (*p)
+ p++;
+ break;
+ }
+ if (strchr(",;\n#}", c)) {
+ v = strim(v);
+ *p++ = '\0';
+ break;
+ }
+ }
+ if (quotes)
+ return xbc_parse_error("No closing quotes", p);
+ if (c == '#') {
+ p = skip_comment(p);
+ c = '\n'; /* A comment must be treated as a newline */
+ }
+ *__n = p;
+ *__v = v;
+
+ return c;
+}
+
+static int __init xbc_parse_array(char **__v)
+{
+ struct xbc_node *node;
+ char *next;
+ int c = 0;
+
+ do {
+ c = __xbc_parse_value(__v, &next);
+ if (c < 0)
+ return c;
+
+ node = xbc_add_sibling(*__v, XBC_VALUE);
+ if (!node)
+ return -ENOMEM;
+ *__v = next;
+ } while (c == ',');
+ node->next = 0;
+
+ return c;
+}
+
+static inline __init
+struct xbc_node *find_match_node(struct xbc_node *node, char *k)
+{
+ while (node) {
+ if (!strcmp(xbc_node_get_data(node), k))
+ break;
+ node = xbc_node_get_next(node);
+ }
+ return node;
+}
+
+static int __init __xbc_add_key(char *k)
+{
+ struct xbc_node *node;
+
+ if (!xbc_valid_keyword(k))
+ return xbc_parse_error("Invalid keyword", k);
+
+ if (unlikely(xbc_node_num == 0))
+ goto add_node;
+
+ if (!last_parent) /* the first level */
+ node = find_match_node(xbc_nodes, k);
+ else
+ node = find_match_node(xbc_node_get_child(last_parent), k);
+
+ if (node)
+ last_parent = node;
+ else {
+add_node:
+ node = xbc_add_child(k, XBC_KEY);
+ if (!node)
+ return -ENOMEM;
+ }
+ return 0;
+}
+
+static int __init __xbc_parse_keys(char *k)
+{
+ char *p;
+ int ret;
+
+ k = strim(k);
+ while ((p = strchr(k, '.'))) {
+ *p++ = '\0';
+ ret = __xbc_add_key(k);
+ if (ret)
+ return ret;
+ k = p;
+ }
+
+ return __xbc_add_key(k);
+}
+
+static int __init xbc_parse_kv(char **k, char *v)
+{
+ struct xbc_node *prev_parent = last_parent;
+ struct xbc_node *node;
+ char *next;
+ int c, ret;
+
+ ret = __xbc_parse_keys(*k);
+ if (ret)
+ return ret;
+
+ c = __xbc_parse_value(&v, &next);
+ if (c < 0)
+ return c;
+
+ node = xbc_add_sibling(v, XBC_VALUE);
+ if (!node)
+ return -ENOMEM;
+
+ if (c == ',') { /* Array */
+ c = xbc_parse_array(&next);
+ if (c < 0)
+ return c;
+ }
+
+ last_parent = prev_parent;
+
+ if (c == '}') {
+ ret = __xbc_close_brace(next - 1);
+ if (ret < 0)
+ return ret;
+ }
+
+ *k = next;
+
+ return 0;
+}
+
+static int __init xbc_parse_key(char **k, char *n)
+{
+ struct xbc_node *prev_parent = last_parent;
+ int ret;
+
+ *k = strim(*k);
+ if (**k != '\0') {
+ ret = __xbc_parse_keys(*k);
+ if (ret)
+ return ret;
+ last_parent = prev_parent;
+ }
+ *k = n;
+
+ return 0;
+}
+
+static int __init xbc_open_brace(char **k, char *n)
+{
+ int ret;
+
+ ret = __xbc_parse_keys(*k);
+ if (ret)
+ return ret;
+ *k = n;
+
+ return __xbc_open_brace();
+}
+
+static int __init xbc_close_brace(char **k, char *n)
+{
+ int ret;
+
+ ret = xbc_parse_key(k, n);
+ if (ret)
+ return ret;
+ /* k is updated in xbc_parse_key() */
+
+ return __xbc_close_brace(n - 1);
+}
+
+static int __init xbc_verify_tree(void)
+{
+ int i, depth, len, wlen;
+ struct xbc_node *n, *m;
+
+ /* Empty tree */
+ if (xbc_node_num == 0)
+ return -ENOENT;
+
+ for (i = 0; i < xbc_node_num; i++) {
+ if (xbc_nodes[i].next > xbc_node_num) {
+ return xbc_parse_error("No closing brace",
+ xbc_node_get_data(xbc_nodes + i));
+ }
+ }
+
+ /* Key tree limitation check */
+ n = &xbc_nodes[0];
+ depth = 1;
+ len = 0;
+
+ while (n) {
+ wlen = strlen(xbc_node_get_data(n)) + 1;
+ len += wlen;
+ if (len > XBC_KEYLEN_MAX)
+ return xbc_parse_error("Too long key length",
+ xbc_node_get_data(n));
+
+ m = xbc_node_get_child(n);
+ if (m && xbc_node_is_key(m)) {
+ n = m;
+ depth++;
+ if (depth > XBC_DEPTH_MAX)
+ return xbc_parse_error("Too many key words",
+ xbc_node_get_data(n));
+ continue;
+ }
+ len -= wlen;
+ m = xbc_node_get_next(n);
+ while (!m) {
+ n = xbc_node_get_parent(n);
+ if (!n)
+ break;
+ len -= strlen(xbc_node_get_data(n)) + 1;
+ depth--;
+ m = xbc_node_get_next(n);
+ }
+ n = m;
+ }
+
+ return 0;
+}
+
+/**
+ * xbc_destroy_all() - Clean up all parsed bootconfig
+ *
+ * This clears all data structures of the parsed bootconfig in memory.
+ * If you need to reuse xbc_init() with a new boot config, you can
+ * use this.
+ */
+void __init xbc_destroy_all(void)
+{
+ xbc_data = NULL;
+ xbc_data_size = 0;
+ xbc_node_num = 0;
+ memset(xbc_nodes, 0, sizeof(xbc_nodes));
+}
+
+/**
+ * xbc_init() - Parse given XBC file and build XBC internal tree
+ * @buf: boot config text
+ *
+ * This parses the boot config text in @buf. @buf must be a
+ * null terminated string and smaller than XBC_DATA_MAX.
+ * Return 0 if succeeded, or -errno if there is any error.
+ */
+int __init xbc_init(char *buf)
+{
+ char *p, *q;
+ int ret, c;
+
+ if (xbc_data)
+ return -EBUSY;
+
+ ret = strlen(buf);
+ if (ret > XBC_DATA_MAX - 1 || ret == 0)
+ return -ERANGE;
+
+ xbc_data = buf;
+ xbc_data_size = ret + 1;
+ last_parent = NULL;
+
+ p = buf;
+ do {
+ q = strpbrk(p, "{}=;\n#");
+ if (!q) {
+ p = skip_spaces(p);
+ if (*p != '\0')
+ ret = xbc_parse_error("No delimiter", p);
+ break;
+ }
+
+ c = *q;
+ *q++ = '\0';
+ switch (c) {
+ case '=':
+ ret = xbc_parse_kv(&p, q);
+ break;
+ case '{':
+ ret = xbc_open_brace(&p, q);
+ break;
+ case '#':
+ q = skip_comment(q);
+ /* fall through */
+ case ';':
+ case '\n':
+ ret = xbc_parse_key(&p, q);
+ break;
+ case '}':
+ ret = xbc_close_brace(&p, q);
+ break;
+ }
+ } while (!ret);
+
+ if (!ret)
+ ret = xbc_verify_tree();
+
+ if (ret < 0)
+ xbc_destroy_all();
+
+ return ret;
+}
+
+/**
+ * xbc_debug_dump() - Dump current XBC node list
+ *
+ * Dump the current XBC node list on printk buffer for debug.
+ */
+void __init xbc_debug_dump(void)
+{
+ int i;
+
+ for (i = 0; i < xbc_node_num; i++) {
+ pr_debug("[%d] %s (%s) .next=%d, .child=%d .parent=%d\n", i,
+ xbc_node_get_data(xbc_nodes + i),
+ xbc_node_is_value(xbc_nodes + i) ? "value" : "key",
+ xbc_nodes[i].next, xbc_nodes[i].child,
+ xbc_nodes[i].parent);
+ }
+}

2020-01-10 16:06:22

by Masami Hiramatsu

Subject: [PATCH v6 05/22] proc: bootconfig: Add /proc/bootconfig to show boot config list

Add /proc/bootconfig which shows the list of key-value pairs
in boot config. Since all boot configs and the tree are removed
after boot, this interface just keeps a copy of the key-value
pairs as text.
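
The saved copy is built in two passes: the formatter is first called with
a zero-sized buffer to learn the required length, then called again on a
buffer of that size. A minimal userspace sketch of that idea follows. The
names `struct kv`, `format_kv_list()` and `save_kv_list()` are hypothetical
and not part of this patch, and the sketch tracks an offset instead of
advancing the destination pointer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value pair; the kernel walks the XBC tree instead. */
struct kv {
	const char *key;
	const char *val;
};

/*
 * Format all pairs as 'key = "value"\n' into dst (at most size bytes).
 * Returns the length the full text needs, excluding the trailing NUL,
 * even when dst is too small (or NULL with size 0).
 */
static int format_kv_list(char *dst, size_t size,
			  const struct kv *list, size_t n)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Remaining room; 0 once the buffer is exhausted. */
		size_t room = off < size ? size - off : 0;
		int ret = snprintf(room ? dst + off : NULL, room,
				   "%s = \"%s\"\n", list[i].key, list[i].val);

		if (ret < 0)
			return ret;
		off += (size_t)ret;
	}
	return (int)off;
}

/* Two-pass helper: measure, allocate, then format. Caller frees. */
static char *save_kv_list(const struct kv *list, size_t n)
{
	int len = format_kv_list(NULL, 0, list, n);
	char *buf;

	if (len <= 0)
		return NULL;
	buf = calloc((size_t)len + 1, 1);
	if (!buf)
		return NULL;
	if (format_kv_list(buf, (size_t)len + 1, list, n) != len) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

Measuring first avoids guessing a buffer size, at the cost of walking the
list twice, which is acceptable for a one-time init path.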

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v4:
- Remove ; in the end of lines.
- Rename /proc/supp_cmdline to /proc/bootconfig
- Simplify code.
---
MAINTAINERS | 1 +
fs/proc/Makefile | 1 +
fs/proc/bootconfig.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 91 insertions(+)
create mode 100644 fs/proc/bootconfig.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 836209be1faa..d0da06bdf3d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15777,6 +15777,7 @@ EXTRA BOOT CONFIG
M: Masami Hiramatsu <[email protected]>
S: Maintained
F: lib/bootconfig.c
+F: fs/proc/bootconfig.c
F: include/linux/bootconfig.h
F: tools/bootconfig/*

diff --git a/fs/proc/Makefile b/fs/proc/Makefile
index ead487e80510..bd08616ed8ba 100644
--- a/fs/proc/Makefile
+++ b/fs/proc/Makefile
@@ -33,3 +33,4 @@ proc-$(CONFIG_PROC_KCORE) += kcore.o
proc-$(CONFIG_PROC_VMCORE) += vmcore.o
proc-$(CONFIG_PRINTK) += kmsg.o
proc-$(CONFIG_PROC_PAGE_MONITOR) += page.o
+proc-$(CONFIG_BOOT_CONFIG) += bootconfig.o
diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c
new file mode 100644
index 000000000000..9955d75c0585
--- /dev/null
+++ b/fs/proc/bootconfig.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * /proc/bootconfig - Extra boot configuration
+ */
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/printk.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/bootconfig.h>
+#include <linux/slab.h>
+
+static char *saved_boot_config;
+
+static int boot_config_proc_show(struct seq_file *m, void *v)
+{
+ if (saved_boot_config)
+ seq_puts(m, saved_boot_config);
+ return 0;
+}
+
+/* Remaining size of buffer */
+#define rest(dst, end) ((end) > (dst) ? (end) - (dst) : 0)
+
+/* Return the needed total length if @size is 0 */
+static int __init copy_xbc_key_value_list(char *dst, size_t size)
+{
+ struct xbc_node *leaf, *vnode;
+ const char *val;
+ char *key, *end = dst + size;
+ int ret = 0;
+
+ key = kzalloc(XBC_KEYLEN_MAX, GFP_KERNEL);
+
+ xbc_for_each_key_value(leaf, val) {
+ ret = xbc_node_compose_key(leaf, key, XBC_KEYLEN_MAX);
+ if (ret < 0)
+ break;
+ ret = snprintf(dst, rest(dst, end), "%s = ", key);
+ if (ret < 0)
+ break;
+ dst += ret;
+ vnode = xbc_node_get_child(leaf);
+ if (vnode && xbc_node_is_array(vnode)) {
+ xbc_array_for_each_value(vnode, val) {
+ ret = snprintf(dst, rest(dst, end), "\"%s\"%s",
+ val, vnode->next ? ", " : "\n");
+ if (ret < 0)
+ goto out;
+ dst += ret;
+ }
+ } else {
+ ret = snprintf(dst, rest(dst, end), "\"%s\"\n", val);
+ if (ret < 0)
+ break;
+ dst += ret;
+ }
+ }
+out:
+ kfree(key);
+
+ return ret < 0 ? ret : dst - (end - size);
+}
+
+static int __init proc_boot_config_init(void)
+{
+ int len;
+
+ len = copy_xbc_key_value_list(NULL, 0);
+ if (len < 0)
+ return len;
+
+ if (len > 0) {
+ saved_boot_config = kzalloc(len + 1, GFP_KERNEL);
+ if (!saved_boot_config)
+ return -ENOMEM;
+
+ len = copy_xbc_key_value_list(saved_boot_config, len + 1);
+ if (len < 0) {
+ kfree(saved_boot_config);
+ return len;
+ }
+ }
+
+ proc_create_single("bootconfig", 0, NULL, boot_config_proc_show);
+
+ return 0;
+}
+fs_initcall(proc_boot_config_init);

2020-01-10 16:06:39

by Masami Hiramatsu

Subject: [PATCH v6 07/22] bootconfig: init: Allow admin to use bootconfig for kernel command line

Since the current kernel command line is too short to describe
the many options supported by the kernel, allow the user to use
boot config to set up (add) command line options.

All kernel parameters under the "kernel." keyword will be used
for setting up the extra kernel command line.

For example,

kernel {
audit = on
audit_backlog_limit = 256
}

Note that you cannot specify some early parameters
(like console etc.) by this method, since the boot config is
loaded after early parameters are parsed.
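
To illustrate how such keys become command line text, here is a minimal
userspace sketch of the formatting done by xbc_snprint_cmdline() in this
patch: each key is printed as key="value" followed by a space, array
values are joined with commas inside one pair of quotes, and a key
without a value is printed bare. The `struct cmd_entry`, `put()` and
`snprint_cmdline()` names are hypothetical stand-ins for the XBC tree
walk, not kernel API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flattened entry: key (without "kernel.") and its values. */
struct cmd_entry {
	const char *key;
	const char *const *vals;	/* NULL when the key has no value */
	size_t nvals;
};

/* Append s at *off if there is room; always advance *off by strlen(s). */
static void put(char *buf, size_t size, size_t *off, const char *s)
{
	size_t room = *off < size ? size - *off : 0;

	if (room)
		snprintf(buf + *off, room, "%s", s);
	*off += strlen(s);
}

/*
 * Format entries as 'key="v1,v2" ' (or 'key ' when valueless).
 * Returns the number of characters needed, excluding the trailing NUL,
 * so it can be called with (NULL, 0) to size the buffer first.
 */
static int snprint_cmdline(char *buf, size_t size,
			   const struct cmd_entry *e, size_t n)
{
	size_t off = 0, i, j;

	for (i = 0; i < n; i++) {
		int has_val = e[i].vals && e[i].nvals;

		put(buf, size, &off, e[i].key);
		put(buf, size, &off, has_val ? "=" : " ");
		if (!has_val)
			continue;

		for (j = 0; j < e[i].nvals; j++) {
			/* Opening quote first, then commas between values. */
			put(buf, size, &off, j ? "," : "\"");
			put(buf, size, &off, e[i].vals[j]);
		}
		put(buf, size, &off, "\" ");
	}
	return (int)off;
}
```

For the `kernel { ... }` example above, this yields
`audit="on" audit_backlog_limit="256" ` (with a trailing space), which
setup_command_line() then places in front of the boot loader's command
line.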

Signed-off-by: Masami Hiramatsu <[email protected]>
---
init/main.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 101 insertions(+), 5 deletions(-)

diff --git a/init/main.c b/init/main.c
index 0b4e0c8ccf16..c0017d9d16e7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -137,6 +137,8 @@ char __initdata boot_command_line[COMMAND_LINE_SIZE];
char *saved_command_line;
/* Command line for parameter parsing */
static char *static_command_line;
+/* Untouched extra command line */
+static char *extra_command_line;

static char *execute_command;
static char *ramdisk_execute_command;
@@ -245,6 +247,83 @@ static int __init loglevel(char *str)
early_param("loglevel", loglevel);

#ifdef CONFIG_BOOT_CONFIG
+
+char xbc_namebuf[XBC_KEYLEN_MAX] __initdata;
+
+#define rest(dst, end) ((end) > (dst) ? (end) - (dst) : 0)
+
+static int __init xbc_snprint_cmdline(char *buf, size_t size,
+ struct xbc_node *root)
+{
+ struct xbc_node *knode, *vnode;
+ char *end = buf + size;
+ char c = '\"';
+ const char *val;
+ int ret;
+
+ xbc_node_for_each_key_value(root, knode, val) {
+ ret = xbc_node_compose_key_after(root, knode,
+ xbc_namebuf, XBC_KEYLEN_MAX);
+ if (ret < 0)
+ return ret;
+
+ vnode = xbc_node_get_child(knode);
+ ret = snprintf(buf, rest(buf, end), "%s%c", xbc_namebuf,
+ vnode ? '=' : ' ');
+ if (ret < 0)
+ return ret;
+ buf += ret;
+ if (!vnode)
+ continue;
+
+ c = '\"';
+ xbc_array_for_each_value(vnode, val) {
+ ret = snprintf(buf, rest(buf, end), "%c%s", c, val);
+ if (ret < 0)
+ return ret;
+ buf += ret;
+ c = ',';
+ }
+ if (rest(buf, end) > 2)
+ strcpy(buf, "\" ");
+ buf += 2;
+ }
+
+ return buf - (end - size);
+}
+#undef rest
+
+/* Make an extra command line under given key word */
+static char * __init xbc_make_cmdline(const char *key)
+{
+ struct xbc_node *root;
+ char *new_cmdline;
+ int ret, len = 0;
+
+ root = xbc_find_node(key);
+ if (!root)
+ return NULL;
+
+ /* Count required buffer size */
+ len = xbc_snprint_cmdline(NULL, 0, root);
+ if (len <= 0)
+ return NULL;
+
+ new_cmdline = memblock_alloc(len + 1, SMP_CACHE_BYTES);
+ if (!new_cmdline) {
+ pr_err("Failed to allocate memory for extra kernel cmdline.\n");
+ return NULL;
+ }
+
+ ret = xbc_snprint_cmdline(new_cmdline, len + 1, root);
+ if (ret < 0 || ret > len) {
+ pr_err("Failed to print extra kernel cmdline.\n");
+ return NULL;
+ }
+
+ return new_cmdline;
+}
+
u32 boot_config_checksum(unsigned char *p, u32 size)
{
u32 ret = 0;
@@ -289,8 +368,11 @@ static void __init setup_boot_config(void)

if (xbc_init(copy) < 0)
pr_err("Failed to parse boot config\n");
- else
+ else {
pr_info("Load boot config: %d bytes\n", size);
+ /* keys starting with "kernel." are passed via cmdline */
+ extra_command_line = xbc_make_cmdline("kernel");
+ }
}
#else
#define setup_boot_config() do { } while (0)
@@ -425,7 +507,12 @@ static inline void smp_prepare_cpus(unsigned int maxcpus) { }
*/
static void __init setup_command_line(char *command_line)
{
- size_t len = strlen(boot_command_line) + 1;
+ size_t len, xlen = 0;
+
+ if (extra_command_line)
+ xlen = strlen(extra_command_line);
+
+ len = xlen + strlen(boot_command_line) + 1;

saved_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
if (!saved_command_line)
@@ -435,8 +522,17 @@ static void __init setup_command_line(char *command_line)
if (!static_command_line)
panic("%s: Failed to allocate %zu bytes\n", __func__, len);

- strcpy(saved_command_line, boot_command_line);
- strcpy(static_command_line, command_line);
+ if (xlen) {
+ /*
+ * We have to put extra_command_line before boot command
+ * lines because there could be dashes (separator of init
+ * command line) in the command lines.
+ */
+ strcpy(saved_command_line, extra_command_line);
+ strcpy(static_command_line, extra_command_line);
+ }
+ strcpy(saved_command_line + xlen, boot_command_line);
+ strcpy(static_command_line + xlen, command_line);
}

/*
@@ -652,7 +748,7 @@ asmlinkage __visible void __init start_kernel(void)
build_all_zonelists(NULL);
page_alloc_init();

- pr_notice("Kernel command line: %s\n", boot_command_line);
+ pr_notice("Kernel command line: %s\n", saved_command_line);
/* parameters may set static keys */
jump_label_init();
parse_early_param();

2020-01-10 16:06:42

by Masami Hiramatsu

Subject: [PATCH v6 04/22] tools: bootconfig: Add bootconfig test script

Add a bootconfig test script to ensure the tool and
boot config parser are working correctly.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v6:
- Add some testcases for value parser
- Add a test case for checking delete old bootconfig
Changes in v5:
- Show test target bootconfig name
- Add printables testcases
- Add bad array testcase
---
tools/bootconfig/Makefile | 3 +
.../samples/bad-array-space-comment.bconf | 5 +
tools/bootconfig/samples/bad-array.bconf | 2
tools/bootconfig/samples/bad-dotword.bconf | 4 +
tools/bootconfig/samples/bad-empty.bconf | 1
tools/bootconfig/samples/bad-keyerror.bconf | 2
tools/bootconfig/samples/bad-longkey.bconf | 1
tools/bootconfig/samples/bad-manywords.bconf | 1
tools/bootconfig/samples/bad-no-keyword.bconf | 2
tools/bootconfig/samples/bad-nonprintable.bconf | 2
tools/bootconfig/samples/bad-spaceword.bconf | 2
tools/bootconfig/samples/bad-tree.bconf | 5 +
tools/bootconfig/samples/bad-value.bconf | 3 +
tools/bootconfig/samples/escaped.bconf | 3 +
.../samples/good-array-space-comment.bconf | 4 +
.../samples/good-comment-after-value.bconf | 1
tools/bootconfig/samples/good-printables.bconf | 2
tools/bootconfig/samples/good-simple.bconf | 11 ++
tools/bootconfig/samples/good-single.bconf | 4 +
.../samples/good-space-after-value.bconf | 1
tools/bootconfig/samples/good-tree.bconf | 12 ++
tools/bootconfig/test-bootconfig.sh | 105 ++++++++++++++++++++
22 files changed, 176 insertions(+)
create mode 100644 tools/bootconfig/samples/bad-array-space-comment.bconf
create mode 100644 tools/bootconfig/samples/bad-array.bconf
create mode 100644 tools/bootconfig/samples/bad-dotword.bconf
create mode 100644 tools/bootconfig/samples/bad-empty.bconf
create mode 100644 tools/bootconfig/samples/bad-keyerror.bconf
create mode 100644 tools/bootconfig/samples/bad-longkey.bconf
create mode 100644 tools/bootconfig/samples/bad-manywords.bconf
create mode 100644 tools/bootconfig/samples/bad-no-keyword.bconf
create mode 100644 tools/bootconfig/samples/bad-nonprintable.bconf
create mode 100644 tools/bootconfig/samples/bad-spaceword.bconf
create mode 100644 tools/bootconfig/samples/bad-tree.bconf
create mode 100644 tools/bootconfig/samples/bad-value.bconf
create mode 100644 tools/bootconfig/samples/escaped.bconf
create mode 100644 tools/bootconfig/samples/good-array-space-comment.bconf
create mode 100644 tools/bootconfig/samples/good-comment-after-value.bconf
create mode 100644 tools/bootconfig/samples/good-printables.bconf
create mode 100644 tools/bootconfig/samples/good-simple.bconf
create mode 100644 tools/bootconfig/samples/good-single.bconf
create mode 100644 tools/bootconfig/samples/good-space-after-value.bconf
create mode 100644 tools/bootconfig/samples/good-tree.bconf
create mode 100755 tools/bootconfig/test-bootconfig.sh

diff --git a/tools/bootconfig/Makefile b/tools/bootconfig/Makefile
index 681b7aef3e44..a6146ac64458 100644
--- a/tools/bootconfig/Makefile
+++ b/tools/bootconfig/Makefile
@@ -16,5 +16,8 @@ bootconfig: ../../lib/bootconfig.c main.c $(HEADER)
install: $(PROGS)
install bootconfig $(DESTDIR)$(bindir)

+test: bootconfig
+ ./test-bootconfig.sh
+
clean:
$(RM) -f *.o bootconfig
diff --git a/tools/bootconfig/samples/bad-array-space-comment.bconf b/tools/bootconfig/samples/bad-array-space-comment.bconf
new file mode 100644
index 000000000000..fda19e47d0db
--- /dev/null
+++ b/tools/bootconfig/samples/bad-array-space-comment.bconf
@@ -0,0 +1,5 @@
+key = # comment
+ "value1", # comment1
+ "value2" # comment2
+,
+ "value3"
diff --git a/tools/bootconfig/samples/bad-array.bconf b/tools/bootconfig/samples/bad-array.bconf
new file mode 100644
index 000000000000..0174af019d7f
--- /dev/null
+++ b/tools/bootconfig/samples/bad-array.bconf
@@ -0,0 +1,2 @@
+# Array must be comma separated.
+key = "value1" "value2"
diff --git a/tools/bootconfig/samples/bad-dotword.bconf b/tools/bootconfig/samples/bad-dotword.bconf
new file mode 100644
index 000000000000..ba5557b2bdd3
--- /dev/null
+++ b/tools/bootconfig/samples/bad-dotword.bconf
@@ -0,0 +1,4 @@
+# do not start keyword with .
+key {
+ .word = 1
+}
diff --git a/tools/bootconfig/samples/bad-empty.bconf b/tools/bootconfig/samples/bad-empty.bconf
new file mode 100644
index 000000000000..2ba3f6cc6a47
--- /dev/null
+++ b/tools/bootconfig/samples/bad-empty.bconf
@@ -0,0 +1 @@
+# Wrong boot config: comment only
diff --git a/tools/bootconfig/samples/bad-keyerror.bconf b/tools/bootconfig/samples/bad-keyerror.bconf
new file mode 100644
index 000000000000..b6e247a099d0
--- /dev/null
+++ b/tools/bootconfig/samples/bad-keyerror.bconf
@@ -0,0 +1,2 @@
+# key word can not contain ","
+key,word
diff --git a/tools/bootconfig/samples/bad-longkey.bconf b/tools/bootconfig/samples/bad-longkey.bconf
new file mode 100644
index 000000000000..eb97369f91a8
--- /dev/null
+++ b/tools/bootconfig/samples/bad-longkey.bconf
@@ -0,0 +1 @@
+key_word_is_too_long01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345
diff --git a/tools/bootconfig/samples/bad-manywords.bconf b/tools/bootconfig/samples/bad-manywords.bconf
new file mode 100644
index 000000000000..8db81967c48a
--- /dev/null
+++ b/tools/bootconfig/samples/bad-manywords.bconf
@@ -0,0 +1 @@
+key1.is2.too3.long4.5.6.7.8.9.10.11.12.13.14.15.16.17
diff --git a/tools/bootconfig/samples/bad-no-keyword.bconf b/tools/bootconfig/samples/bad-no-keyword.bconf
new file mode 100644
index 000000000000..eff26808566c
--- /dev/null
+++ b/tools/bootconfig/samples/bad-no-keyword.bconf
@@ -0,0 +1,2 @@
+# No keyword
+{}
diff --git a/tools/bootconfig/samples/bad-nonprintable.bconf b/tools/bootconfig/samples/bad-nonprintable.bconf
new file mode 100644
index 000000000000..3bb1a2864e52
--- /dev/null
+++ b/tools/bootconfig/samples/bad-nonprintable.bconf
@@ -0,0 +1,2 @@
+# Non printable
+key = ""
diff --git a/tools/bootconfig/samples/bad-spaceword.bconf b/tools/bootconfig/samples/bad-spaceword.bconf
new file mode 100644
index 000000000000..90c703d32a9a
--- /dev/null
+++ b/tools/bootconfig/samples/bad-spaceword.bconf
@@ -0,0 +1,2 @@
+# No space between words
+key . word
diff --git a/tools/bootconfig/samples/bad-tree.bconf b/tools/bootconfig/samples/bad-tree.bconf
new file mode 100644
index 000000000000..5a6038edcd55
--- /dev/null
+++ b/tools/bootconfig/samples/bad-tree.bconf
@@ -0,0 +1,5 @@
+# brace is not closing
+tree {
+ node {
+ value = 1
+}
diff --git a/tools/bootconfig/samples/bad-value.bconf b/tools/bootconfig/samples/bad-value.bconf
new file mode 100644
index 000000000000..a1217fed86cc
--- /dev/null
+++ b/tools/bootconfig/samples/bad-value.bconf
@@ -0,0 +1,3 @@
+# Quotes error
+value = "data
+
diff --git a/tools/bootconfig/samples/escaped.bconf b/tools/bootconfig/samples/escaped.bconf
new file mode 100644
index 000000000000..9f72043b3216
--- /dev/null
+++ b/tools/bootconfig/samples/escaped.bconf
@@ -0,0 +1,3 @@
+key1 = "A\B\C"
+key2 = '\'\''
+key3 = "\\"
diff --git a/tools/bootconfig/samples/good-array-space-comment.bconf b/tools/bootconfig/samples/good-array-space-comment.bconf
new file mode 100644
index 000000000000..45b938dc0695
--- /dev/null
+++ b/tools/bootconfig/samples/good-array-space-comment.bconf
@@ -0,0 +1,4 @@
+key = # comment
+ "value1", # comment1
+ "value2" , # comment2
+ "value3"
diff --git a/tools/bootconfig/samples/good-comment-after-value.bconf b/tools/bootconfig/samples/good-comment-after-value.bconf
new file mode 100644
index 000000000000..0d92a853df72
--- /dev/null
+++ b/tools/bootconfig/samples/good-comment-after-value.bconf
@@ -0,0 +1 @@
+key = "value" # comment
diff --git a/tools/bootconfig/samples/good-printables.bconf b/tools/bootconfig/samples/good-printables.bconf
new file mode 100644
index 000000000000..91b90073c0f8
--- /dev/null
+++ b/tools/bootconfig/samples/good-printables.bconf
@@ -0,0 +1,2 @@
+key = "
+
!#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
diff --git a/tools/bootconfig/samples/good-simple.bconf b/tools/bootconfig/samples/good-simple.bconf
new file mode 100644
index 000000000000..37dd6d21c176
--- /dev/null
+++ b/tools/bootconfig/samples/good-simple.bconf
@@ -0,0 +1,11 @@
+# A good simple bootconfig
+
+key.word1 = 1
+key.word2=2
+key.word3 = 3;
+
+key {
+word4 = 4 }
+
+key { word5 = 5; word6 = 6 }
+
diff --git a/tools/bootconfig/samples/good-single.bconf b/tools/bootconfig/samples/good-single.bconf
new file mode 100644
index 000000000000..98e55ad8b711
--- /dev/null
+++ b/tools/bootconfig/samples/good-single.bconf
@@ -0,0 +1,4 @@
+# single key style
+key = 1
+key2 = 2
+key3 = "alpha", "beta"
diff --git a/tools/bootconfig/samples/good-space-after-value.bconf b/tools/bootconfig/samples/good-space-after-value.bconf
new file mode 100644
index 000000000000..56c15cbc5741
--- /dev/null
+++ b/tools/bootconfig/samples/good-space-after-value.bconf
@@ -0,0 +1 @@
+key = "value"
diff --git a/tools/bootconfig/samples/good-tree.bconf b/tools/bootconfig/samples/good-tree.bconf
new file mode 100644
index 000000000000..f2ddefc8b52a
--- /dev/null
+++ b/tools/bootconfig/samples/good-tree.bconf
@@ -0,0 +1,12 @@
+key {
+ word {
+ tree {
+ value = "0"}
+ }
+ word2 {
+ tree {
+ value = 1,2 }
+ }
+}
+other.tree {
+ value = 2; value2 = 3;}
diff --git a/tools/bootconfig/test-bootconfig.sh b/tools/bootconfig/test-bootconfig.sh
new file mode 100755
index 000000000000..87725e8723f8
--- /dev/null
+++ b/tools/bootconfig/test-bootconfig.sh
@@ -0,0 +1,105 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-only
+
+echo "Boot config test script"
+
+BOOTCONF=./bootconfig
+INITRD=`mktemp initrd-XXXX`
+TEMPCONF=`mktemp temp-XXXX.bconf`
+NG=0
+
+cleanup() {
+ rm -f $INITRD $TEMPCONF
+ exit $NG
+}
+
+trap cleanup EXIT TERM
+
+NO=1
+
+xpass() { # pass test command
+ echo "test case $NO ($3)... "
+ if ! ($@ && echo "\t\t[OK]"); then
+ echo "\t\t[NG]"; NG=$((NG + 1))
+ fi
+ NO=$((NO + 1))
+}
+
+xfail() { # fail test command
+ echo "test case $NO ($3)... "
+ if ! (! $@ && echo "\t\t[OK]"); then
+ echo "\t\t[NG]"; NG=$((NG + 1))
+ fi
+ NO=$((NO + 1))
+}
+
+echo "Basic command test"
+xpass $BOOTCONF $INITRD
+
+echo "Delete command should succeed without bootconfig"
+xpass $BOOTCONF -d $INITRD
+
+dd if=/dev/zero of=$INITRD bs=4096 count=1
+echo "key = value;" > $TEMPCONF
+bconf_size=$(stat -c %s $TEMPCONF)
+initrd_size=$(stat -c %s $INITRD)
+
+echo "Apply command test"
+xpass $BOOTCONF -a $TEMPCONF $INITRD
+new_size=$(stat -c %s $INITRD)
+
+echo "File size check"
+xpass test $new_size -eq $(expr $bconf_size + $initrd_size + 9)
+
+echo "Apply command repeat test"
+xpass $BOOTCONF -a $TEMPCONF $INITRD
+
+echo "File size check"
+xpass test $new_size -eq $(stat -c %s $INITRD)
+
+echo "Delete command check"
+xpass $BOOTCONF -d $INITRD
+
+echo "File size check"
+new_size=$(stat -c %s $INITRD)
+xpass test $new_size -eq $initrd_size
+
+echo "Max node number check"
+
+echo -n > $TEMPCONF
+for i in `seq 1 1024` ; do
+ echo "node$i" >> $TEMPCONF
+done
+xpass $BOOTCONF -a $TEMPCONF $INITRD
+
+echo "badnode" >> $TEMPCONF
+xfail $BOOTCONF -a $TEMPCONF $INITRD
+
+echo "Max filesize check"
+
+# Max size is 32767 (including terminal byte)
+echo -n "data = \"" > $TEMPCONF
+dd if=/dev/urandom bs=768 count=32 | base64 -w0 >> $TEMPCONF
+echo "\"" >> $TEMPCONF
+xfail $BOOTCONF -a $TEMPCONF $INITRD
+
+truncate -s 32764 $TEMPCONF
+echo "\"" >> $TEMPCONF # add 2 bytes + terminal ('\"\n\0')
+xpass $BOOTCONF -a $TEMPCONF $INITRD
+
+echo "=== expected failure cases ==="
+for i in samples/bad-* ; do
+ xfail $BOOTCONF -a $i $INITRD
+done
+
+echo "=== expected success cases ==="
+for i in samples/good-* ; do
+ xpass $BOOTCONF -a $i $INITRD
+done
+
+echo
+if [ $NG -eq 0 ]; then
+ echo "All tests passed"
+else
+ echo "$NG tests failed"
+fi

2020-01-10 16:06:55

by Masami Hiramatsu

Subject: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

Since the current kernel command line is too short to describe
the many long options for init (e.g. systemd command line options),
allow the admin to use boot config for the init command line.

All options under the "init." keyword will be passed to
init.

For example,

init.systemd {
unified_cgroup_hierarchy = 1
debug_shell
default_timeout_start_sec = 60
}
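
The way the extra init arguments are joined to the saved command line
can be sketched in userspace as below. This is only an illustration of
the " -- " handling in setup_command_line(); append_init_args() and the
sample strings are hypothetical, and the exact text produced by
xbc_make_cmdline("init") is not reproduced here.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch (hypothetical helper, not kernel code).
 * Joins cmdline and init_args into dst, inserting " -- " only when
 * cmdline does not already contain that separator.
 * Returns 0 on success, -1 if dst is too small.
 */
static int append_init_args(char *dst, size_t dstsz,
			    const char *cmdline, const char *init_args)
{
	/* Worst case: cmdline + " -- " + init_args + terminating NUL */
	size_t need = strlen(cmdline) + 4 + strlen(init_args) + 1;
	size_t len;

	if (need > dstsz)
		return -1;

	strcpy(dst, cmdline);
	len = strlen(dst);
	if (!strstr(cmdline, " -- ")) {
		/* No init arguments yet: add the separator first */
		strcpy(dst + len, " -- ");
		len += 4;
	} else {
		/* Separator already present: a single space is enough */
		dst[len++] = ' ';
	}
	strcpy(dst + len, init_args);
	return 0;
}
```

With "root=/dev/sda1 quiet" and "foo=1 bar" this yields
"root=/dev/sda1 quiet -- foo=1 bar"; when the command line already has
" -- ", the new arguments are appended after a single space.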

Signed-off-by: Masami Hiramatsu <[email protected]>
---
init/main.c | 31 ++++++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/init/main.c b/init/main.c
index c0017d9d16e7..dd7da62d99a5 100644
--- a/init/main.c
+++ b/init/main.c
@@ -139,6 +139,8 @@ char *saved_command_line;
static char *static_command_line;
/* Untouched extra command line */
static char *extra_command_line;
+/* Extra init arguments */
+static char *extra_init_args;

static char *execute_command;
static char *ramdisk_execute_command;
@@ -372,6 +374,8 @@ static void __init setup_boot_config(void)
pr_info("Load boot config: %d bytes\n", size);
/* keys starting with "kernel." are passed via cmdline */
extra_command_line = xbc_make_cmdline("kernel");
+ /* Also, "init." keys are init arguments */
+ extra_init_args = xbc_make_cmdline("init");
}
}
#else
@@ -507,16 +511,18 @@ static inline void smp_prepare_cpus(unsigned int maxcpus) { }
*/
static void __init setup_command_line(char *command_line)
{
- size_t len, xlen = 0;
+ size_t len, xlen = 0, ilen = 0;

if (extra_command_line)
xlen = strlen(extra_command_line);
+ if (extra_init_args)
+ ilen = strlen(extra_init_args) + 4; /* for " -- " */

len = xlen + strlen(boot_command_line) + 1;

- saved_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
+ saved_command_line = memblock_alloc(len + ilen, SMP_CACHE_BYTES);
if (!saved_command_line)
- panic("%s: Failed to allocate %zu bytes\n", __func__, len);
+ panic("%s: Failed to allocate %zu bytes\n", __func__, len + ilen);

static_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
if (!static_command_line)
@@ -533,6 +539,22 @@ static void __init setup_command_line(char *command_line)
}
strcpy(saved_command_line + xlen, boot_command_line);
strcpy(static_command_line + xlen, command_line);
+
+ if (ilen) {
+ /*
+ * Append supplemental init boot args to saved_command_line
+ * so that the user can check which command line options were
+ * passed to init.
+ */
+ len = strlen(saved_command_line);
+ if (!strstr(boot_command_line, " -- ")) {
+ strcpy(saved_command_line + len, " -- ");
+ len += 4;
+ } else
+ saved_command_line[len++] = ' ';
+
+ strcpy(saved_command_line + len, extra_init_args);
+ }
}

/*
@@ -759,6 +781,9 @@ asmlinkage __visible void __init start_kernel(void)
if (!IS_ERR_OR_NULL(after_dashes))
parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
NULL, set_init_arg);
+ if (extra_init_args)
+ parse_args("Setting extra init args", extra_init_args,
+ NULL, 0, -1, -1, NULL, set_init_arg);

/*
* These use large bootmem allocations and must precede

2020-01-10 16:07:03

by Masami Hiramatsu

Subject: [PATCH v6 03/22] tools: bootconfig: Add bootconfig command

Add a "bootconfig" command which manipulates the boot config
data appended to an initrd image.

The user can add, delete, or verify the boot config on an initrd
image using this command.

e.g.
Add a boot config to initrd image
# bootconfig -a myboot.conf /boot/initrd.img

Remove it.
# bootconfig -d /boot/initrd.img

Or verify (and show) it.
# bootconfig /boot/initrd.img
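
The data that "-a" appends to the initrd is the config text with its
terminating NUL, followed by a 32-bit size and a 32-bit checksum, both
in host byte order. Below is a minimal sketch of that layout, assuming
the byte-sum checksum used by this tool; xbc_sum() and xbc_build_blob()
are hypothetical names for illustration, not functions from this patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Byte-wise sum, the same idea as checksum() in tools/bootconfig/main.c */
static uint32_t xbc_sum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/*
 * Build the blob appended to the initrd (hypothetical helper):
 *   [config text + NUL][u32 size][u32 checksum]
 * Returns a malloc()ed buffer and stores its length in *out_len,
 * or returns NULL on allocation failure. The caller frees it.
 */
static unsigned char *xbc_build_blob(const char *text, size_t *out_len)
{
	uint32_t size = (uint32_t)strlen(text) + 1;	/* include the NUL */
	uint32_t csum = xbc_sum((const unsigned char *)text, size);
	unsigned char *blob = malloc(size + 8);

	if (!blob)
		return NULL;
	memcpy(blob, text, size);
	memcpy(blob + size, &size, sizeof(size));	/* host byte order */
	memcpy(blob + size + 4, &csum, sizeof(csum));
	*out_len = (size_t)size + 8;
	return blob;
}
```

For a 13-byte config such as "key = value;\n" the blob is 13 + 9 = 22
bytes, which matches the "+ 9" used by the file size check in the test
script.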

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v6:
- Fix memory leaks.
- Fix to cleanup old bootconfig on memory before load new one.
- Show applying message.
- Suppress parse error with wrong data in initrd for delete_xbc().
Changes in v5:
- Fix Makefile to compile all C files always.
- Remove unused pattern from Makefile.
---
MAINTAINERS | 1
tools/Makefile | 11 -
tools/bootconfig/.gitignore | 1
tools/bootconfig/Makefile | 20 ++
tools/bootconfig/include/linux/bootconfig.h | 7 +
tools/bootconfig/include/linux/bug.h | 12 +
tools/bootconfig/include/linux/ctype.h | 7 +
tools/bootconfig/include/linux/errno.h | 7 +
tools/bootconfig/include/linux/kernel.h | 18 +
tools/bootconfig/include/linux/printk.h | 17 +
tools/bootconfig/include/linux/string.h | 32 ++
tools/bootconfig/main.c | 354 +++++++++++++++++++++++++++
12 files changed, 482 insertions(+), 5 deletions(-)
create mode 100644 tools/bootconfig/.gitignore
create mode 100644 tools/bootconfig/Makefile
create mode 100644 tools/bootconfig/include/linux/bootconfig.h
create mode 100644 tools/bootconfig/include/linux/bug.h
create mode 100644 tools/bootconfig/include/linux/ctype.h
create mode 100644 tools/bootconfig/include/linux/errno.h
create mode 100644 tools/bootconfig/include/linux/kernel.h
create mode 100644 tools/bootconfig/include/linux/printk.h
create mode 100644 tools/bootconfig/include/linux/string.h
create mode 100644 tools/bootconfig/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1ef065234cff..836209be1faa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15778,6 +15778,7 @@ M: Masami Hiramatsu <[email protected]>
S: Maintained
F: lib/bootconfig.c
F: include/linux/bootconfig.h
+F: tools/bootconfig/*

SUN3/3X
M: Sam Creasey <[email protected]>
diff --git a/tools/Makefile b/tools/Makefile
index 7e42f7b8bfa7..bd778812e915 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -28,6 +28,7 @@ help:
@echo ' pci - PCI tools'
@echo ' perf - Linux performance measurement and analysis tool'
@echo ' selftests - various kernel selftests'
+ @echo ' bootconfig - boot config tool'
@echo ' spi - spi tools'
@echo ' tmon - thermal monitoring and tuning tool'
@echo ' turbostat - Intel CPU idle stats and freq reporting tool'
@@ -63,7 +64,7 @@ acpi: FORCE
cpupower: FORCE
$(call descend,power/$@)

-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
+cgroup firewire hv guest bootconfig spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
$(call descend,$@)

liblockdep: FORCE
@@ -96,7 +97,7 @@ kvm_stat: FORCE
$(call descend,kvm/$@)

all: acpi cgroup cpupower gpio hv firewire liblockdep \
- perf selftests spi turbostat usb \
+ perf selftests bootconfig spi turbostat usb \
virtio vm bpf x86_energy_perf_policy \
tmon freefall iio objtool kvm_stat wmi \
pci debugging
@@ -107,7 +108,7 @@ acpi_install:
cpupower_install:
$(call descend,power/$(@:_install=),install)

-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
+cgroup_install firewire_install gpio_install hv_install iio_install perf_install bootconfig_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
$(call descend,$(@:_install=),install)

liblockdep_install:
@@ -141,7 +142,7 @@ acpi_clean:
cpupower_clean:
$(call descend,power/cpupower,clean)

-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
+cgroup_clean hv_clean firewire_clean bootconfig_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
$(call descend,$(@:_clean=),clean)

liblockdep_clean:
@@ -176,7 +177,7 @@ build_clean:
$(call descend,build,clean)

clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
- perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
+ perf_clean selftests_clean turbostat_clean bootconfig_clean spi_clean usb_clean virtio_clean \
vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
diff --git a/tools/bootconfig/.gitignore b/tools/bootconfig/.gitignore
new file mode 100644
index 000000000000..e7644dfaa4a7
--- /dev/null
+++ b/tools/bootconfig/.gitignore
@@ -0,0 +1 @@
+bootconfig
diff --git a/tools/bootconfig/Makefile b/tools/bootconfig/Makefile
new file mode 100644
index 000000000000..681b7aef3e44
--- /dev/null
+++ b/tools/bootconfig/Makefile
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for bootconfig command
+
+bindir ?= /usr/bin
+
+HEADER = include/linux/bootconfig.h
+CFLAGS = -Wall -g -I./include
+
+PROGS = bootconfig
+
+all: $(PROGS)
+
+bootconfig: ../../lib/bootconfig.c main.c $(HEADER)
+ $(CC) $(filter %.c,$^) $(CFLAGS) -o $@
+
+install: $(PROGS)
+ install bootconfig $(DESTDIR)$(bindir)
+
+clean:
+ $(RM) -f *.o bootconfig
diff --git a/tools/bootconfig/include/linux/bootconfig.h b/tools/bootconfig/include/linux/bootconfig.h
new file mode 100644
index 000000000000..078cbd2ba651
--- /dev/null
+++ b/tools/bootconfig/include/linux/bootconfig.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _BOOTCONFIG_LINUX_BOOTCONFIG_H
+#define _BOOTCONFIG_LINUX_BOOTCONFIG_H
+
+#include "../../../../include/linux/bootconfig.h"
+
+#endif
diff --git a/tools/bootconfig/include/linux/bug.h b/tools/bootconfig/include/linux/bug.h
new file mode 100644
index 000000000000..7b65a389c0dd
--- /dev/null
+++ b/tools/bootconfig/include/linux/bug.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_BUG_H
+#define _SKC_LINUX_BUG_H
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#define WARN_ON(cond) \
+ ((cond) ? printf("Internal warning(%s:%d, %s): %s\n", \
+ __FILE__, __LINE__, __func__, #cond) : 0)
+
+#endif
diff --git a/tools/bootconfig/include/linux/ctype.h b/tools/bootconfig/include/linux/ctype.h
new file mode 100644
index 000000000000..c56ecc136448
--- /dev/null
+++ b/tools/bootconfig/include/linux/ctype.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_CTYPE_H
+#define _SKC_LINUX_CTYPE_H
+
+#include <ctype.h>
+
+#endif
diff --git a/tools/bootconfig/include/linux/errno.h b/tools/bootconfig/include/linux/errno.h
new file mode 100644
index 000000000000..5d9f91ec2fda
--- /dev/null
+++ b/tools/bootconfig/include/linux/errno.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_ERRNO_H
+#define _SKC_LINUX_ERRNO_H
+
+#include <asm/errno.h>
+
+#endif
diff --git a/tools/bootconfig/include/linux/kernel.h b/tools/bootconfig/include/linux/kernel.h
new file mode 100644
index 000000000000..2d93320aa374
--- /dev/null
+++ b/tools/bootconfig/include/linux/kernel.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_KERNEL_H
+#define _SKC_LINUX_KERNEL_H
+
+#include <stdlib.h>
+#include <stdbool.h>
+
+#include <linux/printk.h>
+
+typedef unsigned short u16;
+typedef unsigned int u32;
+
+#define unlikely(cond) (cond)
+
+#define __init
+#define __initdata
+
+#endif
diff --git a/tools/bootconfig/include/linux/printk.h b/tools/bootconfig/include/linux/printk.h
new file mode 100644
index 000000000000..017bcd6912a5
--- /dev/null
+++ b/tools/bootconfig/include/linux/printk.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_PRINTK_H
+#define _SKC_LINUX_PRINTK_H
+
+#include <stdio.h>
+
+/* controllable printf */
+extern int pr_output;
+#define printk(fmt, ...) \
+ (pr_output ? printf(fmt, __VA_ARGS__) : 0)
+
+#define pr_err printk
+#define pr_warn printk
+#define pr_info printk
+#define pr_debug printk
+
+#endif
diff --git a/tools/bootconfig/include/linux/string.h b/tools/bootconfig/include/linux/string.h
new file mode 100644
index 000000000000..8267af75153a
--- /dev/null
+++ b/tools/bootconfig/include/linux/string.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _SKC_LINUX_STRING_H
+#define _SKC_LINUX_STRING_H
+
+#include <string.h>
+
+/* Copied from lib/string.c */
+static inline char *skip_spaces(const char *str)
+{
+ while (isspace(*str))
+ ++str;
+ return (char *)str;
+}
+
+static inline char *strim(char *s)
+{
+ size_t size;
+ char *end;
+
+ size = strlen(s);
+ if (!size)
+ return s;
+
+ end = s + size - 1;
+ while (end >= s && isspace(*end))
+ end--;
+ *(end + 1) = '\0';
+
+ return skip_spaces(s);
+}
+
+#endif
diff --git a/tools/bootconfig/main.c b/tools/bootconfig/main.c
new file mode 100644
index 000000000000..66c8d47ceeea
--- /dev/null
+++ b/tools/bootconfig/main.c
@@ -0,0 +1,354 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Boot config tool for initrd image
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+
+#include <linux/kernel.h>
+#include <linux/bootconfig.h>
+
+int pr_output = 1;
+
+static int xbc_show_array(struct xbc_node *node)
+{
+ const char *val;
+ int i = 0;
+
+ xbc_array_for_each_value(node, val) {
+ printf("\"%s\"%s", val, node->next ? ", " : ";\n");
+ i++;
+ }
+ return i;
+}
+
+static void xbc_show_compact_tree(void)
+{
+ struct xbc_node *node, *cnode;
+ int depth = 0, i;
+
+ node = xbc_root_node();
+ while (node && xbc_node_is_key(node)) {
+ for (i = 0; i < depth; i++)
+ printf("\t");
+ cnode = xbc_node_get_child(node);
+ while (cnode && xbc_node_is_key(cnode) && !cnode->next) {
+ printf("%s.", xbc_node_get_data(node));
+ node = cnode;
+ cnode = xbc_node_get_child(node);
+ }
+ if (cnode && xbc_node_is_key(cnode)) {
+ printf("%s {\n", xbc_node_get_data(node));
+ depth++;
+ node = cnode;
+ continue;
+ } else if (cnode && xbc_node_is_value(cnode)) {
+ printf("%s = ", xbc_node_get_data(node));
+ if (cnode->next)
+ xbc_show_array(cnode);
+ else
+ printf("\"%s\";\n", xbc_node_get_data(cnode));
+ } else {
+ printf("%s;\n", xbc_node_get_data(node));
+ }
+
+ if (node->next) {
+ node = xbc_node_get_next(node);
+ continue;
+ }
+ while (!node->next) {
+ node = xbc_node_get_parent(node);
+ if (!node)
+ return;
+ if (!xbc_node_get_child(node)->next)
+ continue;
+ depth--;
+ for (i = 0; i < depth; i++)
+ printf("\t");
+ printf("}\n");
+ }
+ node = xbc_node_get_next(node);
+ }
+}
+
+/* Simple real checksum */
+int checksum(unsigned char *buf, int len)
+{
+ int i, sum = 0;
+
+ for (i = 0; i < len; i++)
+ sum += buf[i];
+
+ return sum;
+}
+
+#define PAGE_SIZE 4096
+
+int load_xbc_fd(int fd, char **buf, int size)
+{
+ int ret;
+
+ *buf = malloc(size + 1);
+ if (!*buf)
+ return -ENOMEM;
+
+ ret = read(fd, *buf, size);
+ if (ret < 0)
+ return -errno;
+ (*buf)[size] = '\0';
+
+ return ret;
+}
+
+/* Return the read size or -errno */
+int load_xbc_file(const char *path, char **buf)
+{
+ struct stat stat;
+ int fd, ret;
+
+ fd = open(path, O_RDONLY);
+ if (fd < 0)
+ return -errno;
+ ret = fstat(fd, &stat);
+ if (ret < 0)
+ return -errno;
+
+ ret = load_xbc_fd(fd, buf, stat.st_size);
+
+ close(fd);
+
+ return ret;
+}
+
+int load_xbc_from_initrd(int fd, char **buf)
+{
+ struct stat stat;
+ int ret;
+ u32 size = 0, csum = 0, rcsum;
+
+ ret = fstat(fd, &stat);
+ if (ret < 0)
+ return -errno;
+
+ if (stat.st_size < 8)
+ return 0;
+
+ if (lseek(fd, -8, SEEK_END) < 0) {
+ printf("Failed to lseek: %d\n", -errno);
+ return -errno;
+ }
+
+ if (read(fd, &size, sizeof(u32)) < 0)
+ return -errno;
+
+ if (read(fd, &csum, sizeof(u32)) < 0)
+ return -errno;
+
+ /* Wrong size, maybe no boot config here */
+ if (stat.st_size < size + 8)
+ return 0;
+
+ if (lseek(fd, stat.st_size - 8 - size, SEEK_SET) < 0) {
+ printf("Failed to lseek: %d\n", -errno);
+ return -errno;
+ }
+
+ ret = load_xbc_fd(fd, buf, size);
+ if (ret < 0)
+ return ret;
+
+ /* Wrong Checksum, maybe no boot config here */
+ rcsum = checksum((unsigned char *)*buf, size);
+ if (csum != rcsum) {
+ printf("checksum error: %d != %d\n", csum, rcsum);
+ return 0;
+ }
+
+ ret = xbc_init(*buf);
+ /* Wrong data, maybe no boot config here */
+ if (ret < 0)
+ return 0;
+
+ return size;
+}
+
+int show_xbc(const char *path)
+{
+ int ret, fd;
+ char *buf = NULL;
+
+ fd = open(path, O_RDONLY);
+ if (fd < 0) {
+ printf("Failed to open initrd %s: %d\n", path, fd);
+ return -errno;
+ }
+
+ ret = load_xbc_from_initrd(fd, &buf);
+ if (ret < 0)
+ printf("Failed to load a boot config from initrd: %d\n", ret);
+ else
+ xbc_show_compact_tree();
+
+ close(fd);
+ free(buf);
+
+ return ret;
+}
+
+int delete_xbc(const char *path)
+{
+ struct stat stat;
+ int ret = 0, fd, size;
+ char *buf = NULL;
+
+ fd = open(path, O_RDWR);
+ if (fd < 0) {
+ printf("Failed to open initrd %s: %d\n", path, fd);
+ return -errno;
+ }
+
+ /*
+ * Suppress error messages in xbc_init() because it can be just a
+ * data which coincidentally matches the size and checksum footer.
+ */
+ pr_output = 0;
+ size = load_xbc_from_initrd(fd, &buf);
+ pr_output = 1;
+ if (size < 0) {
+ ret = size;
+ printf("Failed to load a boot config from initrd: %d\n", ret);
+ } else if (size > 0) {
+ ret = fstat(fd, &stat);
+ if (!ret)
+ ret = ftruncate(fd, stat.st_size - size - 8);
+ if (ret)
+ ret = -errno;
+ } /* Ignore if there is no boot config in initrd */
+
+ close(fd);
+ free(buf);
+
+ return ret;
+}
+
+int apply_xbc(const char *path, const char *xbc_path)
+{
+ u32 size, csum;
+ char *buf, *data;
+ int ret, fd;
+
+ ret = load_xbc_file(xbc_path, &buf);
+ if (ret < 0) {
+ printf("Failed to load %s : %d\n", xbc_path, ret);
+ return ret;
+ }
+ size = strlen(buf) + 1;
+ csum = checksum((unsigned char *)buf, size);
+
+ /* Prepare xbc_path data */
+ data = malloc(size + 8);
+ if (!data)
+ return -ENOMEM;
+ strcpy(data, buf);
+ *(u32 *)(data + size) = size;
+ *(u32 *)(data + size + 4) = csum;
+
+ /* Check the data format */
+ ret = xbc_init(buf);
+ if (ret < 0) {
+ printf("Failed to parse %s: %d\n", xbc_path, ret);
+ free(data);
+ free(buf);
+ return ret;
+ }
+ printf("Apply %s to %s\n", xbc_path, path);
+ printf("\tSize: %u bytes\n", (unsigned int)size);
+ printf("\tChecksum: %u\n", (unsigned int)csum);
+
+ /* TODO: Check the options by schema */
+ xbc_destroy_all();
+ free(buf);
+
+ /* Remove old boot config if exists */
+ ret = delete_xbc(path);
+ if (ret < 0) {
+ printf("Failed to delete previous boot config: %d\n", ret);
+ return ret;
+ }
+
+ /* Apply new one */
+ fd = open(path, O_RDWR | O_APPEND);
+ if (fd < 0) {
+ printf("Failed to open %s: %d\n", path, fd);
+ return fd;
+ }
+ /* TODO: Ensure the @path is initramfs/initrd image */
+ ret = write(fd, data, size + 8);
+ if (ret < 0) {
+ printf("Failed to apply a boot config: %d\n", ret);
+ return ret;
+ }
+ close(fd);
+ free(data);
+
+ return 0;
+}
+
+int usage(void)
+{
+ printf("Usage: bootconfig [OPTIONS] <INITRD>\n"
+ " Apply, delete or show boot config to initrd.\n"
+ " Options:\n"
+ " -a <config>: Apply boot config to initrd\n"
+ " -d : Delete boot config file from initrd\n\n"
+ " If no option is given, show current applied boot config.\n");
+ return -1;
+}
+
+int main(int argc, char **argv)
+{
+ char *path = NULL;
+ char *apply = NULL;
+ bool delete = false;
+ int opt;
+
+ while ((opt = getopt(argc, argv, "hda:")) != -1) {
+ switch (opt) {
+ case 'd':
+ delete = true;
+ break;
+ case 'a':
+ apply = optarg;
+ break;
+ case 'h':
+ default:
+ return usage();
+ }
+ }
+
+ if (apply && delete) {
+ printf("Error: You can not specify both -a and -d at once.\n");
+ return usage();
+ }
+
+ if (optind >= argc) {
+ printf("Error: No initrd is specified.\n");
+ return usage();
+ }
+
+ path = argv[optind];
+
+ if (apply)
+ return apply_xbc(path, apply);
+ else if (delete)
+ return delete_xbc(path);
+
+ return show_xbc(path);
+}
+

2020-01-10 16:07:14

by Masami Hiramatsu

Subject: [PATCH v6 10/22] tracing: Apply soft-disabled and filter to tracepoints printk

Apply the soft-disabled state and the filter rule of trace events to
the printk output of tracepoints (a.k.a. the tp_printk kernel parameter)
in the same way as they are applied to the trace buffer output.
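
The decision added to output_printk() can be sketched in userspace as
follows. This is a simplified model only: the flag values, struct
event_file, should_print() and pid_below_128() are hypothetical
stand-ins for the kernel's trace_event_file flags and
filter_match_preds().

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for EVENT_FILE_FL_* */
enum {
	FL_SOFT_DISABLED = 1 << 0,
	FL_FILTERED	 = 1 << 1,
};

/* Hypothetical, much reduced model of struct trace_event_file */
struct event_file {
	unsigned long flags;
	bool (*match)(const void *rec);	/* stands in for the filter */
};

/* Example filter: accept records whose pid is below 128 */
static bool pid_below_128(const void *rec)
{
	return *(const int *)rec < 128;
}

/* Print only if the event is not soft-disabled and passes its filter */
static bool should_print(const struct event_file *f, const void *rec)
{
	if (f->flags & FL_SOFT_DISABLED)
		return false;
	if ((f->flags & FL_FILTERED) && !f->match(rec))
		return false;
	return true;
}
```

An event without a filter and without the soft-disabled flag is always
printed, so the behavior for existing tp_printk users that set neither
stays the same in this model.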

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/trace.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ddb7e7f5fe8d..43f0f255ad66 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2610,6 +2610,7 @@ static DEFINE_MUTEX(tracepoint_printk_mutex);
static void output_printk(struct trace_event_buffer *fbuffer)
{
struct trace_event_call *event_call;
+ struct trace_event_file *file;
struct trace_event *event;
unsigned long flags;
struct trace_iterator *iter = tracepoint_print_iter;
@@ -2623,6 +2624,12 @@ static void output_printk(struct trace_event_buffer *fbuffer)
!event_call->event.funcs->trace)
return;

+ file = fbuffer->trace_file;
+ if (test_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags) ||
+ (unlikely(file->flags & EVENT_FILE_FL_FILTERED) &&
+ !filter_match_preds(file->filter, fbuffer->entry)))
+ return;
+
event = &fbuffer->trace_file->event_call->event;

spin_lock_irqsave(&tracepoint_iter_lock, flags);

2020-01-10 16:08:07

by Masami Hiramatsu

Subject: [PATCH v6 06/22] init/main.c: Alloc initcall_command_line in do_initcall() and free it

Since initcall_command_line is only used as a temporary buffer,
it can be freed after use. Allocate it in do_initcalls() and
free it once all initcall levels have been processed.
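
The allocate, restore, parse, free pattern can be sketched in userspace
as below. Here strtok() stands in for parse_args(), since both write
into the buffer they parse; count_tokens() and parse_all_levels() are
hypothetical names used only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Destructive tokenizer standing in for parse_args(): writes NULs into buf */
static int count_tokens(char *buf)
{
	int n = 0;
	char *tok;

	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " "))
		n++;
	return n;
}

/*
 * Parse the same saved line once per level using one scratch buffer.
 * Returns the total number of tokens seen, or -1 on allocation failure.
 */
static int parse_all_levels(const char *saved, int levels)
{
	size_t len = strlen(saved) + 1;
	char *scratch = malloc(len);
	int level, total = 0;

	if (!scratch)
		return -1;

	for (level = 0; level < levels; level++) {
		/* The parser modified scratch, so restore it each time */
		memcpy(scratch, saved, len);
		total += count_tokens(scratch);
	}

	free(scratch);	/* temporary buffer is no longer needed */
	return total;
}
```

Without the memcpy() inside the loop, the second pass would only see
the first token, because the first pass has already replaced the
separators with NUL bytes.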

Signed-off-by: Masami Hiramatsu <[email protected]>
---
init/main.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/init/main.c b/init/main.c
index 59c418a57f92..0b4e0c8ccf16 100644
--- a/init/main.c
+++ b/init/main.c
@@ -137,8 +137,6 @@ char __initdata boot_command_line[COMMAND_LINE_SIZE];
char *saved_command_line;
/* Command line for parameter parsing */
static char *static_command_line;
-/* Command line for per-initcall parameter parsing */
-static char *initcall_command_line;

static char *execute_command;
static char *ramdisk_execute_command;
@@ -433,10 +431,6 @@ static void __init setup_command_line(char *command_line)
if (!saved_command_line)
panic("%s: Failed to allocate %zu bytes\n", __func__, len);

- initcall_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
- if (!initcall_command_line)
- panic("%s: Failed to allocate %zu bytes\n", __func__, len);
-
static_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
if (!static_command_line)
panic("%s: Failed to allocate %zu bytes\n", __func__, len);
@@ -1044,13 +1038,12 @@ static const char *initcall_level_names[] __initdata = {
"late",
};

-static void __init do_initcall_level(int level)
+static void __init do_initcall_level(int level, char *command_line)
{
initcall_entry_t *fn;

- strcpy(initcall_command_line, saved_command_line);
parse_args(initcall_level_names[level],
- initcall_command_line, __start___param,
+ command_line, __start___param,
__stop___param - __start___param,
level, level,
NULL, &repair_env_string);
@@ -1063,9 +1056,20 @@ static void __init do_initcall_level(int level)
static void __init do_initcalls(void)
{
int level;
+ size_t len = strlen(saved_command_line) + 1;
+ char *command_line;
+
+ command_line = kzalloc(len, GFP_KERNEL);
+ if (!command_line)
+ panic("%s: Failed to allocate %zu bytes\n", __func__, len);
+
+ for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++) {
+ /* Parser modifies command_line, restore it each time */
+ strcpy(command_line, saved_command_line);
+ do_initcall_level(level, command_line);
+ }

- for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
- do_initcall_level(level);
+ kfree(command_line);
}

/*

2020-01-10 16:08:10

by Masami Hiramatsu

Subject: [PATCH v6 15/22] tracing/boot: Add boot-time tracing

Set up tracing options via the extra boot config in addition to the
kernel command line.

This adds support for the following options. They are applied to
the global trace instance.

- ftrace.options = OPT1[,OPT2...]
Enable given ftrace options.

- ftrace.trace_clock = CLOCK
Set given CLOCK to ftrace's trace_clock.

- ftrace.buffer_size = SIZE
Configure ftrace buffer size to SIZE. You can use "KB" or "MB"
for that SIZE.

- ftrace.events = EVENT[, EVENT2...]
Enable given events on boot. You can use a wild card in EVENT.

- ftrace.tracer = TRACER
Set TRACER to current tracer on boot. (e.g. function)

Note that this does NOT replace the kernel parameters, because
the boot config based settings are applied later than the kernel
parameters. If you want to trace earlier boot events, you still need
the kernel parameters.
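
How a buffer_size value such as "64KB" or "1MB" turns into a byte count
can be sketched as below. The patch itself calls the kernel's
memparse(); parse_size() is a hypothetical userspace approximation that
handles only the K, M and G suffixes and ignores any trailing "B".

```c
#include <stdlib.h>

/*
 * Userspace approximation of size parsing (hypothetical helper).
 * Reads a number, then scales it by an optional K/M/G suffix.
 * Anything after the suffix, such as the "B" in "KB", is ignored.
 */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);
	unsigned int shift = 0;

	switch (*end) {
	case 'G': case 'g':
		shift = 30;
		break;
	case 'M': case 'm':
		shift = 20;
		break;
	case 'K': case 'k':
		shift = 10;
		break;
	default:
		break;
	}
	return v << shift;
}
```

With this sketch "64KB" gives 65536 and "1MB" gives 1048576, while a
plain "4096" is taken as a byte count.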

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v4:
- Remove parameter which is not related to instance.
- Use bootconfig.
---
kernel/trace/Kconfig | 9 ++++
kernel/trace/Makefile | 1
kernel/trace/trace.c | 10 ++--
kernel/trace/trace_boot.c | 113 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 128 insertions(+), 5 deletions(-)
create mode 100644 kernel/trace/trace_boot.c

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 25a0fcfa7a5d..75326d8ab1af 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -814,6 +814,15 @@ config GCOV_PROFILE_FTRACE
Note that on a kernel compiled with this config, ftrace will
run significantly slower.

+config BOOTTIME_TRACING
+ bool "Boot-time Tracing support"
+ depends on BOOT_CONFIG && TRACING
+ default y
+ help
+ Enable developer to setup ftrace subsystem via supplemental
+ kernel cmdline at boot time for debugging (tracing) driver
+ initialization and boot process.
+
endif # FTRACE

endif # TRACING_SUPPORT
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 0e63db62225f..395e2db9c742 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -83,6 +83,7 @@ endif
obj-$(CONFIG_DYNAMIC_EVENTS) += trace_dynevent.o
obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
+obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o

obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 970bac1299b5..c2e1b33aec17 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -162,7 +162,7 @@ union trace_eval_map_item {
static union trace_eval_map_item *trace_eval_maps;
#endif /* CONFIG_TRACE_EVAL_MAP_FILE */

-static int tracing_set_tracer(struct trace_array *tr, const char *buf);
+int tracing_set_tracer(struct trace_array *tr, const char *buf);
static void ftrace_trace_userstack(struct ring_buffer *buffer,
unsigned long flags, int pc);

@@ -4747,7 +4747,7 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
return 0;
}

-static int trace_set_options(struct trace_array *tr, char *option)
+int trace_set_options(struct trace_array *tr, char *option)
{
char *cmp;
int neg = 0;
@@ -5647,8 +5647,8 @@ static int __tracing_resize_ring_buffer(struct trace_array *tr,
return ret;
}

-static ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
- unsigned long size, int cpu_id)
+ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
+ unsigned long size, int cpu_id)
{
int ret = size;

@@ -5727,7 +5727,7 @@ static void add_tracer_options(struct trace_array *tr, struct tracer *t)
create_trace_option_files(tr, t);
}

-static int tracing_set_tracer(struct trace_array *tr, const char *buf)
+int tracing_set_tracer(struct trace_array *tr, const char *buf)
{
struct tracer *t;
#ifdef CONFIG_TRACER_MAX_TRACE
diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
new file mode 100644
index 000000000000..4b41310184df
--- /dev/null
+++ b/kernel/trace/trace_boot.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * trace_boot.c
+ * Tracing kernel boot-time
+ */
+
+#define pr_fmt(fmt) "trace_boot: " fmt
+
+#include <linux/ftrace.h>
+#include <linux/init.h>
+#include <linux/bootconfig.h>
+
+#include "trace.h"
+
+#define MAX_BUF_LEN 256
+
+extern int trace_set_options(struct trace_array *tr, char *option);
+extern int tracing_set_tracer(struct trace_array *tr, const char *buf);
+extern ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
+ unsigned long size, int cpu_id);
+
+static void __init
+trace_boot_set_ftrace_options(struct trace_array *tr, struct xbc_node *node)
+{
+ struct xbc_node *anode;
+ const char *p;
+ char buf[MAX_BUF_LEN];
+ unsigned long v = 0;
+
+ /* Common ftrace options */
+ xbc_node_for_each_array_value(node, "options", anode, p) {
+ if (strlcpy(buf, p, ARRAY_SIZE(buf)) >= ARRAY_SIZE(buf)) {
+ pr_err("String is too long: %s\n", p);
+ continue;
+ }
+
+ if (trace_set_options(tr, buf) < 0)
+ pr_err("Failed to set option: %s\n", buf);
+ }
+
+ p = xbc_node_find_value(node, "trace_clock", NULL);
+ if (p && *p != '\0') {
+ if (tracing_set_clock(tr, p) < 0)
+ pr_err("Failed to set trace clock: %s\n", p);
+ }
+
+ p = xbc_node_find_value(node, "buffer_size", NULL);
+ if (p && *p != '\0') {
+ v = memparse(p, NULL);
+ if (v < PAGE_SIZE)
+ pr_err("Buffer size is too small: %s\n", p);
+ if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0)
+ pr_err("Failed to resize trace buffer to %s\n", p);
+ }
+}
+
+#ifdef CONFIG_EVENT_TRACING
+extern int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
+
+static void __init
+trace_boot_enable_events(struct trace_array *tr, struct xbc_node *node)
+{
+ struct xbc_node *anode;
+ char buf[MAX_BUF_LEN];
+ const char *p;
+
+ xbc_node_for_each_array_value(node, "events", anode, p) {
+ if (strlcpy(buf, p, ARRAY_SIZE(buf)) >= ARRAY_SIZE(buf)) {
+ pr_err("String is too long: %s\n", p);
+ continue;
+ }
+
+ if (ftrace_set_clr_event(tr, buf, 1) < 0)
+ pr_err("Failed to enable event: %s\n", p);
+ }
+}
+#else
+#define trace_boot_enable_events(tr, node) do {} while (0)
+#endif
+
+static void __init
+trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node)
+{
+ const char *p;
+
+ p = xbc_node_find_value(node, "tracer", NULL);
+ if (p && *p != '\0') {
+ if (tracing_set_tracer(tr, p) < 0)
+ pr_err("Failed to set given tracer: %s\n", p);
+ }
+}
+
+static int __init trace_boot_init(void)
+{
+ struct xbc_node *trace_node;
+ struct trace_array *tr;
+
+ trace_node = xbc_find_node("ftrace");
+ if (!trace_node)
+ return 0;
+
+ tr = top_trace_array();
+ if (!tr)
+ return 0;
+
+ trace_boot_set_ftrace_options(tr, trace_node);
+ trace_boot_enable_events(tr, trace_node);
+ trace_boot_enable_tracer(tr, trace_node);
+
+ return 0;
+}
+
+fs_initcall(trace_boot_init);

2020-01-10 16:08:11

by Masami Hiramatsu

Subject: [PATCH v6 16/22] tracing/boot: Add per-event settings

Add per-event settings for boot-time tracing. The user can set a
filter and actions on each event, and enable it, at boot. The event
entries are under the ftrace.event.GROUP.EVENT node (note that the
option key includes both the event's group name and the event name).
The following options are supported.

- ftrace.event.GROUP.EVENT.enable
Enables GROUP:EVENT tracing.

- ftrace.event.GROUP.EVENT.filter = FILTER
Set FILTER rule to the GROUP:EVENT.

- ftrace.event.GROUP.EVENT.actions = ACTION[, ACTION2...]
Set ACTIONs to the GROUP:EVENT.

For example,

ftrace.event.sched.sched_process_exec {
filter = "pid < 128"
enable
}

This enables tracing of the "sched:sched_process_exec" event
with the "pid < 128" filter.
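
To show how a flat key maps onto the GROUP, EVENT and option names, here
is a small userspace sketch. Note that the patch does not parse strings
this way; it walks the parsed tree with xbc_node_for_each_child().
struct event_key and split_event_key() are hypothetical and exist only
for this illustration.

```c
#include <stdio.h>

#define XKEY_LEN 64

/* Hypothetical holder for the three parts of a per-event key */
struct event_key {
	char group[XKEY_LEN];
	char event[XKEY_LEN];
	char option[XKEY_LEN];
};

/*
 * Split "ftrace.event.GROUP.EVENT.OPTION" into its parts.
 * Returns 0 on success, -1 if the key does not have that shape.
 */
static int split_event_key(const char *key, struct event_key *out)
{
	int n = sscanf(key, "ftrace.event.%63[^.].%63[^.].%63s",
		       out->group, out->event, out->option);

	return n == 3 ? 0 : -1;
}
```

For "ftrace.event.sched.sched_process_exec.filter" this gives the group
"sched", the event "sched_process_exec" and the option "filter", which
corresponds to the GROUP:EVENT pair looked up by find_event_file().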

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/trace_boot.c | 60 +++++++++++++++++++++++++++++++++++
kernel/trace/trace_events_trigger.c | 2 +
2 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index 4b41310184df..37524031533e 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -56,6 +56,7 @@ trace_boot_set_ftrace_options(struct trace_array *tr, struct xbc_node *node)

#ifdef CONFIG_EVENT_TRACING
extern int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
+extern int trigger_process_regex(struct trace_event_file *file, char *buff);

static void __init
trace_boot_enable_events(struct trace_array *tr, struct xbc_node *node)
@@ -74,8 +75,66 @@ trace_boot_enable_events(struct trace_array *tr, struct xbc_node *node)
pr_err("Failed to enable event: %s\n", p);
}
}
+
+static void __init
+trace_boot_init_one_event(struct trace_array *tr, struct xbc_node *gnode,
+ struct xbc_node *enode)
+{
+ struct trace_event_file *file;
+ struct xbc_node *anode;
+ char buf[MAX_BUF_LEN];
+ const char *p, *group, *event;
+
+ group = xbc_node_get_data(gnode);
+ event = xbc_node_get_data(enode);
+
+ mutex_lock(&event_mutex);
+ file = find_event_file(tr, group, event);
+ if (!file) {
+ pr_err("Failed to find event: %s:%s\n", group, event);
+ goto out;
+ }
+
+ p = xbc_node_find_value(enode, "filter", NULL);
+ if (p && *p != '\0') {
+ if (strlcpy(buf, p, ARRAY_SIZE(buf)) >= ARRAY_SIZE(buf))
+ pr_err("filter string is too long: %s\n", p);
+ else if (apply_event_filter(file, buf) < 0)
+ pr_err("Failed to apply filter: %s\n", buf);
+ }
+
+ xbc_node_for_each_array_value(enode, "actions", anode, p) {
+ if (strlcpy(buf, p, ARRAY_SIZE(buf)) >= ARRAY_SIZE(buf))
+ pr_err("action string is too long: %s\n", p);
+ else if (trigger_process_regex(file, buf) < 0)
+ pr_err("Failed to apply an action: %s\n", buf);
+ }
+
+ if (xbc_node_find_value(enode, "enable", NULL)) {
+ if (trace_event_enable_disable(file, 1, 0) < 0)
+ pr_err("Failed to enable event node: %s:%s\n",
+ group, event);
+ }
+out:
+ mutex_unlock(&event_mutex);
+}
+
+static void __init
+trace_boot_init_events(struct trace_array *tr, struct xbc_node *node)
+{
+ struct xbc_node *gnode, *enode;
+
+ node = xbc_node_find_child(node, "event");
+ if (!node)
+ return;
+ /* per-event key starts with "event.GROUP.EVENT" */
+ xbc_node_for_each_child(node, gnode)
+ xbc_node_for_each_child(gnode, enode)
+ trace_boot_init_one_event(tr, gnode, enode);
+}
#else
#define trace_boot_enable_events(tr, node) do {} while (0)
+#define trace_boot_init_events(tr, node) do {} while (0)
#endif

static void __init
@@ -104,6 +163,7 @@ static int __init trace_boot_init(void)
return 0;

trace_boot_set_ftrace_options(tr, trace_node);
+ trace_boot_init_events(tr, trace_node);
trace_boot_enable_events(tr, trace_node);
trace_boot_enable_tracer(tr, trace_node);

diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 2cd53ca21b51..d8ada4c6f3f7 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -213,7 +213,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
return ret;
}

-static int trigger_process_regex(struct trace_event_file *file, char *buff)
+int trigger_process_regex(struct trace_event_file *file, char *buff)
{
char *command, *next = buff;
struct event_command *p;

2020-01-10 16:08:12

by Masami Hiramatsu

Subject: [PATCH v6 17/22] tracing/boot: Add kprobe event support

Add kprobe event support on event nodes to boot-time tracing.
If the group name of an event is "kprobes", boot-time tracing
defines a new probe event according to the "probes" values.

- ftrace.event.kprobes.EVENT.probes = PROBE[, PROBE2...]
Defines a new kprobe event based on PROBEs. Multiple probes can be
defined on one event, but they must have the same type of
arguments.

For example,

ftrace.event.kprobes.myevent {
probes = "vfs_read $arg1 $arg2";
enable;
}

This will add kprobes:myevent on vfs_read with the 1st and the 2nd
arguments.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
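The command string that the boot-time code hands to the kprobe event
parser has the form "p:kprobes/EVENT PROBE". A minimal userspace sketch
of building such a string with truncation checking follows; it is not the
kernel function, and build_kprobe_cmd() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "p:kprobes/EVENT PROBE" into buf (hypothetical helper).
 * Returns 0 on success, -E2BIG if the result does not fit in size
 * bytes, and -EINVAL if snprintf() itself reports an error.
 */
static int build_kprobe_cmd(char *buf, size_t size,
			    const char *event, const char *probe)
{
	int len = snprintf(buf, size, "p:kprobes/%s %s", event, probe);

	if (len < 0)
		return -EINVAL;
	/* snprintf() returns the untruncated length, so >= size means cut. */
	if ((size_t)len >= size)
		return -E2BIG;
	return 0;
}
```

For the example above, build_kprobe_cmd(buf, sizeof(buf), "myevent",
"vfs_read $arg1 $arg2") leaves "p:kprobes/myevent vfs_read $arg1 $arg2"
in buf, provided buf is large enough. The comparison is made against the
same size that was passed to snprintf(), so truncation cannot go unnoticed.
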
kernel/trace/trace_boot.c | 46 +++++++++++++++++++++++++++++++++++++++++++
kernel/trace/trace_kprobe.c | 5 +++++
2 files changed, 51 insertions(+)

diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index 37524031533e..a11dc60299fb 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -76,6 +76,48 @@ trace_boot_enable_events(struct trace_array *tr, struct xbc_node *node)
}
}

+#ifdef CONFIG_KPROBE_EVENTS
+extern int trace_kprobe_run_command(const char *command);
+
+static int __init
+trace_boot_add_kprobe_event(struct xbc_node *node, const char *event)
+{
+ struct xbc_node *anode;
+ char buf[MAX_BUF_LEN];
+ const char *val;
+ char *p;
+ int len;
+
+ len = snprintf(buf, ARRAY_SIZE(buf), "p:kprobes/%s ", event);
+ if (len >= ARRAY_SIZE(buf)) {
+ pr_err("Event name is too long: %s\n", event);
+ return -E2BIG;
+ }
+ p = buf + len;
+ len = ARRAY_SIZE(buf) - len;
+
+ xbc_node_for_each_array_value(node, "probes", anode, val) {
+ if (strlcpy(p, val, len) >= len) {
+ pr_err("Probe definition is too long: %s\n", val);
+ return -E2BIG;
+ }
+ if (trace_kprobe_run_command(buf) < 0) {
+ pr_err("Failed to add probe: %s\n", buf);
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+#else
+static inline int __init
+trace_boot_add_kprobe_event(struct xbc_node *node, const char *event)
+{
+ pr_err("Kprobe event is not supported.\n");
+ return -ENOTSUPP;
+}
+#endif
+
static void __init
trace_boot_init_one_event(struct trace_array *tr, struct xbc_node *gnode,
struct xbc_node *enode)
@@ -88,6 +130,10 @@ trace_boot_init_one_event(struct trace_array *tr, struct xbc_node *gnode,
group = xbc_node_get_data(gnode);
event = xbc_node_get_data(enode);

+ if (!strcmp(group, "kprobes"))
+ if (trace_boot_add_kprobe_event(enode, event) < 0)
+ return;
+
mutex_lock(&event_mutex);
file = find_event_file(tr, group, event);
if (!file) {
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5584405b899d..318a3579a928 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -902,6 +902,11 @@ static int create_or_delete_trace_kprobe(int argc, char **argv)
return ret == -ECANCELED ? -EINVAL : ret;
}

+int trace_kprobe_run_command(const char *command)
+{
+ return trace_run_command(command, create_or_delete_trace_kprobe);
+}
+
static int trace_kprobe_release(struct dyn_event *ev)
{
struct trace_kprobe *tk = to_trace_kprobe(ev);

2020-01-10 16:08:20

by Masami Hiramatsu

Subject: [PATCH v6 14/22] tracing: Add NULL trace-array check in print_synth_event()

Add a NULL trace-array check in print_synth_event(), because
iter->tr can be NULL if the tp_printk option is enabled.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/trace_events_hist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index dae2c25b209a..137fc50f2b35 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -833,7 +833,7 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
fmt = synth_field_fmt(se->fields[i]->type);

/* parameter types */
- if (tr->trace_flags & TRACE_ITER_VERBOSE)
+ if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
trace_seq_printf(s, "%s ", fmt);

snprintf(print_fmt, sizeof(print_fmt), "%%s=%s%%s", fmt);

2020-01-10 16:08:24

by Masami Hiramatsu

Subject: [PATCH v6 18/22] tracing/boot: Add synthetic event support

Add synthetic event node support to boot time tracing.
The synthetic event is a kind of event node, but the group
name is "synthetic".

- ftrace.event.synthetic.EVENT.fields = FIELD[, FIELD2...]
Defines new synthetic event with FIELDs. Each field should be
"type varname".

The synthetic node requires a "fields" string array, which defines
the fields in the same way as the tracing/synth_events interface.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
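The synthetic event command is assembled as the event name followed by
each field and a trailing semicolon ("EVENT type var; type var;"). The
accumulate-and-check pattern can be sketched in userspace as below. This is
an illustration under assumptions, not the kernel code: build_synth_cmd()
is a hypothetical name, and the fields arrive as a plain C array rather
than as bootconfig array nodes.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "EVENT FIELD1; FIELD2; ..." into buf (hypothetical helper).
 * Returns 0 on success or -E2BIG if the command does not fit.
 */
static int build_synth_cmd(char *buf, size_t size, const char *event,
			   const char *const *fields, size_t nfields)
{
	size_t used, i;
	int delta;

	delta = snprintf(buf, size, "%s", event);
	if (delta < 0 || (size_t)delta >= size)
		return -E2BIG;
	used = (size_t)delta;

	for (i = 0; i < nfields; i++) {
		/* Append " FIELD;" after what has been written so far. */
		delta = snprintf(buf + used, size - used, " %s;", fields[i]);
		if (delta < 0 || (size_t)delta >= size - used)
			return -E2BIG;
		used += (size_t)delta;
	}
	return 0;
}
```

For an event "myevent" with the fields "u64 lat" and "pid_t pid", the
resulting command is "myevent u64 lat; pid_t pid;".
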
kernel/trace/trace_boot.c | 47 ++++++++++++++++++++++++++++++++++++++
kernel/trace/trace_events_hist.c | 5 ++++
2 files changed, 52 insertions(+)

diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index a11dc60299fb..3054921b0877 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -118,6 +118,50 @@ trace_boot_add_kprobe_event(struct xbc_node *node, const char *event)
}
#endif

+#ifdef CONFIG_HIST_TRIGGERS
+extern int synth_event_run_command(const char *command);
+
+static int __init
+trace_boot_add_synth_event(struct xbc_node *node, const char *event)
+{
+ struct xbc_node *anode;
+ char buf[MAX_BUF_LEN], *q;
+ const char *p;
+ int len, delta, ret;
+
+ len = ARRAY_SIZE(buf);
+ delta = snprintf(buf, len, "%s", event);
+ if (delta >= len) {
+ pr_err("Event name is too long: %s\n", event);
+ return -E2BIG;
+ }
+ len -= delta; q = buf + delta;
+
+ xbc_node_for_each_array_value(node, "fields", anode, p) {
+ delta = snprintf(q, len, " %s;", p);
+ if (delta >= len) {
+ pr_err("fields string is too long: %s\n", p);
+ return -E2BIG;
+ }
+ len -= delta; q += delta;
+ }
+
+ ret = synth_event_run_command(buf);
+ if (ret < 0)
+ pr_err("Failed to add synthetic event: %s\n", buf);
+
+
+ return ret;
+}
+#else
+static inline int __init
+trace_boot_add_synth_event(struct xbc_node *node, const char *event)
+{
+ pr_err("Synthetic event is not supported.\n");
+ return -ENOTSUPP;
+}
+#endif
+
static void __init
trace_boot_init_one_event(struct trace_array *tr, struct xbc_node *gnode,
struct xbc_node *enode)
@@ -133,6 +177,9 @@ trace_boot_init_one_event(struct trace_array *tr, struct xbc_node *gnode,
if (!strcmp(group, "kprobes"))
if (trace_boot_add_kprobe_event(enode, event) < 0)
return;
+ if (!strcmp(group, "synthetic"))
+ if (trace_boot_add_synth_event(enode, event) < 0)
+ return;

mutex_lock(&event_mutex);
file = find_event_file(tr, group, event);
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 137fc50f2b35..3f26c4ed212a 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1384,6 +1384,11 @@ static int create_or_delete_synth_event(int argc, char **argv)
return ret == -ECANCELED ? -EINVAL : ret;
}

+int synth_event_run_command(const char *command)
+{
+ return trace_run_command(command, create_or_delete_synth_event);
+}
+
static int synth_event_create(int argc, const char **argv)
{
const char *name = argv[0];

2020-01-10 16:08:32

by Masami Hiramatsu

Subject: [PATCH v6 19/22] tracing/boot: Add instance node support

Add instance node support to boot-time tracing. Users can set
some options and event nodes under an instance node.

- ftrace.instance.INSTANCE[...]
Add a new INSTANCE instance. Some options and event nodes
are acceptable for an instance node.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v4:
- Use trace_array_get_by_name() instead of trace_array_create().
- Remove global boot option setting.
---
kernel/trace/trace_boot.c | 43 ++++++++++++++++++++++++++++++++++++++-----
1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index 3054921b0877..f5db30d25b0b 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -20,7 +20,7 @@ extern ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
unsigned long size, int cpu_id);

static void __init
-trace_boot_set_ftrace_options(struct trace_array *tr, struct xbc_node *node)
+trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node)
{
struct xbc_node *anode;
const char *p;
@@ -242,6 +242,40 @@ trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node)
}
}

+static void __init
+trace_boot_init_one_instance(struct trace_array *tr, struct xbc_node *node)
+{
+ trace_boot_set_instance_options(tr, node);
+ trace_boot_init_events(tr, node);
+ trace_boot_enable_events(tr, node);
+ trace_boot_enable_tracer(tr, node);
+}
+
+static void __init
+trace_boot_init_instances(struct xbc_node *node)
+{
+ struct xbc_node *inode;
+ struct trace_array *tr;
+ const char *p;
+
+ node = xbc_node_find_child(node, "instance");
+ if (!node)
+ return;
+
+ xbc_node_for_each_child(node, inode) {
+ p = xbc_node_get_data(inode);
+ if (!p || *p == '\0')
+ continue;
+
+ tr = trace_array_get_by_name(p);
+ if (IS_ERR(tr)) {
+ pr_err("Failed to get trace instance %s\n", p);
+ continue;
+ }
+ trace_boot_init_one_instance(tr, inode);
+ }
+}
+
static int __init trace_boot_init(void)
{
struct xbc_node *trace_node;
@@ -255,10 +289,9 @@ static int __init trace_boot_init(void)
if (!tr)
return 0;

- trace_boot_set_ftrace_options(tr, trace_node);
- trace_boot_init_events(tr, trace_node);
- trace_boot_enable_events(tr, trace_node);
- trace_boot_enable_tracer(tr, trace_node);
+ /* Global trace array is also one instance */
+ trace_boot_init_one_instance(tr, trace_node);
+ trace_boot_init_instances(trace_node);

return 0;
}

2020-01-10 16:09:05

by Masami Hiramatsu

Subject: [PATCH v6 21/22] tracing/boot: Add function tracer filter options

Add the function-tracer filter options below to boot-time tracing.

- ftrace.[instance.INSTANCE.]ftrace.filters
This takes an array of filter rules for functions to trace.

- ftrace.[instance.INSTANCE.]ftrace.notraces
This takes an array of filter rules for functions NOT to trace.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v6:
- Fix to depend on CONFIG_DYNAMIC_FTRACE instead of
CONFIG_FUNCTION_TRACER.
---
kernel/trace/trace_boot.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)

diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index 81d923c16a4d..fa9603dc6469 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -244,11 +244,51 @@ trace_boot_init_events(struct trace_array *tr, struct xbc_node *node)
#define trace_boot_init_events(tr, node) do {} while (0)
#endif

+#ifdef CONFIG_DYNAMIC_FTRACE
+extern bool ftrace_filter_param __initdata;
+extern int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
+ int len, int reset);
+extern int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
+ int len, int reset);
+static void __init
+trace_boot_set_ftrace_filter(struct trace_array *tr, struct xbc_node *node)
+{
+ struct xbc_node *anode;
+ const char *p;
+ char *q;
+
+ xbc_node_for_each_array_value(node, "ftrace.filters", anode, p) {
+ q = kstrdup(p, GFP_KERNEL);
+ if (!q)
+ return;
+ if (ftrace_set_filter(tr->ops, q, strlen(q), 0) < 0)
+ pr_err("Failed to add %s to ftrace filter\n", p);
+ else
+ ftrace_filter_param = true;
+ kfree(q);
+ }
+ xbc_node_for_each_array_value(node, "ftrace.notraces", anode, p) {
+ q = kstrdup(p, GFP_KERNEL);
+ if (!q)
+ return;
+ if (ftrace_set_notrace(tr->ops, q, strlen(q), 0) < 0)
+ pr_err("Failed to add %s to ftrace notrace\n", p);
+ else
+ ftrace_filter_param = true;
+ kfree(q);
+ }
+}
+#else
+#define trace_boot_set_ftrace_filter(tr, node) do {} while (0)
+#endif
+
static void __init
trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node)
{
const char *p;

+ trace_boot_set_ftrace_filter(tr, node);
+
p = xbc_node_find_value(node, "tracer", NULL);
if (p && *p != '\0') {
if (tracing_set_tracer(tr, p) < 0)

2020-01-10 16:09:05

by Masami Hiramatsu

Subject: [PATCH v6 09/22] Documentation: bootconfig: Add a doc for extended boot config

Add documentation for the extended boot config under
admin-guide, since it includes the syntax of boot config.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v6:
- Add a note about comment after value.
Changes in v5:
- Fix to insert bootconfig to TOC list alphabetically.
- Add notes about available characters in values.
- Fix to use correct quotes (``) for .rst.
Changes in v4:
- Rename supplemental kernel command line to boot config.
- Update document according to the recent changes.
- Add How to load it on boot.
- Style bugfix.
---
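The node arithmetic in the "Config File Limitation" section of the new
document can be checked with a short sketch. It assumes every entry costs
one node per key word plus one node for its value and ignores the savings
from merged key prefixes, so it gives a lower bound. max_pairs() is a
hypothetical helper, not a bootconfig API.

```c
#include <assert.h>

/*
 * Lower bound on the number of key-value pairs that fit into
 * node_limit nodes when each key has words_per_key words and one
 * value. Shared key prefixes are not taken into account.
 */
static unsigned int max_pairs(unsigned int node_limit,
			      unsigned int words_per_key)
{
	return node_limit / (words_per_key + 1);
}
```

With the 1024-node limit, max_pairs(1024, 1) gives the 512 pairs and
max_pairs(1024, 3) gives the 256 pairs quoted in the document.
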
Documentation/admin-guide/bootconfig.rst | 184 ++++++++++++++++++++++++++++++
Documentation/admin-guide/index.rst | 1
MAINTAINERS | 1
3 files changed, 186 insertions(+)
create mode 100644 Documentation/admin-guide/bootconfig.rst

diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
new file mode 100644
index 000000000000..f7475df2a718
--- /dev/null
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -0,0 +1,184 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
+Boot Configuration
+==================
+
+:Author: Masami Hiramatsu <[email protected]>
+
+Overview
+========
+
+The boot configuration expands the current kernel command line to support
+additional key-value data when booting the kernel in an efficient way.
+This allows administrators to pass a structured-key config file.
+
+Config File Syntax
+==================
+
+The boot config syntax is a simple structured key-value. Each key consists
+of dot-connected-words, and key and value are connected by "=". The value
+has to be terminated by semi-colon (``;``) or newline (``\n``).
+For array value, array entries are separated by comma (``,``). ::
+
+KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]
+
+Each key word must contain only alphabetic characters, numbers, dash (``-``)
+or underscore (``_``). And each value only contains printable characters or
+spaces except for delimiters such as semi-colon (``;``), new-line (``\n``),
+comma (``,``), hash (``#``) and closing brace (``}``).
+
+If you want to use those delimiters in a value, you can use either double-
+quotes (``"VALUE"``) or single-quotes (``'VALUE'``) to quote it. Note that
+you can not escape these quotes.
+
+There can be a key which has no value or an empty value. Such keys
+are used for checking whether the key exists or not (like a boolean).
+
+Key-Value Syntax
+----------------
+
+The boot config file syntax allows users to merge keys that share the same
+leading words by using braces. For example::
+
+ foo.bar.baz = value1
+ foo.bar.qux.quux = value2
+
+These can also be written as::
+
+ foo.bar {
+ baz = value1
+ qux.quux = value2
+ }
+
+Or, even shorter, as follows::
+
+ foo.bar { baz = value1; qux.quux = value2 }
+
+In both styles, same key words are automatically merged when parsing it
+at boot time. So you can append similar trees or key-values.
+
+Comments
+--------
+
+The config syntax accepts shell-script style comments. Everything from a
+hash (``#``) to the following newline (``\n``) is ignored.
+
+::
+
+ # comment line
+ foo = value # value is set to foo.
+ bar = 1, # 1st element
+ 2, # 2nd element
+ 3 # 3rd element
+
+This is parsed as below::
+
+ foo = value
+ bar = 1, 2, 3
+
+Note that you cannot put a comment between a value and its delimiter (``,`` or
+``;``). This means the following config has a syntax error ::
+
+ key = 1 # comment
+ ,2
+
+
+/proc/bootconfig
+================
+
+/proc/bootconfig is a user-space interface of the boot config.
+Unlike /proc/cmdline, this file shows the key-value style list.
+Each key-value pair is shown on its own line in the following style::
+
+ KEY[.WORDS...] = "[VALUE]"[,"VALUE2"...]
+
+
+Boot Kernel With a Boot Config
+==============================
+
+Since the boot configuration file is loaded with initrd, it will be added
+to the end of the initrd (initramfs) image file. The Linux kernel decodes
+the last part of the initrd image in memory to get the boot configuration
+data.
+Because of this "piggyback" method, there is no need to change or
+update the boot loader and the kernel image itself.
+
+To do this operation, the Linux kernel provides the "bootconfig" command under
+tools/bootconfig, which allows an admin to apply or delete the config file
+to/from an initrd image. You can build it with the following command::
+
+ # make -C tools/bootconfig
+
+To add your boot config file to initrd image, run bootconfig as below
+(Old data is removed automatically if it exists)::
+
+ # tools/bootconfig/bootconfig -a your-config /boot/initrd.img-X.Y.Z
+
+To remove the config from the image, you can use -d option as below::
+
+ # tools/bootconfig/bootconfig -d /boot/initrd.img-X.Y.Z
+
+
+Config File Limitation
+======================
+
+Currently the maximum config size is 32KB and the total number of key-words
+(not key-value entries) must be under 1024 nodes.
+Note: this is not the number of entries but nodes; an entry consumes at
+least 2 nodes (a key-word and a value). So theoretically, it will be
+up to 512 key-value pairs. If keys contain 3 words on average, it can
+contain 256 key-value pairs. In most cases, the number of config items
+will be under 100 entries and smaller than 8KB, so it would be enough.
+If the node number exceeds 1024, the parser returns an error even if the
+file size is smaller than 32KB.
+Anyway, since the bootconfig command verifies it when appending a boot config
+to the initrd image, the user can notice it before boot.
+
+
+Bootconfig APIs
+===============
+
+Users can query or loop over key-value pairs; it is also possible to find
+a root (prefix) key node and find key-values under that node.
+
+If you have a key string, you can query the value directly with the key
+using xbc_find_value(). If you want to know what keys exist in the boot
+config tree, you can use xbc_for_each_key_value() to iterate key-value pairs.
+Note that you need to use xbc_array_for_each_value() for accessing
+each value of an array, e.g.::
+
+ vnode = NULL;
+ xbc_find_value("key.word", &vnode);
+ if (vnode && xbc_node_is_array(vnode))
+ xbc_array_for_each_value(vnode, value) {
+ printk("%s ", value);
+ }
+
+If you want to focus on keys which have a prefix string, you can use
+xbc_find_node() to find the node matching the prefix key words, and iterate
+keys under the prefix node with xbc_node_for_each_key_value().
+
+But the most typical usage is to get the named value under prefix
+or get the named array under prefix as below::
+
+ root = xbc_find_node("key.prefix");
+ value = xbc_node_find_value(root, "option", &vnode);
+ ...
+ xbc_node_for_each_array_value(root, "array-option", anode, value) {
+ ...
+ }
+
+This accesses a value of "key.prefix.option" and an array of
+"key.prefix.array-option".
+
+Locking is not needed, since the config becomes read-only after it is
+initialized. All data and keys must be copied if you need to modify them.
+
+
+Functions and structures
+========================
+
+.. kernel-doc:: include/linux/bootconfig.h
+.. kernel-doc:: lib/bootconfig.c
+
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 4405b7485312..9e0f1e3fd152 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -64,6 +64,7 @@ configure specific aspects of kernel behavior to your liking.
binderfs
binfmt-misc
blockdev/index
+ bootconfig
braille-console
btmrvl
cgroup-v1/index
diff --git a/MAINTAINERS b/MAINTAINERS
index d0da06bdf3d8..c14a956343b9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15780,6 +15780,7 @@ F: lib/bootconfig.c
F: fs/proc/bootconfig.c
F: include/linux/bootconfig.h
F: tools/bootconfig/*
+F: Documentation/admin-guide/bootconfig.rst

SUN3/3X
M: Sam Creasey <[email protected]>

2020-01-10 16:09:07

by Masami Hiramatsu

Subject: [PATCH v6 12/22] tracing: kprobes: Register to dynevent earlier stage

Register kprobe event to dynevent in subsys_initcall level.
This will allow kernel to register new kprobe events in
fs_initcall level via trace_run_command.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/trace_kprobe.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5899911a5720..5584405b899d 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1685,11 +1685,12 @@ static __init void setup_boot_kprobe_events(void)
enable_boot_kprobe_events();
}

-/* Make a tracefs interface for controlling probe points */
-static __init int init_kprobe_trace(void)
+/*
+ * Register dynevent at subsys_initcall. This allows kernel to setup kprobe
+ * events in fs_initcall without tracefs.
+ */
+static __init int init_kprobe_trace_early(void)
{
- struct dentry *d_tracer;
- struct dentry *entry;
int ret;

ret = dyn_event_register(&trace_kprobe_ops);
@@ -1699,6 +1700,16 @@ static __init int init_kprobe_trace(void)
if (register_module_notifier(&trace_kprobe_module_nb))
return -EINVAL;

+ return 0;
+}
+subsys_initcall(init_kprobe_trace_early);
+
+/* Make a tracefs interface for controlling probe points */
+static __init int init_kprobe_trace(void)
+{
+ struct dentry *d_tracer;
+ struct dentry *entry;
+
d_tracer = tracing_init_dentry();
if (IS_ERR(d_tracer))
return 0;

2020-01-10 16:09:11

by Masami Hiramatsu

Subject: [PATCH v6 11/22] tracing: kprobes: Output kprobe event to printk buffer

Since kprobe-events use event_trigger_unlock_commit_regs() directly,
those events don't show up in the printk buffer if "tp_printk" is set.

Use trace_event_buffer_commit() in kprobe events so that it can
invoke output_printk() in the same way as other trace events.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
include/linux/trace_events.h | 1 +
kernel/trace/trace.c | 4 +--
kernel/trace/trace_events.c | 1 +
kernel/trace/trace_kprobe.c | 57 +++++++++++++++++++++---------------------
4 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 4c6e15605766..5c94b8bacc88 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -216,6 +216,7 @@ struct trace_event_buffer {
void *entry;
unsigned long flags;
int pc;
+ struct pt_regs *regs;
};

void *trace_event_buffer_reserve(struct trace_event_buffer *fbuffer,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 43f0f255ad66..970bac1299b5 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2680,9 +2680,9 @@ void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);

- event_trigger_unlock_commit(fbuffer->trace_file, fbuffer->buffer,
+ event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
fbuffer->event, fbuffer->entry,
- fbuffer->flags, fbuffer->pc);
+ fbuffer->flags, fbuffer->pc, fbuffer->regs);
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index a5b614cc3887..13446a3a7f1e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -272,6 +272,7 @@ void *trace_event_buffer_reserve(struct trace_event_buffer *fbuffer,
if (!fbuffer->event)
return NULL;

+ fbuffer->regs = NULL;
fbuffer->entry = ring_buffer_event_data(fbuffer->event);
return fbuffer->entry;
}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 7f890262c8a3..5899911a5720 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1175,10 +1175,8 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
struct trace_event_file *trace_file)
{
struct kprobe_trace_entry_head *entry;
- struct ring_buffer_event *event;
- struct ring_buffer *buffer;
- int size, dsize, pc;
- unsigned long irq_flags;
+ struct trace_event_buffer fbuffer;
+ int dsize;
struct trace_event_call *call = trace_probe_event_call(&tk->tp);

WARN_ON(call != trace_file->event_call);
@@ -1186,24 +1184,26 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
if (trace_trigger_soft_disabled(trace_file))
return;

- local_save_flags(irq_flags);
- pc = preempt_count();
+ local_save_flags(fbuffer.flags);
+ fbuffer.pc = preempt_count();
+ fbuffer.trace_file = trace_file;

dsize = __get_data_size(&tk->tp, regs);
- size = sizeof(*entry) + tk->tp.size + dsize;

- event = trace_event_buffer_lock_reserve(&buffer, trace_file,
- call->event.type,
- size, irq_flags, pc);
- if (!event)
+ fbuffer.event =
+ trace_event_buffer_lock_reserve(&fbuffer.buffer, trace_file,
+ call->event.type,
+ sizeof(*entry) + tk->tp.size + dsize,
+ fbuffer.flags, fbuffer.pc);
+ if (!fbuffer.event)
return;

- entry = ring_buffer_event_data(event);
+ fbuffer.regs = regs;
+ entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
entry->ip = (unsigned long)tk->rp.kp.addr;
store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);

- event_trigger_unlock_commit_regs(trace_file, buffer, event,
- entry, irq_flags, pc, regs);
+ trace_event_buffer_commit(&fbuffer);
}

static void
@@ -1223,10 +1223,8 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct trace_event_file *trace_file)
{
struct kretprobe_trace_entry_head *entry;
- struct ring_buffer_event *event;
- struct ring_buffer *buffer;
- int size, pc, dsize;
- unsigned long irq_flags;
+ struct trace_event_buffer fbuffer;
+ int dsize;
struct trace_event_call *call = trace_probe_event_call(&tk->tp);

WARN_ON(call != trace_file->event_call);
@@ -1234,25 +1232,26 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
if (trace_trigger_soft_disabled(trace_file))
return;

- local_save_flags(irq_flags);
- pc = preempt_count();
+ local_save_flags(fbuffer.flags);
+ fbuffer.pc = preempt_count();
+ fbuffer.trace_file = trace_file;

dsize = __get_data_size(&tk->tp, regs);
- size = sizeof(*entry) + tk->tp.size + dsize;
-
- event = trace_event_buffer_lock_reserve(&buffer, trace_file,
- call->event.type,
- size, irq_flags, pc);
- if (!event)
+ fbuffer.event =
+ trace_event_buffer_lock_reserve(&fbuffer.buffer, trace_file,
+ call->event.type,
+ sizeof(*entry) + tk->tp.size + dsize,
+ fbuffer.flags, fbuffer.pc);
+ if (!fbuffer.event)
return;

- entry = ring_buffer_event_data(event);
+ fbuffer.regs = regs;
+ entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
entry->func = (unsigned long)tk->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);

- event_trigger_unlock_commit_regs(trace_file, buffer, event,
- entry, irq_flags, pc, regs);
+ trace_event_buffer_commit(&fbuffer);
}

static void

2020-01-10 16:09:26

by Masami Hiramatsu

Subject: [PATCH v6 13/22] tracing: Accept different type for synthetic event fields

Make the synthetic event accept a field of a different type to record.
However, the size and the signed flag must be the same.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
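The relaxed rule can be stated in isolation: a type-name mismatch is
tolerated only when both size and signedness agree. The sketch below
mirrors that condition in userspace; struct field_desc and
check_field_compat() are hypothetical stand-ins for the kernel's
synth_field/hist_field handling, not the actual structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced field description. */
struct field_desc {
	const char *type;
	size_t size;
	bool is_signed;
};

/*
 * Return 0 if a value described by hist may be recorded into synth,
 * -EINVAL otherwise. Identical type names always match; different
 * names match only if size and signedness are both the same.
 */
static int check_field_compat(const struct field_desc *synth,
			      const struct field_desc *hist)
{
	if (strcmp(synth->type, hist->type) != 0) {
		if (synth->size != hist->size ||
		    synth->is_signed != hist->is_signed)
			return -EINVAL;
	}
	return 0;
}
```

Under this rule an 8-byte unsigned "u64" field accepts an 8-byte unsigned
"unsigned long" value, while "s32" and "u32" remain incompatible because
their signedness differs.
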
kernel/trace/trace_events_hist.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index f62de5f43e79..dae2c25b209a 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -4110,8 +4110,11 @@ static int check_synth_field(struct synth_event *event,

field = event->fields[field_pos];

- if (strcmp(field->type, hist_field->type) != 0)
- return -EINVAL;
+ if (strcmp(field->type, hist_field->type) != 0) {
+ if (field->size != hist_field->size ||
+ field->is_signed != hist_field->is_signed)
+ return -EINVAL;
+ }

return 0;
}

2020-01-10 16:10:28

by Masami Hiramatsu

Subject: [PATCH v6 20/22] tracing/boot: Add cpu_mask option support

Add ftrace.cpumask option support to boot-time tracing.
This sets cpumask for each instance.

- ftrace.[instance.INSTANCE.]cpumask = CPUMASK;
Set the trace cpumask. Note that the CPUMASK should be a string
which <tracefs>/tracing_cpumask can accept.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
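tracing_cpumask takes a hexadecimal bitmask in which bit N selects CPU N.
As a rough userspace illustration of that interpretation (not the kernel's
cpumask_parse(), which also handles comma-separated groups and masks wider
than 64 bits), a single hex group can be decoded as below. parse_hex_mask()
and cpu_in_mask() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse one hex group such as "3" or "f0" into a 64-bit mask.
 * Returns 0 on success, -EINVAL on empty or malformed input.
 * Simplification: strtoull() also tolerates leading spaces and "0x".
 */
static int parse_hex_mask(const char *s, unsigned long long *mask)
{
	char *end;
	unsigned long long v;

	if (*s == '\0')
		return -EINVAL;
	errno = 0;
	v = strtoull(s, &end, 16);
	if (errno != 0 || *end != '\0')
		return -EINVAL;
	*mask = v;
	return 0;
}

/* Nonzero if the given CPU number is selected by mask. */
static int cpu_in_mask(unsigned long long mask, unsigned int cpu)
{
	return cpu < 64 && ((mask >> cpu) & 1ULL);
}
```

For example, a boot config value of cpumask = 3 would select CPU 0 and
CPU 1 under this reading.
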
kernel/trace/trace.c | 42 +++++++++++++++++++++++++++++-------------
kernel/trace/trace_boot.c | 14 ++++++++++++++
2 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c2e1b33aec17..5791e6b5136f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4561,20 +4561,13 @@ tracing_cpumask_read(struct file *filp, char __user *ubuf,
return count;
}

-static ssize_t
-tracing_cpumask_write(struct file *filp, const char __user *ubuf,
- size_t count, loff_t *ppos)
+int tracing_set_cpumask(struct trace_array *tr,
+ cpumask_var_t tracing_cpumask_new)
{
- struct trace_array *tr = file_inode(filp)->i_private;
- cpumask_var_t tracing_cpumask_new;
- int err, cpu;
-
- if (!alloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL))
- return -ENOMEM;
+ int cpu;

- err = cpumask_parse_user(ubuf, count, tracing_cpumask_new);
- if (err)
- goto err_unlock;
+ if (!tr)
+ return -EINVAL;

local_irq_disable();
arch_spin_lock(&tr->max_lock);
@@ -4598,11 +4591,34 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf,
local_irq_enable();

cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
+
+ return 0;
+}
+
+static ssize_t
+tracing_cpumask_write(struct file *filp, const char __user *ubuf,
+ size_t count, loff_t *ppos)
+{
+ struct trace_array *tr = file_inode(filp)->i_private;
+ cpumask_var_t tracing_cpumask_new;
+ int err;
+
+ if (!alloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL))
+ return -ENOMEM;
+
+ err = cpumask_parse_user(ubuf, count, tracing_cpumask_new);
+ if (err)
+ goto err_free;
+
+ err = tracing_set_cpumask(tr, tracing_cpumask_new);
+ if (err)
+ goto err_free;
+
free_cpumask_var(tracing_cpumask_new);

return count;

-err_unlock:
+err_free:
free_cpumask_var(tracing_cpumask_new);

return err;
diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
index f5db30d25b0b..81d923c16a4d 100644
--- a/kernel/trace/trace_boot.c
+++ b/kernel/trace/trace_boot.c
@@ -18,6 +18,8 @@ extern int trace_set_options(struct trace_array *tr, char *option);
extern int tracing_set_tracer(struct trace_array *tr, const char *buf);
extern ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
unsigned long size, int cpu_id);
+extern int tracing_set_cpumask(struct trace_array *tr,
+ cpumask_var_t tracing_cpumask_new);

static void __init
trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node)
@@ -52,6 +54,18 @@ trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node)
if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0)
pr_err("Failed to resize trace buffer to %s\n", p);
}
+
+ p = xbc_node_find_value(node, "cpumask", NULL);
+ if (p && *p != '\0') {
+ cpumask_var_t new_mask;
+
+ if (alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
+ if (cpumask_parse(p, new_mask) < 0 ||
+ tracing_set_cpumask(tr, new_mask) < 0)
+ pr_err("Failed to set new CPU mask %s\n", p);
+ free_cpumask_var(new_mask);
+ }
+ }
}

#ifdef CONFIG_EVENT_TRACING
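
As a rough illustration of the string format the commit message refers to: tracing_cpumask takes a hexadecimal mask, optionally split into comma-separated 32-bit groups with the most significant group first. The userspace sketch below parses such a string into a 64-bit value. It is not the kernel's cpumask_parse(); the helper name parse_hex_cpumask() and the 64-CPU limit are assumptions made only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): parse a hex CPU mask such as
 * "3" or "1,00000000" into a 64-bit value. Each comma-separated group
 * holds up to 8 hex digits (32 bits), most significant group first.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
static int parse_hex_cpumask(const char *s, uint64_t *out)
{
	uint64_t mask = 0;
	int groups = 0;

	assert(s != NULL && out != NULL);

	while (*s) {
		uint32_t chunk = 0;
		int digits = 0;

		while (isxdigit((unsigned char)*s)) {
			int c = tolower((unsigned char)*s);
			int v = (c >= 'a') ? c - 'a' + 10 : c - '0';

			if (++digits > 8)
				return -1;	/* group wider than 32 bits */
			chunk = (chunk << 4) | (uint32_t)v;
			s++;
		}
		if (digits == 0)
			return -1;		/* empty group or stray character */
		if (++groups > 2)
			return -1;		/* more than 64 bits in this sketch */
		mask = (mask << 32) | chunk;

		if (*s == ',') {
			s++;
			if (*s == '\0')
				return -1;	/* trailing comma */
		} else if (*s != '\0') {
			return -1;		/* unexpected character */
		}
	}
	if (groups == 0)
		return -1;			/* empty string */

	*out = mask;
	return 0;
}
```

Under this format, a mask of "3" selects CPUs 0 and 1, so a boot config entry such as ftrace.cpumask = 3 would be expected to limit tracing to those two CPUs.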

2020-01-10 16:10:43

by Masami Hiramatsu

Subject: [PATCH v6 22/22] Documentation: tracing: Add boot-time tracing document

Add a documentation about boot-time tracing options in
boot config.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Documentation/admin-guide/bootconfig.rst | 2
Documentation/trace/boottime-trace.rst | 184 ++++++++++++++++++++++++++++++
Documentation/trace/index.rst | 1
3 files changed, 187 insertions(+)
create mode 100644 Documentation/trace/boottime-trace.rst

diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
index f7475df2a718..c8f7cd4cf44e 100644
--- a/Documentation/admin-guide/bootconfig.rst
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -1,5 +1,7 @@
.. SPDX-License-Identifier: GPL-2.0

+.. _bootconfig:
+
==================
Boot Configuration
==================
diff --git a/Documentation/trace/boottime-trace.rst b/Documentation/trace/boottime-trace.rst
new file mode 100644
index 000000000000..1d10fdebf1b2
--- /dev/null
+++ b/Documentation/trace/boottime-trace.rst
@@ -0,0 +1,184 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Boot-time tracing
+=================
+
+:Author: Masami Hiramatsu <[email protected]>
+
+Overview
+========
+
+Boot-time tracing allows users to trace boot-time process including
+device initialization with full features of ftrace including per-event
+filter and actions, histograms, kprobe-events and synthetic-events,
+and trace instances.
+Since kernel cmdline is not enough to control these complex features,
+this uses bootconfig file to describe tracing feature programming.
+
+Options in the Boot Config
+==========================
+
+Here is the list of available options list for boot time tracing in
+boot config file [1]_. All options are under "ftrace." or "kernel."
+refix. See kernel parameters for the options which starts
+with "kernel." prefix [2]_.
+
+.. [1] See :ref:`Documentation/admin-guide/bootconfig.rst <bootconfig>`
+.. [2] See :ref:`Documentation/admin-guide/kernel-parameters.rst <kernelparameters>`
+
+Ftrace Global Options
+---------------------
+
+Ftrace global options have "kernel." prefix in boot config, which means
+these options are passed as a part of kernel legacy command line.
+
+kernel.tp_printk
+ Output trace-event data on printk buffer too.
+
+kernel.dump_on_oops [= MODE]
+ Dump ftrace on Oops. If MODE = 1 or omitted, dump trace buffer
+ on all CPUs. If MODE = 2, dump a buffer on a CPU which kicks Oops.
+
+kernel.traceoff_on_warning
+ Stop tracing if WARN_ON() occurs.
+
+kernel.fgraph_max_depth = MAX_DEPTH
+ Set MAX_DEPTH to maximum depth of fgraph tracer.
+
+kernel.fgraph_filters = FILTER[, FILTER2...]
+ Add fgraph tracing function filters.
+
+kernel.fgraph_notraces = FILTER[, FILTER2...]
+ Add fgraph non tracing function filters.
+
+
+Ftrace Per-instance Options
+---------------------------
+
+These options can be used for each instance including global ftrace node.
+
+ftrace.[instance.INSTANCE.]options = OPT1[, OPT2[...]]
+ Enable given ftrace options.
+
+ftrace.[instance.INSTANCE.]trace_clock = CLOCK
+ Set given CLOCK to ftrace's trace_clock.
+
+ftrace.[instance.INSTANCE.]buffer_size = SIZE
+ Configure ftrace buffer size to SIZE. You can use "KB" or "MB"
+ for that SIZE.
+
+ftrace.[instance.INSTANCE.]alloc_snapshot
+ Allocate snapshot buffer.
+
+ftrace.[instance.INSTANCE.]cpumask = CPUMASK
+ Set CPUMASK as trace cpu-mask.
+
+ftrace.[instance.INSTANCE.]events = EVENT[, EVENT2[...]]
+ Enable given events on boot. You can use a wild card in EVENT.
+
+ftrace.[instance.INSTANCE.]tracer = TRACER
+ Set TRACER to current tracer on boot. (e.g. function)
+
+ftrace.[instance.INSTANCE.]ftrace.filters
+ This will take an array of tracing function filter rules
+
+ftrace.[instance.INSTANCE.]ftrace.notraces
+ This will take an array of NON-tracing function filter rules
+
+
+Ftrace Per-Event Options
+------------------------
+
+These options are setting per-event options.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
+ Enables GROUP:EVENT tracing.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
+ Set FILTER rule to the GROUP:EVENT.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.actions = ACTION[, ACTION2[...]]
+ Set ACTIONs to the GROUP:EVENT.
+
+ftrace.[instance.INSTANCE.]event.kprobes.EVENT.probes = PROBE[, PROBE2[...]]
+ Defines new kprobe event based on PROBEs. It is able to define
+ multiple probes on one event, but those must have same type of
+ arguments. This option is available only for the event which
+ group name is "kprobes".
+
+ftrace.[instance.INSTANCE.]event.synthetic.EVENT.fields = FIELD[, FIELD2[...]]
+ Defines new synthetic event with FIELDs. Each field should be
+ "type varname".
+
+Note that kprobe and synthetic event definitions can be written under
+instance node, but those are also visible from other instances. So please
+take care for event name conflict.
+
+
+Examples
+========
+
+For example, to add filter and actions for each event, define kprobe
+events, and synthetic events with histogram, write a boot config like
+below::
+
+ ftrace.event {
+ task.task_newtask {
+ filter = "pid < 128"
+ enable
+ }
+ kprobes.vfs_read {
+ probes = "vfs_read $arg1 $arg2"
+ filter = "common_pid < 200"
+ enable
+ }
+ synthetic.initcall_latency {
+ fields = "unsigned long func", "u64 lat"
+ actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
+ }
+ initcall.initcall_start {
+ actions = "hist:keys=func:ts0=common_timestamp.usecs"
+ }
+ initcall.initcall_finish {
+ actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
+ }
+ }
+
+Also, boottime tracing supports "instance" node, which allows us to run
+several tracers for different purpose at once. For example, one tracer
+is for tracing functions start with "user\_", and others tracing "kernel\_"
+functions, you can write boot config as below::
+
+ ftrace.instance {
+ foo {
+ tracer = "function"
+ ftrace.filters = "user_*"
+ }
+ bar {
+ tracer = "function"
+ ftrace.filters = "kernel_*"
+ }
+ }
+
+The instance node also accepts event nodes so that each instance
+can customize its event tracing.
+
+This boot-time tracing also supports ftrace kernel parameters via boot
+config.
+For example, following kernel parameters::
+
+ trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
+
+This can be written in boot config like below::
+
+ kernel {
+ trace_options = sym-addr
+ trace_event = "initcall:*"
+ tp_printk
+ trace_buf_size = 1M
+ ftrace = function
+ ftrace_filter = "vfs*"
+ }
+
+Note that parameters start with "kernel" prefix instead of "ftrace".
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 04acd277c5f6..fa9e1c730f6a 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -19,6 +19,7 @@ Linux Tracing Technologies
events-msr
mmiotrace
histogram
+ boottime-trace
hwlat_detector
intel_th
stm

2020-01-18 18:15:53

by Randy Dunlap

Subject: Re: [PATCH v6 22/22] Documentation: tracing: Add boot-time tracing document

Hi,

Here are a few editorial comments for you...


On 1/10/20 8:07 AM, Masami Hiramatsu wrote:
> Add a documentation about boot-time tracing options in
> boot config.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> Documentation/admin-guide/bootconfig.rst | 2
> Documentation/trace/boottime-trace.rst | 184 ++++++++++++++++++++++++++++++
> Documentation/trace/index.rst | 1
> 3 files changed, 187 insertions(+)
> create mode 100644 Documentation/trace/boottime-trace.rst
>

> diff --git a/Documentation/trace/boottime-trace.rst b/Documentation/trace/boottime-trace.rst
> new file mode 100644
> index 000000000000..1d10fdebf1b2
> --- /dev/null
> +++ b/Documentation/trace/boottime-trace.rst
> @@ -0,0 +1,184 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=================
> +Boot-time tracing
> +=================
> +
> +:Author: Masami Hiramatsu <[email protected]>
> +
> +Overview
> +========
> +
> +Boot-time tracing allows users to trace boot-time process including
> +device initialization with full features of ftrace including per-event
> +filter and actions, histograms, kprobe-events and synthetic-events,
> +and trace instances.
> +Since kernel cmdline is not enough to control these complex features,
> +this uses bootconfig file to describe tracing feature programming.
> +
> +Options in the Boot Config
> +==========================
> +
> +Here is the list of available options list for boot time tracing in
> +boot config file [1]_. All options are under "ftrace." or "kernel."
> +refix. See kernel parameters for the options which starts

prefix.

> +with "kernel." prefix [2]_.
> +
> +.. [1] See :ref:`Documentation/admin-guide/bootconfig.rst <bootconfig>`
> +.. [2] See :ref:`Documentation/admin-guide/kernel-parameters.rst <kernelparameters>`
> +
> +Ftrace Global Options
> +---------------------
> +
> +Ftrace global options have "kernel." prefix in boot config, which means
> +these options are passed as a part of kernel legacy command line.
> +
> +kernel.tp_printk
> + Output trace-event data on printk buffer too.
> +
> +kernel.dump_on_oops [= MODE]
> + Dump ftrace on Oops. If MODE = 1 or omitted, dump trace buffer
> + on all CPUs. If MODE = 2, dump a buffer on a CPU which kicks Oops.
> +
> +kernel.traceoff_on_warning
> + Stop tracing if WARN_ON() occurs.
> +
> +kernel.fgraph_max_depth = MAX_DEPTH
> + Set MAX_DEPTH to maximum depth of fgraph tracer.
> +
> +kernel.fgraph_filters = FILTER[, FILTER2...]
> + Add fgraph tracing function filters.
> +
> +kernel.fgraph_notraces = FILTER[, FILTER2...]
> + Add fgraph non tracing function filters.

non-tracing

> +
> +
> +Ftrace Per-instance Options
> +---------------------------
> +
> +These options can be used for each instance including global ftrace node.
> +
> +ftrace.[instance.INSTANCE.]options = OPT1[, OPT2[...]]
> + Enable given ftrace options.
> +
> +ftrace.[instance.INSTANCE.]trace_clock = CLOCK
> + Set given CLOCK to ftrace's trace_clock.
> +
> +ftrace.[instance.INSTANCE.]buffer_size = SIZE
> + Configure ftrace buffer size to SIZE. You can use "KB" or "MB"
> + for that SIZE.
> +
> +ftrace.[instance.INSTANCE.]alloc_snapshot
> + Allocate snapshot buffer.
> +
> +ftrace.[instance.INSTANCE.]cpumask = CPUMASK
> + Set CPUMASK as trace cpu-mask.
> +
> +ftrace.[instance.INSTANCE.]events = EVENT[, EVENT2[...]]
> + Enable given events on boot. You can use a wild card in EVENT.
> +
> +ftrace.[instance.INSTANCE.]tracer = TRACER
> + Set TRACER to current tracer on boot. (e.g. function)
> +
> +ftrace.[instance.INSTANCE.]ftrace.filters
> + This will take an array of tracing function filter rules

End with '.' like the descriptions above.

> +
> +ftrace.[instance.INSTANCE.]ftrace.notraces
> + This will take an array of NON-tracing function filter rules

ditto

> +
> +
> +Ftrace Per-Event Options
> +------------------------
> +
> +These options are setting per-event options.
> +
> +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
> + Enables GROUP:EVENT tracing.

Enable

> +
> +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
> + Set FILTER rule to the GROUP:EVENT.
> +
> +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.actions = ACTION[, ACTION2[...]]
> + Set ACTIONs to the GROUP:EVENT.
> +
> +ftrace.[instance.INSTANCE.]event.kprobes.EVENT.probes = PROBE[, PROBE2[...]]
> + Defines new kprobe event based on PROBEs. It is able to define
> + multiple probes on one event, but those must have same type of
> + arguments. This option is available only for the event which
> + group name is "kprobes".
> +
> +ftrace.[instance.INSTANCE.]event.synthetic.EVENT.fields = FIELD[, FIELD2[...]]
> + Defines new synthetic event with FIELDs. Each field should be
> + "type varname".
> +
> +Note that kprobe and synthetic event definitions can be written under
> +instance node, but those are also visible from other instances. So please
> +take care for event name conflict.
> +
> +
> +Examples
> +========
> +
> +For example, to add filter and actions for each event, define kprobe
> +events, and synthetic events with histogram, write a boot config like
> +below::
> +
> + ftrace.event {
> + task.task_newtask {
> + filter = "pid < 128"
> + enable
> + }
> + kprobes.vfs_read {
> + probes = "vfs_read $arg1 $arg2"
> + filter = "common_pid < 200"
> + enable
> + }
> + synthetic.initcall_latency {
> + fields = "unsigned long func", "u64 lat"
> + actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
> + }
> + initcall.initcall_start {
> + actions = "hist:keys=func:ts0=common_timestamp.usecs"
> + }
> + initcall.initcall_finish {
> + actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
> + }
> + }
> +
> +Also, boottime tracing supports "instance" node, which allows us to run

boot-time [for consistency]

> +several tracers for different purpose at once. For example, one tracer
> +is for tracing functions start with "user\_", and others tracing "kernel\_"

starting

> +functions, you can write boot config as below::
> +
> + ftrace.instance {
> + foo {
> + tracer = "function"
> + ftrace.filters = "user_*"
> + }
> + bar {
> + tracer = "function"
> + ftrace.filters = "kernel_*"
> + }
> + }
> +
> +The instance node also accepts event nodes so that each instance
> +can customize its event tracing.
> +
> +This boot-time tracing also supports ftrace kernel parameters via boot
> +config.
> +For example, following kernel parameters::
> +
> + trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
> +
> +This can be written in boot config like below::
> +
> + kernel {
> + trace_options = sym-addr
> + trace_event = "initcall:*"
> + tp_printk
> + trace_buf_size = 1M
> + ftrace = function
> + ftrace_filter = "vfs*"
> + }
> +
> +Note that parameters start with "kernel" prefix instead of "ftrace".

HTH.
--
~Randy

2020-01-18 18:30:35

by Randy Dunlap

Subject: Re: [PATCH v6 09/22] Documentation: bootconfig: Add a doc for extended boot config

Hi,

Editorial comments/corrections below...

On 1/10/20 8:05 AM, Masami Hiramatsu wrote:
> Add a documentation for extended boot config under
> admin-guide, since it is including the syntax of boot config.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> Changes in v6:
> - Add a note about comment after value.
> Changes in v5:
> - Fix to insert bootconfig to TOC list alphabetically.
> - Add notes about avaliable characters in values.
> - Fix to use correct quotes (``) for .rst.
> Changes in v4:
> - Rename suppremental kernel command line to boot config.

supplemental

> - Update document according to the recent changes.
> - Add How to load it on boot.
> - Style bugfix.
> ---
> Documentation/admin-guide/bootconfig.rst | 184 ++++++++++++++++++++++++++++++
> Documentation/admin-guide/index.rst | 1
> MAINTAINERS | 1
> 3 files changed, 186 insertions(+)
> create mode 100644 Documentation/admin-guide/bootconfig.rst
>

> diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
> new file mode 100644
> index 000000000000..f7475df2a718
> --- /dev/null
> +++ b/Documentation/admin-guide/bootconfig.rst
> @@ -0,0 +1,184 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==================
> +Boot Configuration
> +==================
> +
> +:Author: Masami Hiramatsu <[email protected]>
> +
> +Overview
> +========
> +
> +The boot configuration is expanding current kernel cmdline to support

expands the current kernel command line to support

> +additional key-value data when boot the kernel in an efficient way.

booting

> +This allows adoministrators to pass a structured-Key config file.

administrators

> +
> +Config File Syntax
> +==================
> +
> +The boot config syntax is a simple structured key-value. Each key consists
> +of dot-connected-words, and key and value are connected by "=". The value
> +has to be terminated by semi-colon (``;``) or newline (``\n``).
> +For array value, array entries are separated by comma (``,``). ::
> +
> +KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]

(just a note: spaces are OK here, unlike in kernel command line syntax [unless quoted].)

> +
> +Each key word must contain only alphabets, numbers, dash (``-``) or underscore
> +(``_``). And each value only contains printable characters or spaces except
> +for delimiters such as semi-colon (``;``), new-line (``\n``), comma (``,``),
> +hash (``#``) and closing brace (``}``).

what about opening brace '{'?

> +
> +If you want to use those delimiters in a value, you can use either double-
> +quotes (``"VALUE"``) or single-quotes (``'VALUE'``) to quote it. Note that
> +you can not escape these quotes.
> +
> +There can be a key which doesn't have value or has an empty value. Those keys
> +are used for checking the key exists or not (like a boolean).

I would say: checking if the key exists or not

> +
> +Key-Value Syntax
> +----------------
> +
> +The boot config file syntax allows user to merge partially same word keys
> +by brace. For example::
> +
> + foo.bar.baz = value1
> + foo.bar.qux.quux = value2
> +
> +These can be written also in::
> +
> + foo.bar {
> + baz = value1
> + qux.quux = value2
> + }
> +
> +Or more shorter, written as following::
> +
> + foo.bar { baz = value1; qux.quux = value2 }
> +
> +In both styles, same key words are automatically merged when parsing it
> +at boot time. So you can append similar trees or key-values.
> +
> +Comments
> +--------
> +
> +The config syntax accepts shell-script style comments. The comments start

s/start/starting/

> +with hash ("#") until newline ("\n") will be ignored.
> +
> +::
> +
> + # comment line
> + foo = value # value is set to foo.
> + bar = 1, # 1st element
> + 2, # 2nd element
> + 3 # 3rd element
> +
> +This is parsed as below::
> +
> + foo = value
> + bar = 1, 2, 3
> +
> +Note that you can not put a comment between value and delimiter(``,`` or
> +``;``). This means following config has a syntax error ::
> +
> + key = 1 # comment
> + ,2
> +
> +
> +/proc/bootconfig
> +================
> +
> +/proc/bootconfig is a user-space interface of the boot config.
> +Unlike /proc/cmdline, this file shows the key-value style list.
> +Each key-value pair is shown in each line with following style::
> +
> + KEY[.WORDS...] = "[VALUE]"[,"VALUE2"...]
> +
> +
> +Boot Kernel With a Boot Config
> +==============================
> +
> +Since the boot configuration file is loaded with initrd, it will be added
> +to the end of the initrd (initramfs) image file. The Linux kernel decodes
> +the last part of the initrd image in memory to get the boot configuration
> +data.
> +Because of this "piggyback" method, there is no need to change or
> +update the boot loader and the kernel image itself.
> +
> +To do this operation, Linux kernel provides "bootconfig" command under
> +tools/bootconfig, which allows admin to apply or delete the config file
> +to/from initrd image. You can build it by follwoing command::

by the following

> +
> + # make -C tools/bootconfig
> +
> +To add your boot config file to initrd image, run bootconfig as below
> +(Old data is removed automatically if exists)::
> +
> + # tools/bootconfig/bootconfig -a your-config /boot/initrd.img-X.Y.Z
> +
> +To remove the config from the image, you can use -d option as below::
> +
> + # tools/bootconfig/bootconfig -d /boot/initrd.img-X.Y.Z
> +
> +
> +C onfig File Limitation

Config

> +======================
> +
> +Currently the maximum config size size is 32KB and the total key-words (not
> +key-value entries) must be under 1024 nodes.
> +Note: this is not the number of entries but nodes, an entry must consume
> +more than 2 nodes (a key-word and a value). So theoretically, it will be
> +up to 512 key-value pairs. If keys contains 3 words in average, it can
> +contain 256 key-value pairs. In most cases, the number of config items
> +will be under 100 entries and smaller than 8KB, so it would be enough.
> +If the node number exceeds 1024, parser returns an error even if the file
> +size is smaller than 32KB.
> +Anyway, since bootconfig command verifies it when appending a boot config
> +to initrd image, user can notice it before boot.
> +
> +
> +Bootconfig APIs
> +===============
> +
> +User can query or loop on key-value pairs, also it is possible to find
> +a root (prefix) key node and find key-values under that node.
> +
> +If you have a key string, you can query the value directly with the key
> +using xbc_find_value(). If you want to know what keys exist in the SKC
> +tree, you can use xbc_for_each_key_value() to iterate key-value pairs.
> +Note that you need to use xbc_array_for_each_value() for accessing
> +each arraies value, e.g.::

array's
(I think)

> +
> + vnode = NULL;
> + xbc_find_value("key.word", &vnode);
> + if (vnode && xbc_node_is_array(vnode))
> + xbc_array_for_each_value(vnode, value) {
> + printk("%s ", value);
> + }
> +
> +If you want to focus on keys which has a prefix string, you can use

have

> +xbc_find_node() to find a node which prefix key words, and iterate

[confusing above]

> +keys under the prefix node with xbc_node_for_each_key_value().
> +
> +But the most typical usage is to get the named value under prefix
> +or get the named array under prefix as below::
> +
> + root = xbc_find_node("key.prefix");
> + value = xbc_node_find_value(root, "option", &vnode);
> + ...
> + xbc_node_for_each_array_value(root, "array-option", value, anode) {
> + ...
> + }
> +
> +This accesses a value of "key.prefix.option" and an array of
> +"key.prefix.array-option".
> +
> +Locking is not needed, since after initialized, the config becomes readonly.

after initialization,

> +All data and keys must be copied if you need to modify it.
> +
> +
> +Functions and structures
> +========================
> +
> +.. kernel-doc:: include/linux/bootconfig.h
> +.. kernel-doc:: lib/bootconfig.c
> +

HTH.
--
~Randy
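
The node arithmetic in the quoted "Config File Limitation" section can be checked with a small sketch. It assumes every entry uses the same number of key words plus exactly one value node, and that no key words are shared between entries; the function name estimate_max_entries() is hypothetical and exists only for this illustration.

```c
#include <assert.h>

/*
 * Hypothetical estimate (not a kernel API): how many key-value entries
 * fit in a node budget when every entry uses `words_per_key` key-word
 * nodes plus one value node, and no key words are shared between entries.
 */
static unsigned int estimate_max_entries(unsigned int max_nodes,
					 unsigned int words_per_key)
{
	assert(words_per_key > 0);
	return max_nodes / (words_per_key + 1);
}
```

With a budget of 1024 nodes this gives 512 entries for single-word keys and 256 entries for three-word keys, matching the figures in the quoted text. Configs whose keys share leading words may fit more, since identical key words are merged into one tree.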

2020-01-18 18:35:08

by Randy Dunlap

Subject: Re: [PATCH v6 01/22] bootconfig: Add Extra Boot Config support

On 1/10/20 8:03 AM, Masami Hiramatsu wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index a34064a031a5..63450d3bbf12 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1215,6 +1215,17 @@ source "usr/Kconfig"
>
> endif
>
> +config BOOT_CONFIG
> + bool "Boot config support"
> + select LIBXBC
> + default y
> + help
> + Extra boot config allows system admin to pass a config file as
> + complemental extension of kernel cmdline when booting.
> + The boot config file is usually attached at the end of initramfs.

Is there some other location where it might be attached?
Please explain.

> +
> + If unsure, say Y.
> +
> choice
> prompt "Compiler optimization level"
> default CC_OPTIMIZE_FOR_PERFORMANCE


--
~Randy

2020-01-19 12:24:52

by Masami Hiramatsu

Subject: Re: [PATCH v6 01/22] bootconfig: Add Extra Boot Config support

On Sat, 18 Jan 2020 10:33:01 -0800
Randy Dunlap <[email protected]> wrote:

> On 1/10/20 8:03 AM, Masami Hiramatsu wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index a34064a031a5..63450d3bbf12 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1215,6 +1215,17 @@ source "usr/Kconfig"
> >
> > endif
> >
> > +config BOOT_CONFIG
> > + bool "Boot config support"
> > + select LIBXBC
> > + default y
> > + help
> > + Extra boot config allows system admin to pass a config file as
> > + complemental extension of kernel cmdline when booting.
> > + The boot config file is usually attached at the end of initramfs.
>
> Is there some other location where it might be attached?
> Please explain.

Oops, good catch!
No, it is supported only via initramfs.

I mistakenly left in a sentence written during the planning phase.
I need to update it.

Thank you!

>
> > +
> > + If unsure, say Y.
> > +
> > choice
> > prompt "Compiler optimization level"
> > default CC_OPTIMIZE_FOR_PERFORMANCE
>
>
> --
> ~Randy


--
Masami Hiramatsu <[email protected]>

2020-01-19 13:37:39

by Masami Hiramatsu

Subject: Re: [PATCH v6 09/22] Documentation: bootconfig: Add a doc for extended boot config

Hi Randy,

On Sat, 18 Jan 2020 10:28:40 -0800
Randy Dunlap <[email protected]> wrote:

> Hi,
>
> Editorial comments/corrections below...

Thank you for your comments! This is very helpful for me.

>
> On 1/10/20 8:05 AM, Masami Hiramatsu wrote:
> > Add a documentation for extended boot config under
> > admin-guide, since it is including the syntax of boot config.
> >
> > Signed-off-by: Masami Hiramatsu <[email protected]>
> > ---
> > Changes in v6:
> > - Add a note about comment after value.
> > Changes in v5:
> > - Fix to insert bootconfig to TOC list alphabetically.
> > - Add notes about avaliable characters in values.
> > - Fix to use correct quotes (``) for .rst.
> > Changes in v4:
> > - Rename suppremental kernel command line to boot config.
>
> supplemental
>
> > - Update document according to the recent changes.
> > - Add How to load it on boot.
> > - Style bugfix.
> > ---
> > Documentation/admin-guide/bootconfig.rst | 184 ++++++++++++++++++++++++++++++
> > Documentation/admin-guide/index.rst | 1
> > MAINTAINERS | 1
> > 3 files changed, 186 insertions(+)
> > create mode 100644 Documentation/admin-guide/bootconfig.rst
> >
>
> > diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
> > new file mode 100644
> > index 000000000000..f7475df2a718
> > --- /dev/null
> > +++ b/Documentation/admin-guide/bootconfig.rst
> > @@ -0,0 +1,184 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +==================
> > +Boot Configuration
> > +==================
> > +
> > +:Author: Masami Hiramatsu <[email protected]>
> > +
> > +Overview
> > +========
> > +
> > +The boot configuration is expanding current kernel cmdline to support
>
> expands the current kernel command line to support

OK.

>
> > +additional key-value data when boot the kernel in an efficient way.
>
> booting

OK.

>
> > +This allows adoministrators to pass a structured-Key config file.
>
> administrators

Oops. OK.

>
> > +
> > +Config File Syntax
> > +==================
> > +
> > +The boot config syntax is a simple structured key-value. Each key consists
> > +of dot-connected-words, and key and value are connected by "=". The value
> > +has to be terminated by semi-colon (``;``) or newline (``\n``).
> > +For array value, array entries are separated by comma (``,``). ::
> > +
> > +KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]
>
> (just a note: spaces are OK here, unlike in kernel command line syntax [unless quoted].)

Yes.

> > +
> > +Each key word must contain only alphabets, numbers, dash (``-``) or underscore
> > +(``_``). And each value only contains printable characters or spaces except
> > +for delimiters such as semi-colon (``;``), new-line (``\n``), comma (``,``),
> > +hash (``#``) and closing brace (``}``).
>
> what about opening brace '{'?

Good question! Since bootconfig doesn't support anonymous key-word blocks,
an opening brace is not a delimiter. (So the explanation above would be better
as "except for *some* delimiters"...)

For example, the following data should be invalid.

key = value { key2 = value }
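
To make the delimiter set concrete, here is a minimal userspace sketch that reports whether a value would need quoting under the rule in the quoted document. It is not taken from lib/bootconfig.c; the helper name value_needs_quotes() is hypothetical, and the delimiter list is simply the one given in the text (semi-colon, newline, comma, hash, closing brace), with the opening brace left out as explained above.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of lib/bootconfig.c): report whether a
 * value contains one of the delimiters listed in the quoted document
 * (';', newline, ',', '#', '}') and therefore has to be quoted.
 * An opening brace is deliberately absent from the set.
 */
static int value_needs_quotes(const char *value)
{
	assert(value != NULL);
	return strpbrk(value, ";\n,#}") != NULL;
}
```

For instance, this check accepts "pid < 128" and "x { y" as unquoted values, but flags "a,b" and "note # 1" as needing quotes.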


>
> > +
> > +If you want to use those delimiters in a value, you can use either double-
> > +quotes (``"VALUE"``) or single-quotes (``'VALUE'``) to quote it. Note that
> > +you can not escape these quotes.
> > +
> > +There can be a key which doesn't have value or has an empty value. Those keys
> > +are used for checking the key exists or not (like a boolean).
>
> I would say: checking if the key exists or not

OK.

>
> > +
> > +Key-Value Syntax
> > +----------------
> > +
> > +The boot config file syntax allows user to merge partially same word keys
> > +by brace. For example::
> > +
> > + foo.bar.baz = value1
> > + foo.bar.qux.quux = value2
> > +
> > +These can be written also in::
> > +
> > + foo.bar {
> > + baz = value1
> > + qux.quux = value2
> > + }
> > +
> > +Or more shorter, written as following::
> > +
> > + foo.bar { baz = value1; qux.quux = value2 }
> > +
> > +In both styles, same key words are automatically merged when parsing it
> > +at boot time. So you can append similar trees or key-values.
> > +
> > +Comments
> > +--------
> > +
> > +The config syntax accepts shell-script style comments. The comments start
>
> s/start/starting/

OK.

>
> > +with hash ("#") until newline ("\n") will be ignored.
> > +
> > +::
> > +
> > + # comment line
> > + foo = value # value is set to foo.
> > + bar = 1, # 1st element
> > + 2, # 2nd element
> > + 3 # 3rd element
> > +
> > +This is parsed as below::
> > +
> > + foo = value
> > + bar = 1, 2, 3
> > +
> > +Note that you can not put a comment between value and delimiter(``,`` or
> > +``;``). This means following config has a syntax error ::
> > +
> > + key = 1 # comment
> > + ,2
> > +
> > +
> > +/proc/bootconfig
> > +================
> > +
> > +/proc/bootconfig is a user-space interface of the boot config.
> > +Unlike /proc/cmdline, this file shows the key-value style list.
> > +Each key-value pair is shown in each line with following style::
> > +
> > + KEY[.WORDS...] = "[VALUE]"[,"VALUE2"...]
> > +
> > +
> > +Boot Kernel With a Boot Config
> > +==============================
> > +
> > +Since the boot configuration file is loaded with initrd, it will be added
> > +to the end of the initrd (initramfs) image file. The Linux kernel decodes
> > +the last part of the initrd image in memory to get the boot configuration
> > +data.
> > +Because of this "piggyback" method, there is no need to change or
> > +update the boot loader and the kernel image itself.
> > +
> > +To do this operation, Linux kernel provides "bootconfig" command under
> > +tools/bootconfig, which allows admin to apply or delete the config file
> > +to/from initrd image. You can build it by follwoing command::
>
> by the following

Oops, a typo...

>
> > +
> > + # make -C tools/bootconfig
> > +
> > +To add your boot config file to initrd image, run bootconfig as below
> > +(Old data is removed automatically if exists)::
> > +
> > + # tools/bootconfig/bootconfig -a your-config /boot/initrd.img-X.Y.Z
> > +
> > +To remove the config from the image, you can use -d option as below::
> > +
> > + # tools/bootconfig/bootconfig -d /boot/initrd.img-X.Y.Z
> > +
> > +
> > +C onfig File Limitation
>
> Config

Oops

>
> > +======================
> > +
> > +Currently the maximum config size size is 32KB and the total key-words (not
> > +key-value entries) must be under 1024 nodes.
> > +Note: this is not the number of entries but nodes, an entry must consume
> > +more than 2 nodes (a key-word and a value). So theoretically, it will be
> > +up to 512 key-value pairs. If keys contains 3 words in average, it can
> > +contain 256 key-value pairs. In most cases, the number of config items
> > +will be under 100 entries and smaller than 8KB, so it would be enough.
> > +If the node number exceeds 1024, parser returns an error even if the file
> > +size is smaller than 32KB.
> > +Anyway, since bootconfig command verifies it when appending a boot config
> > +to initrd image, user can notice it before boot.
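(Editorial aside on the arithmetic quoted above.) A minimal sketch of the node budget, assuming the simple model the paragraph uses: one node per key word plus one node for the value, against the stated limit of 1024 nodes. The macro and function names are hypothetical and are not part of lib/bootconfig.c; the model also does not account for key words shared between entries.

```c
/* Node limit as stated in the quoted documentation (assumed here). */
#define XBC_MODEL_MAX_NODES 1024u

/*
 * Hypothetical helper: estimate how many key-value entries fit when
 * every key has words_per_key words. Each entry is assumed to consume
 * words_per_key nodes for the key plus one node for the value.
 */
static unsigned int max_entries(unsigned int words_per_key)
{
	return XBC_MODEL_MAX_NODES / (words_per_key + 1);
}
```

Under this model, max_entries(1) gives the 512 pairs and max_entries(3) gives the 256 pairs mentioned in the text.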
> > +
> > +
> > +Bootconfig APIs
> > +===============
> > +
> > +User can query or loop on key-value pairs, also it is possible to find
> > +a root (prefix) key node and find key-values under that node.
> > +
> > +If you have a key string, you can query the value directly with the key
> > +using xbc_find_value(). If you want to know what keys exist in the SKC
> > +tree, you can use xbc_for_each_key_value() to iterate key-value pairs.
> > +Note that you need to use xbc_array_for_each_value() for accessing
> > +each arraies value, e.g.::
>
> array's
> (I think)

Yes, OK.

>
> > +
> > + vnode = NULL;
> > + xbc_find_value("key.word", &vnode);
> > + if (vnode && xbc_node_is_array(vnode))
> > + xbc_array_for_each_value(vnode, value) {
> > + printk("%s ", value);
> > + }
> > +
> > +If you want to focus on keys which has a prefix string, you can use
>
> have

OK.

>
> > +xbc_find_node() to find a node which prefix key words, and iterate
>
> [confusing above]

Ah, it should be "to find a node by the prefix string,"


>
> > +keys under the prefix node with xbc_node_for_each_key_value().
> > +
> > +But the most typical usage is to get the named value under prefix
> > +or get the named array under prefix as below::
> > +
> > + root = xbc_find_node("key.prefix");
> > + value = xbc_node_find_value(root, "option", &vnode);
> > + ...
> > + xbc_node_for_each_array_value(root, "array-option", value, anode) {
> > + ...
> > + }
> > +
> > +This accesses a value of "key.prefix.option" and an array of
> > +"key.prefix.array-option".
> > +
> > +Locking is not needed, since after initialized, the config becomes readonly.
>
> after initialization,

OK.

>
> > +All data and keys must be copied if you need to modify it.
> > +
> > +
> > +Functions and structures
> > +========================
> > +
> > +.. kernel-doc:: include/linux/bootconfig.h
> > +.. kernel-doc:: lib/bootconfig.c
> > +
>
> HTH.

Thank you very much!

> --
> ~Randy


--
Masami Hiramatsu <[email protected]>

2020-01-19 14:16:51

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v6 22/22] Documentation: tracing: Add boot-time tracing document

Hi Randy,

Thank you for your comments!

On Sat, 18 Jan 2020 10:14:08 -0800
Randy Dunlap <[email protected]> wrote:

> Hi,
>
> Here are a few editorial comments for you...
>
>
> On 1/10/20 8:07 AM, Masami Hiramatsu wrote:
> > Add a documentation about boot-time tracing options in
> > boot config.
> >
> > Signed-off-by: Masami Hiramatsu <[email protected]>
> > ---
> > Documentation/admin-guide/bootconfig.rst | 2
> > Documentation/trace/boottime-trace.rst | 184 ++++++++++++++++++++++++++++++
> > Documentation/trace/index.rst | 1
> > 3 files changed, 187 insertions(+)
> > create mode 100644 Documentation/trace/boottime-trace.rst
> >
>
> > diff --git a/Documentation/trace/boottime-trace.rst b/Documentation/trace/boottime-trace.rst
> > new file mode 100644
> > index 000000000000..1d10fdebf1b2
> > --- /dev/null
> > +++ b/Documentation/trace/boottime-trace.rst
> > @@ -0,0 +1,184 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=================
> > +Boot-time tracing
> > +=================
> > +
> > +:Author: Masami Hiramatsu <[email protected]>
> > +
> > +Overview
> > +========
> > +
> > +Boot-time tracing allows users to trace boot-time process including
> > +device initialization with full features of ftrace including per-event
> > +filter and actions, histograms, kprobe-events and synthetic-events,
> > +and trace instances.
> > +Since kernel cmdline is not enough to control these complex features,
> > +this uses bootconfig file to describe tracing feature programming.
> > +
> > +Options in the Boot Config
> > +==========================
> > +
> > +Here is the list of available options list for boot time tracing in
> > +boot config file [1]_. All options are under "ftrace." or "kernel."
> > +refix. See kernel parameters for the options which starts
>
> prefix.

Oops, OK.

>
> > +with "kernel." prefix [2]_.
> > +
> > +.. [1] See :ref:`Documentation/admin-guide/bootconfig.rst <bootconfig>`
> > +.. [2] See :ref:`Documentation/admin-guide/kernel-parameters.rst <kernelparameters>`
> > +
> > +Ftrace Global Options
> > +---------------------
> > +
> > +Ftrace global options have "kernel." prefix in boot config, which means
> > +these options are passed as a part of kernel legacy command line.
> > +
> > +kernel.tp_printk
> > + Output trace-event data on printk buffer too.
> > +
> > +kernel.dump_on_oops [= MODE]
> > + Dump ftrace on Oops. If MODE = 1 or omitted, dump trace buffer
> > + on all CPUs. If MODE = 2, dump a buffer on a CPU which kicks Oops.
> > +
> > +kernel.traceoff_on_warning
> > + Stop tracing if WARN_ON() occurs.
> > +
> > +kernel.fgraph_max_depth = MAX_DEPTH
> > + Set MAX_DEPTH to maximum depth of fgraph tracer.
> > +
> > +kernel.fgraph_filters = FILTER[, FILTER2...]
> > + Add fgraph tracing function filters.
> > +
> > +kernel.fgraph_notraces = FILTER[, FILTER2...]
> > + Add fgraph non tracing function filters.
>
> non-tracing

OK.

>
> > +
> > +
> > +Ftrace Per-instance Options
> > +---------------------------
> > +
> > +These options can be used for each instance including global ftrace node.
> > +
> > +ftrace.[instance.INSTANCE.]options = OPT1[, OPT2[...]]
> > + Enable given ftrace options.
> > +
> > +ftrace.[instance.INSTANCE.]trace_clock = CLOCK
> > + Set given CLOCK to ftrace's trace_clock.
> > +
> > +ftrace.[instance.INSTANCE.]buffer_size = SIZE
> > + Configure ftrace buffer size to SIZE. You can use "KB" or "MB"
> > + for that SIZE.
> > +
> > +ftrace.[instance.INSTANCE.]alloc_snapshot
> > + Allocate snapshot buffer.
> > +
> > +ftrace.[instance.INSTANCE.]cpumask = CPUMASK
> > + Set CPUMASK as trace cpu-mask.
> > +
> > +ftrace.[instance.INSTANCE.]events = EVENT[, EVENT2[...]]
> > + Enable given events on boot. You can use a wild card in EVENT.
> > +
> > +ftrace.[instance.INSTANCE.]tracer = TRACER
> > + Set TRACER to current tracer on boot. (e.g. function)
> > +
> > +ftrace.[instance.INSTANCE.]ftrace.filters
> > + This will take an array of tracing function filter rules
>
> end with '.' as above descriptions.

Yes, I missed it.

>
> > +
> > +ftrace.[instance.INSTANCE.]ftrace.notraces
> > + This will take an array of NON-tracing function filter rules
>
> ditto

OK.

>
> > +
> > +
> > +Ftrace Per-Event Options
> > +------------------------
> > +
> > +These options are setting per-event options.
> > +
> > +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
> > + Enables GROUP:EVENT tracing.
>
> Enable

OK.

>
> > +
> > +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
> > + Set FILTER rule to the GROUP:EVENT.
> > +
> > +ftrace.[instance.INSTANCE.]event.GROUP.EVENT.actions = ACTION[, ACTION2[...]]
> > + Set ACTIONs to the GROUP:EVENT.
> > +
> > +ftrace.[instance.INSTANCE.]event.kprobes.EVENT.probes = PROBE[, PROBE2[...]]
> > + Defines new kprobe event based on PROBEs. It is able to define
> > + multiple probes on one event, but those must have same type of
> > + arguments. This option is available only for the event which
> > + group name is "kprobes".
> > +
> > +ftrace.[instance.INSTANCE.]event.synthetic.EVENT.fields = FIELD[, FIELD2[...]]
> > + Defines new synthetic event with FIELDs. Each field should be
> > + "type varname".
> > +
> > +Note that kprobe and synthetic event definitions can be written under
> > +instance node, but those are also visible from other instances. So please
> > +take care for event name conflict.
> > +
> > +
> > +Examples
> > +========
> > +
> > +For example, to add filter and actions for each event, define kprobe
> > +events, and synthetic events with histogram, write a boot config like
> > +below::
> > +
> > + ftrace.event {
> > + task.task_newtask {
> > + filter = "pid < 128"
> > + enable
> > + }
> > + kprobes.vfs_read {
> > + probes = "vfs_read $arg1 $arg2"
> > + filter = "common_pid < 200"
> > + enable
> > + }
> > + synthetic.initcall_latency {
> > + fields = "unsigned long func", "u64 lat"
> > + actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
> > + }
> > + initcall.initcall_start {
> > + actions = "hist:keys=func:ts0=common_timestamp.usecs"
> > + }
> > + initcall.initcall_finish {
> > + actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
> > + }
> > + }
> > +
> > +Also, boottime tracing supports "instance" node, which allows us to run
>
> boot-time [for consistency]

OK.

>
> > +several tracers for different purpose at once. For example, one tracer
> > +is for tracing functions start with "user\_", and others tracing "kernel\_"
>
> starting

OK.

>
> > +functions, you can write boot config as below::
> > +
> > + ftrace.instance {
> > + foo {
> > + tracer = "function"
> > + ftrace.filters = "user_*"
> > + }
> > + bar {
> > + tracer = "function"
> > + ftrace.filters = "kernel_*"
> > + }
> > + }
> > +
> > +The instance node also accepts event nodes so that each instance
> > +can customize its event tracing.
> > +
> > +This boot-time tracing also supports ftrace kernel parameters via boot
> > +config.
> > +For example, following kernel parameters::
> > +
> > + trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
> > +
> > +This can be written in boot config like below::
> > +
> > + kernel {
> > + trace_options = sym-addr
> > + trace_event = "initcall:*"
> > + tp_printk
> > + trace_buf_size = 1M
> > + ftrace = function
> > + ftrace_filter = "vfs*"
> > + }
> > +
> > +Note that parameters start with "kernel" prefix instead of "ftrace".
>
> HTH.

Very helpful. Thanks!


--
Masami Hiramatsu <[email protected]>

2020-01-19 14:21:54

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v6 00/22] tracing: bootconfig: Boot-time tracing and Extra boot config

Hi Steve,

Thanks for picking up this series in your tree. I would like to fix some patches
according to Randy's comments. Should I update this series, or just send incremental
updates on top of your tree?

Thank you,


--
Masami Hiramatsu <[email protected]>

2020-01-19 15:00:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v6 00/22] tracing: bootconfig: Boot-time tracing and Extra boot config

On Sun, 19 Jan 2020 23:20:37 +0900
Masami Hiramatsu <[email protected]> wrote:

> Hi Steve,
>
> Thanks for picking up this series in your tree. I would like to fix some patches
> according to Randy's comments. Should I update this series, or just send incremental
> updates on top of your tree?
>

Hi Masami,

Just send me incremental patches. I try not to ever rebase what I push
to linux-next (except for adding changes that don't affect the content
of the code, like adding acked-by to commit messages).

Thanks,

-- Steve

2020-02-07 18:05:00

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Sat, Jan 11, 2020 at 01:04:55AM +0900, Masami Hiramatsu wrote:
> Since the current kernel command line is too short to describe
> long and many options for init (e.g. systemd command line options),
> this allows admin to use boot config for init command line.
>
> All init command line under "init." keywords will be passed to
> init.
>
> For example,
>
> init.systemd {
> unified_cgroup_hierarchy = 1
> debug_shell
> default_timeout_start_sec = 60
> }
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> init/main.c | 31 ++++++++++++++++++++++++++++---
> 1 file changed, 28 insertions(+), 3 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index c0017d9d16e7..dd7da62d99a5 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -139,6 +139,8 @@ char *saved_command_line;
> static char *static_command_line;
> /* Untouched extra command line */
> static char *extra_command_line;
> +/* Extra init arguments */
> +static char *extra_init_args;
>
> static char *execute_command;
> static char *ramdisk_execute_command;
> @@ -372,6 +374,8 @@ static void __init setup_boot_config(void)
> pr_info("Load boot config: %d bytes\n", size);
> /* keys starting with "kernel." are passed via cmdline */
> extra_command_line = xbc_make_cmdline("kernel");
> + /* Also, "init." keys are init arguments */
> + extra_init_args = xbc_make_cmdline("init");
> }
> }
> #else
> @@ -507,16 +511,18 @@ static inline void smp_prepare_cpus(unsigned int maxcpus) { }
> */
> static void __init setup_command_line(char *command_line)
> {
> - size_t len, xlen = 0;
> + size_t len, xlen = 0, ilen = 0;
>
> if (extra_command_line)
> xlen = strlen(extra_command_line);
> + if (extra_init_args)
> + ilen = strlen(extra_init_args) + 4; /* for " -- " */
>
> len = xlen + strlen(boot_command_line) + 1;
>
> - saved_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
> + saved_command_line = memblock_alloc(len + ilen, SMP_CACHE_BYTES);
> if (!saved_command_line)
> - panic("%s: Failed to allocate %zu bytes\n", __func__, len);
> + panic("%s: Failed to allocate %zu bytes\n", __func__, len + ilen);
>
> static_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
> if (!static_command_line)
> @@ -533,6 +539,22 @@ static void __init setup_command_line(char *command_line)
> }
> strcpy(saved_command_line + xlen, boot_command_line);
> strcpy(static_command_line + xlen, command_line);
> +
> + if (ilen) {
> + /*
> + * Append supplemental init boot args to saved_command_line
> + * so that user can check what command line options passed
> + * to init.
> + */
> + len = strlen(saved_command_line);
> + if (!strstr(boot_command_line, " -- ")) {
> + strcpy(saved_command_line + len, " -- ");
> + len += 4;
> + } else
> + saved_command_line[len++] = ' ';
> +
> + strcpy(saved_command_line + len, extra_init_args);
> + }

This isn't safe because it will destroy any argument with " -- " in
quotes and anything after it. For example, booting with:

thing=on acpi_osi="! -- " other=setting

will wreck acpi_osi's value and potentially overwrite "other=setting",
etc.

(Yes, this seems very unlikely, but you can't treat " -- " as special;
the command line string must be parsed correctly for double quotes, as
parse_args() does.)
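
To illustrate the difference, here is a minimal user-space sketch of a quote-aware scan for the "--" separator. It is not the kernel's parse_args(); find_init_args() is a hypothetical helper, and the only rule it assumes is that whitespace inside double quotes does not split a token.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): return a pointer to the
 * text that follows a standalone "--" token, or NULL if no such token
 * exists. Whitespace inside double quotes does not end a token.
 */
static const char *find_init_args(const char *cmdline)
{
	const char *p = cmdline;

	while (*p) {
		const char *start;
		bool in_quote = false;

		/* Skip whitespace between tokens. */
		while (*p && isspace((unsigned char)*p))
			p++;
		if (!*p)
			break;
		start = p;

		/* Scan one token, toggling quote state on each '"'. */
		while (*p && (in_quote || !isspace((unsigned char)*p))) {
			if (*p == '"')
				in_quote = !in_quote;
			p++;
		}

		/* A token that is exactly "--" is the separator. */
		if (p - start == 2 && start[0] == '-' && start[1] == '-') {
			while (*p && isspace((unsigned char)*p))
				p++;
			return p;
		}
	}
	return NULL;
}
```

With the example above, strstr(cmdline, " -- ") matches inside the quoted acpi_osi value, while find_init_args() returns NULL because no standalone "--" token exists.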

> }
>
> /*
> @@ -759,6 +781,9 @@ asmlinkage __visible void __init start_kernel(void)
> if (!IS_ERR_OR_NULL(after_dashes))
> parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
> NULL, set_init_arg);
> + if (extra_init_args)
> + parse_args("Setting extra init args", extra_init_args,
> + NULL, 0, -1, -1, NULL, set_init_arg);

Here is where you can append the extra_init_args, since parse_args()
will have done the work to find after_dashes correctly.

-Kees

>
> /*
> * These use large bootmem allocations and must precede
>

--
Kees Cook

2020-02-07 19:32:26

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Fri, Feb 07, 2020 at 10:03:16AM -0800, Kees Cook wrote:
> > +
> > + if (ilen) {
> > + /*
> > + * Append supplemental init boot args to saved_command_line
> > + * so that user can check what command line options passed
> > + * to init.
> > + */
> > + len = strlen(saved_command_line);
> > + if (!strstr(boot_command_line, " -- ")) {
> > + strcpy(saved_command_line + len, " -- ");
> > + len += 4;
> > + } else
> > + saved_command_line[len++] = ' ';
> > +
> > + strcpy(saved_command_line + len, extra_init_args);
> > + }
>
> This isn't safe because it will destroy any argument with " -- " in
> quotes and anything after it. For example, booting with:
>
> thing=on acpi_osi="! -- " other=setting
>
> will wreck acpi_osi's value and potentially overwrite "other=setting",
> etc.
>
> (Yes, this seems very unlikely, but you can't treat " -- " as special;
> the command line string must be parsed correctly for double quotes, as
> parse_args() does.)
>

I think it won't overwrite anything, it will just leave out the " -- "
that should have been added?

I wonder if this is necessary, though -- since commit b88c50ac304a ("log
arguments and environment passed to init") the init arguments will be in
the kernel log anyway.

2020-02-07 19:47:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Fri, 7 Feb 2020 10:03:16 -0800
Kees Cook <[email protected]> wrote:

> > static void __init setup_command_line(char *command_line)
> > {
> > - size_t len, xlen = 0;
> > + size_t len, xlen = 0, ilen = 0;
> >
> > if (extra_command_line)
> > xlen = strlen(extra_command_line);
> > + if (extra_init_args)
> > + ilen = strlen(extra_init_args) + 4; /* for " -- " */
> >
> > len = xlen + strlen(boot_command_line) + 1;
> >
> > - saved_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
> > + saved_command_line = memblock_alloc(len + ilen, SMP_CACHE_BYTES);
> > if (!saved_command_line)
> > - panic("%s: Failed to allocate %zu bytes\n", __func__, len);
> > + panic("%s: Failed to allocate %zu bytes\n", __func__, len + ilen);
> >
> > static_command_line = memblock_alloc(len, SMP_CACHE_BYTES);
> > if (!static_command_line)
> > @@ -533,6 +539,22 @@ static void __init setup_command_line(char *command_line)
> > }
> > strcpy(saved_command_line + xlen, boot_command_line);
> > strcpy(static_command_line + xlen, command_line);
> > +
> > + if (ilen) {
> > + /*
> > + * Append supplemental init boot args to saved_command_line
> > + * so that user can check what command line options passed
> > + * to init.
> > + */
> > + len = strlen(saved_command_line);
> > + if (!strstr(boot_command_line, " -- ")) {
> > + strcpy(saved_command_line + len, " -- ");
> > + len += 4;
> > + } else
> > + saved_command_line[len++] = ' ';
> > +
> > + strcpy(saved_command_line + len, extra_init_args);
> > + }
>
> This isn't safe because it will destroy any argument with " -- " in
> quotes and anything after it. For example, booting with:
>
> thing=on acpi_osi="! -- " other=setting
>
> will wreck acpi_osi's value and potentially overwrite "other=setting",
> etc.
>
> (Yes, this seems very unlikely, but you can't treat " -- " as special;
> the command line string must be parsed correctly for double quotes, as
> parse_args() does.)
>

This is not the args you are looking for. ;-)

There is a slight bug, but not as bad as you may think it is.
bootconfig (when added to the command line) will look for a JSON-like
file appended to the initrd, and it will parse that. That's what all the
xbc_*() functions do (extended boot commandline). If one of the options
in that JSON-like file is "init", then it will create the
extra_init_args, which will make ilen greater than zero.

The above if statement looks for that " -- ", and if it doesn't find it
(strstr() returns NULL when not found) it will then append " -- " to
the saved_command_line. If it is found, then the " -- " is not added. In
either case, the init args found in the JSON-like file in the initrd are
appended to the saved_command_line.
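
As a rough user-space model of that hunk (an illustration only, not the kernel code): build_saved_cmdline() is a hypothetical function, it ignores the extra_command_line prefix, and it uses snprintf() into a caller-supplied buffer instead of memblock_alloc() and strcpy().

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of the quoted logic: copy boot_command_line into
 * out, then append extra_init_args. The " -- " separator is added only
 * when strstr() does not find " -- " anywhere in boot_command_line.
 * Returns 0 on success, or -1 if out_size is too small.
 */
static int build_saved_cmdline(char *out, size_t out_size,
			       const char *boot_command_line,
			       const char *extra_init_args)
{
	/* Same test as the patch: a plain substring search. */
	const char *sep = strstr(boot_command_line, " -- ") ? " " : " -- ";
	int n = snprintf(out, out_size, "%s%s%s",
			 boot_command_line, sep, extra_init_args);

	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

In this model, "root=/dev/sda1" plus "systemd.debug_shell" yields "root=/dev/sda1 -- systemd.debug_shell". With the acpi_osi="! -- " example, nothing is overwritten, but the extra argument is appended after a single space with no separator in front of it.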

I did say there's a slight bug here. If you have your condition, and
you add init arguments to that JSON-like file, it won't properly add the
" -- ", and the init arguments in that file will be ignored.

That should be fixed, and I think I was able to do that below. I also
noticed that we don't properly look for "bootconfig" either.

-- Steve




> > }
> >
> > /*
> > @@ -759,6 +781,9 @@ asmlinkage __visible void __init start_kernel(void)
> > if (!IS_ERR_OR_NULL(after_dashes))
> > parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
> > NULL, set_init_arg);
> > + if (extra_init_args)
> > + parse_args("Setting extra init args", extra_init_args,
> > + NULL, 0, -1, -1, NULL, set_init_arg);

diff --git a/init/main.c b/init/main.c
index 491f1cdb3105..113c8244e5f0 100644
--- a/init/main.c
+++ b/init/main.c
@@ -142,6 +142,15 @@ static char *extra_command_line;
/* Extra init arguments */
static char *extra_init_args;

+#ifdef CONFIG_BOOT_CONFIG
+/* Is bootconfig on command line? */
+static bool bootconfig_found;
+static bool initargs_found;
+#else
+# define bootconfig_found false
+# define initargs_found false
+#endif
+
static char *execute_command;
static char *ramdisk_execute_command;

@@ -336,17 +345,32 @@ u32 boot_config_checksum(unsigned char *p, u32 size)
return ret;
}

+static int __init bootconfig_params(char *param, char *val,
+ const char *unused, void *arg)
+{
+ if (strcmp(param, "bootconfig") == 0) {
+ bootconfig_found = true;
+ } else if (strcmp(param, "--") == 0) {
+ initargs_found = true;
+ }
+ return 0;
+}
+
static void __init setup_boot_config(const char *cmdline)
{
+ static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
u32 size, csum;
char *data, *copy;
const char *p;
u32 *hdr;
int ret;

- p = strstr(cmdline, "bootconfig");
- if (!p || (p != cmdline && !isspace(*(p-1))) ||
- (p[10] && !isspace(p[10])))
+ /* All fall through to do_early_param. */
+ strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
+ parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
+ bootconfig_params);
+
+ if (!bootconfig_found)
return;

if (!initrd_end)
@@ -563,11 +587,12 @@ static void __init setup_command_line(char *command_line)
* to init.
*/
len = strlen(saved_command_line);
- if (!strstr(boot_command_line, " -- ")) {
+ if (initargs_found) {
+ saved_command_line[len++] = ' ';
+ } else {
strcpy(saved_command_line + len, " -- ");
len += 4;
- } else
- saved_command_line[len++] = ' ';
+ }

strcpy(saved_command_line + len, extra_init_args);
}

2020-02-08 00:47:17

by Kees Cook

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Fri, Feb 07, 2020 at 02:46:03PM -0500, Steven Rostedt wrote:
> On Fri, 7 Feb 2020 10:03:16 -0800
> Kees Cook <[email protected]> wrote:
> > > + len = strlen(saved_command_line);
> > > + if (!strstr(boot_command_line, " -- ")) {
> > > + strcpy(saved_command_line + len, " -- ");
> > > + len += 4;
> > > + } else
> > > + saved_command_line[len++] = ' ';
> > > +
> > > + strcpy(saved_command_line + len, extra_init_args);
> > > + }
> >
> > This isn't safe because it will destroy any argument with " -- " in
> > quotes and anything after it. For example, booting with:
> >
> > thing=on acpi_osi="! -- " other=setting
> >
> > will wreck acpi_osi's value and potentially overwrite "other=setting",
> > etc.
> >
> > (Yes, this seems very unlikely, but you can't treat " -- " as special,
> > the command line string must be correctly parsed for double quotes, as
> > parse_args() does.)
> >
>
> This is not the args you are looking for. ;-)
>
> There is a slight bug, but not as bad as you may think it is.
> bootconfig (when added to the command line) will look for a JSON-like
> file appended to the initrd, and it will parse that. That's what all the
> xbc_*() functions do (extended boot commandline). If one of the options
> in that JSON-like file is "init", then it will create the
> extra_init_args, which will make ilen greater than zero.
>
> The above if statement looks for that " -- ", and if it doesn't find it
> (strstr() returns NULL when not found) it will then append " -- " to
> the saved_command_line. If it is found, then the " -- " is not added. In
> either case, the init args found in the JSON-like file in the initrd are
> appended to the saved_command_line.
>
> I did say there's a slight bug here. If you have your condition, and
> you add init arguments to that JSON-like file, it won't properly add the
> " -- ", and the init arguments in that file will be ignored.

Ah, right, it's even more slight, sorry, I had the strstr() in my head
still. So, yes, with an "init" section and a very goofy " -- " present
in a kernel bootparam string value, the appended init args will be
parsed as kernel options.

--
Kees Cook

2020-08-02 02:34:56

by Arvind Sankar

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Fri, Feb 07, 2020 at 02:46:03PM -0500, Steven Rostedt wrote:
>
> diff --git a/init/main.c b/init/main.c
> index 491f1cdb3105..113c8244e5f0 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -142,6 +142,15 @@ static char *extra_command_line;
> /* Extra init arguments */
> static char *extra_init_args;
>
> +#ifdef CONFIG_BOOT_CONFIG
> +/* Is bootconfig on command line? */
> +static bool bootconfig_found;
> +static bool initargs_found;
> +#else
> +# define bootconfig_found false
> +# define initargs_found false
> +#endif
> +
> static char *execute_command;
> static char *ramdisk_execute_command;
>
> @@ -336,17 +345,32 @@ u32 boot_config_checksum(unsigned char *p, u32 size)
> return ret;
> }
>
> +static int __init bootconfig_params(char *param, char *val,
> + const char *unused, void *arg)
> +{
> + if (strcmp(param, "bootconfig") == 0) {
> + bootconfig_found = true;
> + } else if (strcmp(param, "--") == 0) {
> + initargs_found = true;
> + }
> + return 0;
> +}
> +

I came across this as I was poking around some of the command line
parsing. AFAICT, initargs_found will never be set to true here, because
parse_args handles "--" itself by immediately returning: it doesn't
invoke the callback for it. So you'd instead have to check the return of
parse_args("bootconfig"...) to detect the initargs_found case.
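
The behaviour described above can be modelled in userspace. This is only a
sketch under stated assumptions: toy_parse_args() and toy_cb() are
hypothetical stand-ins, not the kernel's parse_args() or
bootconfig_params(); the toy splits on spaces only and ignores quoting and
"param=value" handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef void (*param_cb)(const char *param);

bool bootconfig_seen;
bool dashes_seen_in_cb;

/* Mirrors the shape of the callback in the patch under review. */
void toy_cb(const char *param)
{
	if (strcmp(param, "bootconfig") == 0)
		bootconfig_seen = true;
	else if (strcmp(param, "--") == 0)
		dashes_seen_in_cb = true;	/* never reached, see below */
}

/*
 * Toy parser: splits a writable buffer on spaces and calls cb for each
 * token. On a "--" token it returns a pointer to the remaining text
 * WITHOUT calling cb. Returns NULL when the whole string was consumed.
 */
char *toy_parse_args(char *args, param_cb cb)
{
	assert(args != NULL && cb != NULL);
	while (*args) {
		char *tok, *end;

		while (*args == ' ')
			args++;
		if (!*args)
			break;

		tok = args;
		end = tok + strcspn(tok, " ");
		args = *end ? end + 1 : end;	/* advance before terminating */
		*end = '\0';

		if (strcmp(tok, "--") == 0)
			return args;		/* callback is skipped */
		cb(tok);
	}
	return NULL;
}
```

Calling toy_parse_args() on "bootconfig quiet -- single" sets
bootconfig_seen, leaves dashes_seen_in_cb false, and returns a pointer to
"single", so only the return value reveals that "--" was present.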

> static void __init setup_boot_config(const char *cmdline)
> {
> + static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
> u32 size, csum;
> char *data, *copy;
> const char *p;
> u32 *hdr;
> int ret;
>
> - p = strstr(cmdline, "bootconfig");
> - if (!p || (p != cmdline && !isspace(*(p-1))) ||
> - (p[10] && !isspace(p[10])))
> + /* All fall through to do_early_param. */
> + strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
> + parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
> + bootconfig_params);
> +
> + if (!bootconfig_found)
> return;

2020-08-03 15:04:25

by Masami Hiramatsu

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Sat, 1 Aug 2020 22:33:18 -0400
Arvind Sankar <[email protected]> wrote:

> On Fri, Feb 07, 2020 at 02:46:03PM -0500, Steven Rostedt wrote:
> >
> > diff --git a/init/main.c b/init/main.c
> > index 491f1cdb3105..113c8244e5f0 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -142,6 +142,15 @@ static char *extra_command_line;
> > /* Extra init arguments */
> > static char *extra_init_args;
> >
> > +#ifdef CONFIG_BOOT_CONFIG
> > +/* Is bootconfig on command line? */
> > +static bool bootconfig_found;
> > +static bool initargs_found;
> > +#else
> > +# define bootconfig_found false
> > +# define initargs_found false
> > +#endif
> > +
> > static char *execute_command;
> > static char *ramdisk_execute_command;
> >
> > @@ -336,17 +345,32 @@ u32 boot_config_checksum(unsigned char *p, u32 size)
> > return ret;
> > }
> >
> > +static int __init bootconfig_params(char *param, char *val,
> > + const char *unused, void *arg)
> > +{
> > + if (strcmp(param, "bootconfig") == 0) {
> > + bootconfig_found = true;
> > + } else if (strcmp(param, "--") == 0) {
> > + initargs_found = true;
> > + }
> > + return 0;
> > +}
> > +
>
> I came across this as I was poking around some of the command line
> parsing. AFAICT, initargs_found will never be set to true here, because
> parse_args handles "--" itself by immediately returning: it doesn't
> invoke the callback for it. So you'd instead have to check the return of
> parse_args("bootconfig"...) to detect the initargs_found case.

Oops, good catch!
Does this fix the problem?

From b078e8b02ad54aea74f8c3645fc11dd3a1cdc1e7 Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <[email protected]>
Date: Mon, 3 Aug 2020 23:57:29 +0900
Subject: [PATCH] bootconfig: Fix to find the initargs correctly

Since parse_args() stops parsing at '--', bootconfig_params()
will never get '--' as a param and initargs_found will never be true.
As a result, if we pass some init arguments via the bootconfig,
they are always appended to the kernel command line with '--',
and the user will see a double '--'.

To fix this correctly, check the return value of parse_args()
and set initargs_found to true if the return value is not an error
but a valid address.

Fixes: f61872bb58a1 ("bootconfig: Use parse_args() to find bootconfig and '--'")
Cc: [email protected]
Reported-by: Arvind Sankar <[email protected]>
Suggested-by: Arvind Sankar <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
---
init/main.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/init/main.c b/init/main.c
index 0ead83e86b5a..627f9230dbe8 100644
--- a/init/main.c
+++ b/init/main.c
@@ -387,8 +387,6 @@ static int __init bootconfig_params(char *param, char *val,
{
if (strcmp(param, "bootconfig") == 0) {
bootconfig_found = true;
- } else if (strcmp(param, "--") == 0) {
- initargs_found = true;
}
return 0;
}
@@ -399,19 +397,23 @@ static void __init setup_boot_config(const char *cmdline)
const char *msg;
int pos;
u32 size, csum;
- char *data, *copy;
+ char *data, *copy, *err;
int ret;

/* Cut out the bootconfig data even if we have no bootconfig option */
data = get_boot_config_from_initrd(&size, &csum);

strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
- parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
- bootconfig_params);
+ err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
+ bootconfig_params);

- if (!bootconfig_found)
+ if (IS_ERR(err) || !bootconfig_found)
return;

+ /* parse_args() stops at '--' and returns an address */
+ if (!IS_ERR(err) && err)
+ initargs_found = true;
+
if (!data) {
pr_err("'bootconfig' found on command line, but no bootconfig found\n");
return;
--
2.25.1

2020-08-03 15:30:36

by Arvind Sankar

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Tue, Aug 04, 2020 at 12:03:45AM +0900, Masami Hiramatsu wrote:
> On Sat, 1 Aug 2020 22:33:18 -0400
> Arvind Sankar <[email protected]> wrote:
> >
> > I came across this as I was poking around some of the command line
> > parsing. AFAICT, initargs_found will never be set to true here, because
> > parse_args handles "--" itself by immediately returning: it doesn't
> > invoke the callback for it. So you'd instead have to check the return of
> > parse_args("bootconfig"...) to detect the initargs_found case.
>
> Oops, good catch!
> Does this fix the problem?

Note that I found the issue by code inspection; I don't have an actual
test case. But the change looks good to me, with one comment below.

>
> strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
> - parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
> - bootconfig_params);
> + err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
> + bootconfig_params);
>
> - if (!bootconfig_found)
> + if (IS_ERR(err) || !bootconfig_found)
> return;
>
> + /* parse_args() stops at '--' and returns an address */
> + if (!IS_ERR(err) && err)
> + initargs_found = true;
> +

I think you can drop the second IS_ERR, since we already checked that.
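
As a rough illustration of why the second check is redundant, here is a
userspace model. The names toy_is_err(), classify() and TOY_MAX_ERRNO are
hypothetical; toy_is_err() only imitates the kernel's IS_ERR() convention
of treating the top 4095 address values as encoded errno codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Userspace imitation of IS_ERR(): error codes live at the top of the
 * address space. */
bool toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

enum result { SKIP, NO_INITARGS, INITARGS };

/*
 * Same control flow as the hunk above: once the early return has
 * filtered out error pointers, a plain NULL test is enough.
 */
enum result classify(const void *err, bool bootconfig_found)
{
	if (toy_is_err(err) || !bootconfig_found)
		return SKIP;

	/* err is known not to be an error pointer here. */
	return err ? INITARGS : NO_INITARGS;
}
```

A non-NULL, non-error pointer yields INITARGS, NULL yields NO_INITARGS,
and an encoded error such as -22 never reaches the second test.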

> if (!data) {
> pr_err("'bootconfig' found on command line, but no bootconfig found\n");
> return;
> --
> 2.25.1

2020-08-03 17:23:41

by Steven Rostedt

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Mon, 3 Aug 2020 11:29:59 -0400
Arvind Sankar <[email protected]> wrote:

> > + /* parse_args() stops at '--' and returns an address */
> > + if (!IS_ERR(err) && err)
> > + initargs_found = true;
> > +
>
> I think you can drop the second IS_ERR, since we already checked that.

Masami,

Can you send this with the update as a normal patch (not a Cc to this
thread). That way it gets caught by my patchwork scanning of my inbox.

Thanks!

(/me is currently going through all his patchwork patches to pull in
for the merge window.)

-- Steve

2020-08-04 00:29:51

by Masami Hiramatsu

Subject: Re: [PATCH v6 08/22] bootconfig: init: Allow admin to use bootconfig for init command line

On Mon, 3 Aug 2020 13:22:38 -0400
Steven Rostedt <[email protected]> wrote:

> On Mon, 3 Aug 2020 11:29:59 -0400
> Arvind Sankar <[email protected]> wrote:
>
> > > + /* parse_args() stops at '--' and returns an address */
> > > + if (!IS_ERR(err) && err)
> > > + initargs_found = true;
> > > +
> >
> > I think you can drop the second IS_ERR, since we already checked that.
>
> Masami,
>
> Can you send this with the update as a normal patch (not a Cc to this
> thread). That way it gets caught by my patchwork scanning of my inbox.

OK, I'll update it.

>
> Thanks!
>
> (/me is currently going through all his patchwork patches to pull in
> for the merge window.)

Thank you!

>
> -- Steve


--
Masami Hiramatsu <[email protected]>