2017-12-26 07:46:56

by Masami Hiramatsu

Subject: [RFC PATCH bpf-next v2 0/4] Separate error injection table from kprobes

Hi Josef and Alexei,

Here is the 2nd version of the patches moving the error injection
table out of kprobes. In this series I made some small fixes and
added function-based fault injection.

Here is the previous version:

https://lkml.org/lkml/2017/12/22/554

There are 2 main reasons why I separated it from kprobes.

- kprobes users can modify the execution path not only at
error-injection whitelisted functions but also at other
functions. I don't want to suggest to users that such a
limitation comes from kprobes itself.

- This error injection information is also useful for
ftrace (function-hook) and livepatch. It should not
be limited by CONFIG_KPROBES.

So I introduced CONFIG_FUNCTION_ERROR_INJECTION for this feature.
Also CONFIG_FAIL_FUNCTION is added, which provides a function-based
error injection interface via debugfs following the fault-injection
framework. See [4/4].

Any thoughts?

BTW, I think we should add an error-range description to the
ALLOW_ERROR_INJECTION() macro. If a user sets a success
return value and overrides a function with it by mistake, the
caller may corrupt data or cause a kernel panic.
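
For example, a range-annotated macro might look like the sketch
below (hypothetical, not part of this series; error_injection_entry
is a made-up structure, and the section layout would have to change
accordingly):

struct error_injection_entry {		/* hypothetical */
	unsigned long addr;
	long min_err;			/* e.g. -MAX_ERRNO */
	long max_err;			/* e.g. -1 */
};

#define ALLOW_ERROR_INJECTION_RANGE(fname, min_err, max_err)	\
static struct error_injection_entry __used			\
	__attribute__((__section__("_error_injection_whitelist"))) \
	_eil_entry_##fname = {					\
		.addr = (unsigned long)fname,			\
		.min_err = (min_err),				\
		.max_err = (max_err),				\
	};

/* e.g.: ALLOW_ERROR_INJECTION_RANGE(open_ctree, -MAX_ERRNO, -1) */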

Thank you,

---

Masami Hiramatsu (4):
tracing/kprobe: bpf: Check error injectable event is on function entry
tracing/kprobe: bpf: Compare instruction pointer with original one
error-injection: Separate error-injection from kprobe
error-injection: Support fault injection framework


Documentation/fault-injection/fault-injection.txt | 5 +
arch/Kconfig | 2
arch/x86/Kconfig | 2
arch/x86/include/asm/error-injection.h | 12 +
arch/x86/kernel/kprobes/ftrace.c | 14 -
arch/x86/lib/Makefile | 2
arch/x86/lib/error-inject.c | 19 ++
fs/btrfs/disk-io.c | 2
fs/btrfs/free-space-cache.c | 2
include/asm-generic/error-injection.h | 20 ++
include/asm-generic/vmlinux.lds.h | 14 +
include/linux/bpf.h | 12 -
include/linux/error-injection.h | 21 ++
include/linux/kprobes.h | 1
include/linux/module.h | 6 -
kernel/Makefile | 1
kernel/fail_function.c | 169 ++++++++++++++++++
kernel/kprobes.c | 163 -----------------
kernel/module.c | 8 -
kernel/trace/Kconfig | 4
kernel/trace/bpf_trace.c | 9 -
kernel/trace/trace_kprobe.c | 32 +--
kernel/trace/trace_probe.h | 12 +
lib/Kconfig.debug | 14 +
lib/Makefile | 1
lib/error-inject.c | 198 +++++++++++++++++++++
26 files changed, 506 insertions(+), 239 deletions(-)
create mode 100644 arch/x86/include/asm/error-injection.h
create mode 100644 arch/x86/lib/error-inject.c
create mode 100644 include/asm-generic/error-injection.h
create mode 100644 include/linux/error-injection.h
create mode 100644 kernel/fail_function.c
create mode 100644 lib/error-inject.c

--
Masami Hiramatsu (Linaro)


2017-12-26 07:47:23

by Masami Hiramatsu

Subject: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

Check whether an error-injectable event is on function entry or not.
Currently it checks whether the event is an ftrace-based kprobe,
but that is wrong. It should check whether the event is on the entry
of the target function. Since error injection overrides a function
to just return with a modified return value, that operation must
be done before the target function starts building its stack frame.

As a side effect, bpf error injection no longer needs to depend on
the function tracer. It can work with software-breakpoint based
kprobe events too.
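
For illustration (a hypothetical session; open_ctree is one of the
whitelisted functions): both probes below can be created, but after
this patch a bpf program using bpf_override_return() is expected to
attach only to e1, which is on the function entry, while e2 should
be rejected with -EINVAL:

  echo 'p:e1 open_ctree' >> /sys/kernel/debug/tracing/kprobe_events
  echo 'p:e2 open_ctree+0x10' >> /sys/kernel/debug/tracing/kprobe_events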

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/Kconfig | 2 --
kernel/trace/bpf_trace.c | 6 +++---
kernel/trace/trace_kprobe.c | 8 +++++---
kernel/trace/trace_probe.h | 12 ++++++------
4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index ae3a2d519e50..6400e1bf97c5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -533,9 +533,7 @@ config FUNCTION_PROFILER
config BPF_KPROBE_OVERRIDE
bool "Enable BPF programs to override a kprobed function"
depends on BPF_EVENTS
- depends on KPROBES_ON_FTRACE
depends on HAVE_KPROBE_OVERRIDE
- depends on DYNAMIC_FTRACE_WITH_REGS
default n
help
Allows BPF to override the execution of a probed function and
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f6d2327ecb59..d663660f8392 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
int ret = -EEXIST;

/*
- * Kprobe override only works for ftrace based kprobes, and only if they
- * are on the opt-in list.
+ * Kprobe override only works if they are on the function entry,
+ * and only if they are on the opt-in list.
*/
if (prog->kprobe_override &&
- (!trace_kprobe_ftrace(event->tp_event) ||
+ (!trace_kprobe_on_func_entry(event->tp_event) ||
!trace_kprobe_error_injectable(event->tp_event)))
return -EINVAL;

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 91f4b57dab82..265e3e27e8dc 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
return nhit;
}

-int trace_kprobe_ftrace(struct trace_event_call *call)
+bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
- return kprobe_ftrace(&tk->rp.kp);
+
+ return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
+ tk->rp.kp.offset);
}

-int trace_kprobe_error_injectable(struct trace_event_call *call)
+bool trace_kprobe_error_injectable(struct trace_event_call *call)
{
struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
unsigned long addr;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 5e54d748c84c..e101c5bb9eda 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -252,8 +252,8 @@ struct symbol_cache;
unsigned long update_symbol_cache(struct symbol_cache *sc);
void free_symbol_cache(struct symbol_cache *sc);
struct symbol_cache *alloc_symbol_cache(const char *sym, long offset);
-int trace_kprobe_ftrace(struct trace_event_call *call);
-int trace_kprobe_error_injectable(struct trace_event_call *call);
+bool trace_kprobe_on_func_entry(struct trace_event_call *call);
+bool trace_kprobe_error_injectable(struct trace_event_call *call);
#else
/* uprobes do not support symbol fetch methods */
#define fetch_symbol_u8 NULL
@@ -280,14 +280,14 @@ alloc_symbol_cache(const char *sym, long offset)
return NULL;
}

-static inline int trace_kprobe_ftrace(struct trace_event_call *call)
+static inline bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
- return 0;
+ return false;
}

-static inline int trace_kprobe_error_injectable(struct trace_event_call *call)
+static inline bool trace_kprobe_error_injectable(struct trace_event_call *call)
{
- return 0;
+ return false;
}
#endif /* CONFIG_KPROBE_EVENTS */


2017-12-26 07:47:52

by Masami Hiramatsu

Subject: [RFC PATCH bpf-next v2 2/4] tracing/kprobe: bpf: Compare instruction pointer with original one

Compare the instruction pointer with the original one saved on
entry, instead of using the per-cpu bpf_kprobe_override flag
(see the condensed sketch below).

This patch also consolidates the reset_current_kprobe() and
preempt_enable_no_resched() blocks; those can be done
in one place.
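
In short, the handler pattern after this patch (condensed from the
diff below) is:

	unsigned long orig_ip = instruction_pointer(regs);

	ret = trace_call_bpf(call, regs);
	if (orig_ip != instruction_pointer(regs)) {
		/* bpf_override_return() rewrote regs->ip */
		reset_current_kprobe();
		preempt_enable_no_resched();
		return 1;	/* skip single-stepping */
	}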

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/trace/bpf_trace.c | 1 -
kernel/trace/trace_kprobe.c | 21 +++++++--------------
2 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index d663660f8392..cefa9b0e396c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -83,7 +83,6 @@ EXPORT_SYMBOL_GPL(trace_call_bpf);
#ifdef CONFIG_BPF_KPROBE_OVERRIDE
BPF_CALL_2(bpf_override_return, struct pt_regs *, regs, unsigned long, rc)
{
- __this_cpu_write(bpf_kprobe_override, 1);
regs_set_return_value(regs, rc);
arch_ftrace_kprobe_override_function(regs);
return 0;
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 265e3e27e8dc..a7c7035963f2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -42,8 +42,6 @@ struct trace_kprobe {
(offsetof(struct trace_kprobe, tp.args) + \
(sizeof(struct probe_arg) * (n)))

-DEFINE_PER_CPU(int, bpf_kprobe_override);
-
static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
{
return tk->rp.handler != NULL;
@@ -1204,6 +1202,7 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
int rctx;

if (bpf_prog_array_valid(call)) {
+ unsigned long orig_ip = instruction_pointer(regs);
int ret;

ret = trace_call_bpf(call, regs);
@@ -1211,12 +1210,13 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
/*
* We need to check and see if we modified the pc of the
* pt_regs, and if so clear the kprobe and return 1 so that we
- * don't do the instruction skipping. Also reset our state so
- * we are clean the next pass through.
+ * don't do the single stepping.
+ * The ftrace kprobe handler leaves it up to us to re-enable
+ * preemption here before returning if we've modified the ip.
*/
- if (__this_cpu_read(bpf_kprobe_override)) {
- __this_cpu_write(bpf_kprobe_override, 0);
+ if (orig_ip != instruction_pointer(regs)) {
reset_current_kprobe();
+ preempt_enable_no_resched();
return 1;
}
if (!ret)
@@ -1324,15 +1324,8 @@ static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
if (tk->tp.flags & TP_FLAG_TRACE)
kprobe_trace_func(tk, regs);
#ifdef CONFIG_PERF_EVENTS
- if (tk->tp.flags & TP_FLAG_PROFILE) {
+ if (tk->tp.flags & TP_FLAG_PROFILE)
ret = kprobe_perf_func(tk, regs);
- /*
- * The ftrace kprobe handler leaves it up to us to re-enable
- * preemption here before returning if we've modified the ip.
- */
- if (ret)
- preempt_enable_no_resched();
- }
#endif
return ret;
}

2017-12-26 07:48:22

by Masami Hiramatsu

Subject: [RFC PATCH bpf-next v2 3/4] error-injection: Separate error-injection from kprobe

The error-injection framework is not limited to use by kprobes
or bpf; other kernel subsystems can use it freely for checking
the safety of error injection, e.g. livepatch, ftrace, etc.
So this separates the error-injection framework from kprobes.

Some differences have been made:

- "kprobe" word is removed from any APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to
ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
feature. It is automatically enabled if the arch supports
error injection feature for kprobe or ftrace etc.
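
For example, a subsystem marks a function as error-injectable and a
consumer checks the list like this (a sketch using the APIs in this
patch; my_init_func and check_target are made-up examples):

#include <linux/error-injection.h>

static int my_init_func(void)		/* hypothetical target */
{
	return 0;
}
ALLOW_ERROR_INJECTION(my_init_func);

/* consumer side, e.g. before arming an override at 'addr': */
static int check_target(unsigned long addr)
{
	return within_error_injection_list(addr) ? 0 : -EINVAL;
}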

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v2:
- Fix the override function name to override_function_with_return()
- Show only the function name in the list; users don't have to care
about its size, since function override only happens at the entry.
---
arch/Kconfig | 2
arch/x86/Kconfig | 2
arch/x86/include/asm/error-injection.h | 12 ++
arch/x86/kernel/kprobes/ftrace.c | 14 --
arch/x86/lib/Makefile | 2
arch/x86/lib/error-inject.c | 19 +++
fs/btrfs/disk-io.c | 2
fs/btrfs/free-space-cache.c | 2
include/asm-generic/error-injection.h | 20 +++
include/asm-generic/vmlinux.lds.h | 14 +-
include/linux/bpf.h | 12 --
include/linux/error-injection.h | 21 +++
include/linux/kprobes.h | 1
include/linux/module.h | 6 -
kernel/kprobes.c | 163 --------------------------
kernel/module.c | 8 +
kernel/trace/Kconfig | 2
kernel/trace/bpf_trace.c | 2
kernel/trace/trace_kprobe.c | 3
lib/Kconfig.debug | 4 +
lib/Makefile | 1
lib/error-inject.c | 198 ++++++++++++++++++++++++++++++++
22 files changed, 300 insertions(+), 210 deletions(-)
create mode 100644 arch/x86/include/asm/error-injection.h
create mode 100644 arch/x86/lib/error-inject.c
create mode 100644 include/asm-generic/error-injection.h
create mode 100644 include/linux/error-injection.h
create mode 100644 lib/error-inject.c

diff --git a/arch/Kconfig b/arch/Kconfig
index d3f4aaf9cb7a..97376accfb14 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -196,7 +196,7 @@ config HAVE_OPTPROBES
config HAVE_KPROBES_ON_FTRACE
bool

-config HAVE_KPROBE_OVERRIDE
+config HAVE_FUNCTION_ERROR_INJECTION
bool

config HAVE_NMI
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 04d66e6fa447..fc519e3ae754 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -154,7 +154,7 @@ config X86
select HAVE_KERNEL_XZ
select HAVE_KPROBES
select HAVE_KPROBES_ON_FTRACE
- select HAVE_KPROBE_OVERRIDE
+ select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_KRETPROBES
select HAVE_KVM
select HAVE_LIVEPATCH if X86_64
diff --git a/arch/x86/include/asm/error-injection.h b/arch/x86/include/asm/error-injection.h
new file mode 100644
index 000000000000..6c2a133622f4
--- /dev/null
+++ b/arch/x86/include/asm/error-injection.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ERROR_INJECTION_H
+#define _ASM_ERROR_INJECTION_H
+
+#include <asm/linkage.h>
+#include <asm/ptrace.h>
+#include <asm-generic/error-injection.h>
+
+asmlinkage void just_return_func(void);
+void override_function_with_return(struct pt_regs *regs);
+
+#endif /* _ASM_ERROR_INJECTION_H */
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 1ea748d682fd..8dc0161cec8f 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -97,17 +97,3 @@ int arch_prepare_kprobe_ftrace(struct kprobe *p)
p->ainsn.boostable = false;
return 0;
}
-
-asmlinkage void override_func(void);
-asm(
- ".type override_func, @function\n"
- "override_func:\n"
- " ret\n"
- ".size override_func, .-override_func\n"
-);
-
-void arch_ftrace_kprobe_override_function(struct pt_regs *regs)
-{
- regs->ip = (unsigned long)&override_func;
-}
-NOKPROBE_SYMBOL(arch_ftrace_kprobe_override_function);
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 7b181b61170e..081f09435d28 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,8 @@ lib-y += memcpy_$(BITS).o
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
+lib-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

diff --git a/arch/x86/lib/error-inject.c b/arch/x86/lib/error-inject.c
new file mode 100644
index 000000000000..7b881d03d0dd
--- /dev/null
+++ b/arch/x86/lib/error-inject.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/error-injection.h>
+#include <linux/kprobes.h>
+
+asmlinkage void just_return_func(void);
+
+asm(
+ ".type just_return_func, @function\n"
+ "just_return_func:\n"
+ " ret\n"
+ ".size just_return_func, .-just_return_func\n"
+);
+
+void override_function_with_return(struct pt_regs *regs)
+{
+ regs->ip = (unsigned long)&just_return_func;
+}
+NOKPROBE_SYMBOL(override_function_with_return);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5da18ebc9222..5c540129ad81 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3124,7 +3124,7 @@ int open_ctree(struct super_block *sb,
goto fail_block_groups;
goto retry_root_backup;
}
-BPF_ALLOW_ERROR_INJECTION(open_ctree);
+ALLOW_ERROR_INJECTION(open_ctree);

static void btrfs_end_buffer_write_sync(struct buffer_head *bh, int uptodate)
{
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index fb1382893bfc..2a75e088b215 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -333,7 +333,7 @@ static int io_ctl_init(struct btrfs_io_ctl *io_ctl, struct inode *inode,

return 0;
}
-BPF_ALLOW_ERROR_INJECTION(io_ctl_init);
+ALLOW_ERROR_INJECTION(io_ctl_init);

static void io_ctl_free(struct btrfs_io_ctl *io_ctl)
{
diff --git a/include/asm-generic/error-injection.h b/include/asm-generic/error-injection.h
new file mode 100644
index 000000000000..08352c9d9f97
--- /dev/null
+++ b/include/asm-generic/error-injection.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_ERROR_INJECTION_H
+#define _ASM_GENERIC_ERROR_INJECTION_H
+
+#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
+#ifdef CONFIG_FUNCTION_ERROR_INJECTION
+/*
+ * Whitelist generating macro. Specify functions which can be
+ * error-injectable using this macro.
+ */
+#define ALLOW_ERROR_INJECTION(fname) \
+static unsigned long __used \
+ __attribute__((__section__("_error_injection_whitelist"))) \
+ _eil_addr_##fname = (unsigned long)fname;
+#else
+#define ALLOW_ERROR_INJECTION(fname)
+#endif
+#endif
+
+#endif /* _ASM_GENERIC_ERROR_INJECTION_H */
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index a2e8582d094a..fad8b8c4210c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -136,13 +136,13 @@
#define KPROBE_BLACKLIST()
#endif

-#ifdef CONFIG_BPF_KPROBE_OVERRIDE
-#define ERROR_INJECT_LIST() . = ALIGN(8); \
- VMLINUX_SYMBOL(__start_kprobe_error_inject_list) = .; \
- KEEP(*(_kprobe_error_inject_list)) \
- VMLINUX_SYMBOL(__stop_kprobe_error_inject_list) = .;
+#ifdef CONFIG_FUNCTION_ERROR_INJECTION
+#define ERROR_INJECT_WHITELIST() . = ALIGN(8); \
+ VMLINUX_SYMBOL(__start_error_injection_whitelist) = .; \
+ KEEP(*(_error_injection_whitelist)) \
+ VMLINUX_SYMBOL(__stop_error_injection_whitelist) = .;
#else
-#define ERROR_INJECT_LIST()
+#define ERROR_INJECT_WHITELIST()
#endif

#ifdef CONFIG_EVENT_TRACING
@@ -573,7 +573,7 @@
FTRACE_EVENTS() \
TRACE_SYSCALLS() \
KPROBE_BLACKLIST() \
- ERROR_INJECT_LIST() \
+ ERROR_INJECT_WHITELIST() \
MEM_DISCARD(init.rodata) \
CLK_OF_TABLES() \
RESERVEDMEM_OF_TABLES() \
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index da54ef644fcd..6426a6a81b3e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -16,6 +16,7 @@
#include <linux/rbtree_latch.h>
#include <linux/numa.h>
#include <linux/wait.h>
+#include <linux/error-injection.h>

struct perf_event;
struct bpf_prog;
@@ -583,15 +584,4 @@ extern const struct bpf_func_proto bpf_sock_map_update_proto;
void bpf_user_rnd_init_once(void);
u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);

-#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
-#ifdef CONFIG_BPF_KPROBE_OVERRIDE
-#define BPF_ALLOW_ERROR_INJECTION(fname) \
-static unsigned long __used \
- __attribute__((__section__("_kprobe_error_inject_list"))) \
- _eil_addr_##fname = (unsigned long)fname;
-#else
-#define BPF_ALLOW_ERROR_INJECTION(fname)
-#endif
-#endif
-
#endif /* _LINUX_BPF_H */
diff --git a/include/linux/error-injection.h b/include/linux/error-injection.h
new file mode 100644
index 000000000000..130a67c50dac
--- /dev/null
+++ b/include/linux/error-injection.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_ERROR_INJECTION_H
+#define _LINUX_ERROR_INJECTION_H
+
+#ifdef CONFIG_FUNCTION_ERROR_INJECTION
+
+#include <asm/error-injection.h>
+
+extern bool within_error_injection_list(unsigned long addr);
+
+#else /* !CONFIG_FUNCTION_ERROR_INJECTION */
+
+#include <asm-generic/error-injection.h>
+static inline bool within_error_injection_list(unsigned long addr)
+{
+ return false;
+}
+
+#endif
+
+#endif /* _LINUX_ERROR_INJECTION_H */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 963fd364f3d6..9440a2fc8893 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -271,7 +271,6 @@ extern bool arch_kprobe_on_func_entry(unsigned long offset);
extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);

extern bool within_kprobe_blacklist(unsigned long addr);
-extern bool within_kprobe_error_injection_list(unsigned long addr);

struct kprobe_insn_cache {
struct mutex mutex;
diff --git a/include/linux/module.h b/include/linux/module.h
index 548fa09fa806..792e51d83bda 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -476,9 +476,9 @@ struct module {
unsigned int num_ctors;
#endif

-#ifdef CONFIG_BPF_KPROBE_OVERRIDE
- unsigned int num_kprobe_ei_funcs;
- unsigned long *kprobe_ei_funcs;
+#ifdef CONFIG_FUNCTION_ERROR_INJECTION
+ unsigned int num_ei_funcs;
+ unsigned long *ei_funcs;
#endif
} ____cacheline_aligned __randomize_layout;
#ifndef MODULE_ARCH_INIT
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index b4aab48ad258..da2ccf142358 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -83,16 +83,6 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
return &(kretprobe_table_locks[hash].lock);
}

-/* List of symbols that can be overriden for error injection. */
-static LIST_HEAD(kprobe_error_injection_list);
-static DEFINE_MUTEX(kprobe_ei_mutex);
-struct kprobe_ei_entry {
- struct list_head list;
- unsigned long start_addr;
- unsigned long end_addr;
- void *priv;
-};
-
/* Blacklist -- list of struct kprobe_blacklist_entry */
static LIST_HEAD(kprobe_blacklist);

@@ -1404,17 +1394,6 @@ bool within_kprobe_blacklist(unsigned long addr)
return false;
}

-bool within_kprobe_error_injection_list(unsigned long addr)
-{
- struct kprobe_ei_entry *ent;
-
- list_for_each_entry(ent, &kprobe_error_injection_list, list) {
- if (addr >= ent->start_addr && addr < ent->end_addr)
- return true;
- }
- return false;
-}
-
/*
* If we have a symbol_name argument, look it up and add the offset field
* to it. This way, we can specify a relative address to a symbol.
@@ -2189,86 +2168,6 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
return 0;
}

-#ifdef CONFIG_BPF_KPROBE_OVERRIDE
-/* Markers of the _kprobe_error_inject_list section */
-extern unsigned long __start_kprobe_error_inject_list[];
-extern unsigned long __stop_kprobe_error_inject_list[];
-
-/*
- * Lookup and populate the kprobe_error_injection_list.
- *
- * For safety reasons we only allow certain functions to be overriden with
- * bpf_error_injection, so we need to populate the list of the symbols that have
- * been marked as safe for overriding.
- */
-static void populate_kprobe_error_injection_list(unsigned long *start,
- unsigned long *end,
- void *priv)
-{
- unsigned long *iter;
- struct kprobe_ei_entry *ent;
- unsigned long entry, offset = 0, size = 0;
-
- mutex_lock(&kprobe_ei_mutex);
- for (iter = start; iter < end; iter++) {
- entry = arch_deref_entry_point((void *)*iter);
-
- if (!kernel_text_address(entry) ||
- !kallsyms_lookup_size_offset(entry, &size, &offset)) {
- pr_err("Failed to find error inject entry at %p\n",
- (void *)entry);
- continue;
- }
-
- ent = kmalloc(sizeof(*ent), GFP_KERNEL);
- if (!ent)
- break;
- ent->start_addr = entry;
- ent->end_addr = entry + size;
- ent->priv = priv;
- INIT_LIST_HEAD(&ent->list);
- list_add_tail(&ent->list, &kprobe_error_injection_list);
- }
- mutex_unlock(&kprobe_ei_mutex);
-}
-
-static void __init populate_kernel_kprobe_ei_list(void)
-{
- populate_kprobe_error_injection_list(__start_kprobe_error_inject_list,
- __stop_kprobe_error_inject_list,
- NULL);
-}
-
-static void module_load_kprobe_ei_list(struct module *mod)
-{
- if (!mod->num_kprobe_ei_funcs)
- return;
- populate_kprobe_error_injection_list(mod->kprobe_ei_funcs,
- mod->kprobe_ei_funcs +
- mod->num_kprobe_ei_funcs, mod);
-}
-
-static void module_unload_kprobe_ei_list(struct module *mod)
-{
- struct kprobe_ei_entry *ent, *n;
- if (!mod->num_kprobe_ei_funcs)
- return;
-
- mutex_lock(&kprobe_ei_mutex);
- list_for_each_entry_safe(ent, n, &kprobe_error_injection_list, list) {
- if (ent->priv == mod) {
- list_del_init(&ent->list);
- kfree(ent);
- }
- }
- mutex_unlock(&kprobe_ei_mutex);
-}
-#else
-static inline void __init populate_kernel_kprobe_ei_list(void) {}
-static inline void module_load_kprobe_ei_list(struct module *m) {}
-static inline void module_unload_kprobe_ei_list(struct module *m) {}
-#endif
-
/* Module notifier call back, checking kprobes on the module */
static int kprobes_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
@@ -2279,11 +2178,6 @@ static int kprobes_module_callback(struct notifier_block *nb,
unsigned int i;
int checkcore = (val == MODULE_STATE_GOING);

- if (val == MODULE_STATE_COMING)
- module_load_kprobe_ei_list(mod);
- else if (val == MODULE_STATE_GOING)
- module_unload_kprobe_ei_list(mod);
-
if (val != MODULE_STATE_GOING && val != MODULE_STATE_LIVE)
return NOTIFY_DONE;

@@ -2346,8 +2240,6 @@ static int __init init_kprobes(void)
pr_err("Please take care of using kprobes.\n");
}

- populate_kernel_kprobe_ei_list();
-
if (kretprobe_blacklist_size) {
/* lookup the function address from its name */
for (i = 0; kretprobe_blacklist[i].name != NULL; i++) {
@@ -2515,56 +2407,6 @@ static const struct file_operations debugfs_kprobe_blacklist_ops = {
.release = seq_release,
};

-/*
- * kprobes/error_injection_list -- shows which functions can be overriden for
- * error injection.
- * */
-static void *kprobe_ei_seq_start(struct seq_file *m, loff_t *pos)
-{
- mutex_lock(&kprobe_ei_mutex);
- return seq_list_start(&kprobe_error_injection_list, *pos);
-}
-
-static void kprobe_ei_seq_stop(struct seq_file *m, void *v)
-{
- mutex_unlock(&kprobe_ei_mutex);
-}
-
-static void *kprobe_ei_seq_next(struct seq_file *m, void *v, loff_t *pos)
-{
- return seq_list_next(v, &kprobe_error_injection_list, pos);
-}
-
-static int kprobe_ei_seq_show(struct seq_file *m, void *v)
-{
- char buffer[KSYM_SYMBOL_LEN];
- struct kprobe_ei_entry *ent =
- list_entry(v, struct kprobe_ei_entry, list);
-
- sprint_symbol(buffer, ent->start_addr);
- seq_printf(m, "%s\n", buffer);
- return 0;
-}
-
-static const struct seq_operations kprobe_ei_seq_ops = {
- .start = kprobe_ei_seq_start,
- .next = kprobe_ei_seq_next,
- .stop = kprobe_ei_seq_stop,
- .show = kprobe_ei_seq_show,
-};
-
-static int kprobe_ei_open(struct inode *inode, struct file *filp)
-{
- return seq_open(filp, &kprobe_ei_seq_ops);
-}
-
-static const struct file_operations debugfs_kprobe_ei_ops = {
- .open = kprobe_ei_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = seq_release,
-};
-
static void arm_all_kprobes(void)
{
struct hlist_head *head;
@@ -2706,11 +2548,6 @@ static int __init debugfs_kprobe_init(void)
if (!file)
goto error;

- file = debugfs_create_file("error_injection_list", 0444, dir, NULL,
- &debugfs_kprobe_ei_ops);
- if (!file)
- goto error;
-
return 0;

error:
diff --git a/kernel/module.c b/kernel/module.c
index bd695bfdc5c4..588f86a1f9c3 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3118,10 +3118,10 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(*mod->ftrace_callsites),
&mod->num_ftrace_callsites);
#endif
-#ifdef CONFIG_BPF_KPROBE_OVERRIDE
- mod->kprobe_ei_funcs = section_objs(info, "_kprobe_error_inject_list",
- sizeof(*mod->kprobe_ei_funcs),
- &mod->num_kprobe_ei_funcs);
+#ifdef CONFIG_FUNCTION_ERROR_INJECTION
+ mod->ei_funcs = section_objs(info, "_error_injection_whitelist",
+ sizeof(*mod->ei_funcs),
+ &mod->num_ei_funcs);
#endif
mod->extable = section_objs(info, "__ex_table",
sizeof(*mod->extable), &mod->num_exentries);
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 6400e1bf97c5..a356b8c1f830 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -533,7 +533,7 @@ config FUNCTION_PROFILER
config BPF_KPROBE_OVERRIDE
bool "Enable BPF programs to override a kprobed function"
depends on BPF_EVENTS
- depends on HAVE_KPROBE_OVERRIDE
+ depends on FUNCTION_ERROR_INJECTION
default n
help
Allows BPF to override the execution of a probed function and
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index cefa9b0e396c..460d5a899cef 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -84,7 +84,7 @@ EXPORT_SYMBOL_GPL(trace_call_bpf);
BPF_CALL_2(bpf_override_return, struct pt_regs *, regs, unsigned long, rc)
{
regs_set_return_value(regs, rc);
- arch_ftrace_kprobe_override_function(regs);
+ override_function_with_return(regs);
return 0;
}

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index a7c7035963f2..23f88a062965 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -21,6 +21,7 @@
#include <linux/module.h>
#include <linux/uaccess.h>
#include <linux/rculist.h>
+#include <linux/error-injection.h>

#include "trace_probe.h"

@@ -106,7 +107,7 @@ bool trace_kprobe_error_injectable(struct trace_event_call *call)
} else {
addr = (unsigned long)tk->rp.kp.addr;
}
- return within_kprobe_error_injection_list(addr);
+ return within_error_injection_list(addr);
}

static int register_kprobe_event(struct trace_kprobe *tk);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9d5b78aad4c5..fe88ac0f003c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1500,6 +1500,10 @@ config FAULT_INJECTION
Provide fault-injection framework.
For more details, see Documentation/fault-injection/.

+config FUNCTION_ERROR_INJECTION
+ def_bool y
+ depends on HAVE_FUNCTION_ERROR_INJECTION
+
config FAILSLAB
bool "Fault-injection capability for kmalloc"
depends on FAULT_INJECTION
diff --git a/lib/Makefile b/lib/Makefile
index a6c8529dd9b2..75ec13778cd8 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -149,6 +149,7 @@ obj-$(CONFIG_NETDEV_NOTIFIER_ERROR_INJECT) += netdev-notifier-error-inject.o
obj-$(CONFIG_MEMORY_NOTIFIER_ERROR_INJECT) += memory-notifier-error-inject.o
obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \
of-reconfig-notifier-error-inject.o
+obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o

lib-$(CONFIG_GENERIC_BUG) += bug.o

diff --git a/lib/error-inject.c b/lib/error-inject.c
new file mode 100644
index 000000000000..beafd2c7002c
--- /dev/null
+++ b/lib/error-inject.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0
+// error-inject.c: Function-level error injection table
+#include <linux/error-injection.h>
+#include <linux/debugfs.h>
+#include <linux/kallsyms.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+
+/* Whitelist of symbols that can be overridden for error injection. */
+static LIST_HEAD(error_injection_list);
+static DEFINE_MUTEX(ei_mutex);
+struct ei_entry {
+ struct list_head list;
+ unsigned long start_addr;
+ unsigned long end_addr;
+ void *priv;
+};
+
+bool within_error_injection_list(unsigned long addr)
+{
+ struct ei_entry *ent;
+
+ list_for_each_entry(ent, &error_injection_list, list) {
+ if (addr >= ent->start_addr && addr < ent->end_addr)
+ return true;
+ }
+ return false;
+}
+
+/*
+ * Lookup and populate the error_injection_list.
+ *
+ * For safety reasons we only allow certain functions to be overridden with
+ * bpf_error_injection, so we need to populate the list of the symbols that have
+ * been marked as safe for overriding.
+ */
+static void populate_error_injection_list(unsigned long *start,
+ unsigned long *end, void *priv)
+{
+ unsigned long *iter;
+ struct ei_entry *ent;
+ unsigned long entry, offset = 0, size = 0;
+
+ mutex_lock(&ei_mutex);
+ for (iter = start; iter < end; iter++) {
+ entry = arch_deref_entry_point((void *)*iter);
+
+ if (!kernel_text_address(entry) ||
+ !kallsyms_lookup_size_offset(entry, &size, &offset)) {
+ pr_err("Failed to find error inject entry at %p\n",
+ (void *)entry);
+ continue;
+ }
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent)
+ break;
+ ent->start_addr = entry;
+ ent->end_addr = entry + size;
+ ent->priv = priv;
+ INIT_LIST_HEAD(&ent->list);
+ list_add_tail(&ent->list, &error_injection_list);
+ }
+ mutex_unlock(&ei_mutex);
+}
+
+/* Markers of the _error_injection_whitelist section */
+extern unsigned long __start_error_injection_whitelist[];
+extern unsigned long __stop_error_injection_whitelist[];
+
+static void __init populate_kernel_ei_list(void)
+{
+ populate_error_injection_list(__start_error_injection_whitelist,
+ __stop_error_injection_whitelist,
+ NULL);
+}
+
+static void module_load_ei_list(struct module *mod)
+{
+ if (!mod->num_ei_funcs)
+ return;
+
+ populate_error_injection_list(mod->ei_funcs,
+ mod->ei_funcs + mod->num_ei_funcs, mod);
+}
+
+static void module_unload_ei_list(struct module *mod)
+{
+ struct ei_entry *ent, *n;
+
+ if (!mod->num_ei_funcs)
+ return;
+
+ mutex_lock(&ei_mutex);
+ list_for_each_entry_safe(ent, n, &error_injection_list, list) {
+ if (ent->priv == mod) {
+ list_del_init(&ent->list);
+ kfree(ent);
+ }
+ }
+ mutex_unlock(&ei_mutex);
+}
+
+/* Module notifier call back, checking error injection table on the module */
+static int ei_module_callback(struct notifier_block *nb,
+ unsigned long val, void *data)
+{
+ struct module *mod = data;
+
+ if (val == MODULE_STATE_COMING)
+ module_load_ei_list(mod);
+ else if (val == MODULE_STATE_GOING)
+ module_unload_ei_list(mod);
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block ei_module_nb = {
+ .notifier_call = ei_module_callback,
+ .priority = 0
+};
+
+/*
+ * error_injection/whitelist -- shows which functions can be overridden for
+ * error injection.
+ */
+static void *ei_seq_start(struct seq_file *m, loff_t *pos)
+{
+ mutex_lock(&ei_mutex);
+ return seq_list_start(&error_injection_list, *pos);
+}
+
+static void ei_seq_stop(struct seq_file *m, void *v)
+{
+ mutex_unlock(&ei_mutex);
+}
+
+static void *ei_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ return seq_list_next(v, &error_injection_list, pos);
+}
+
+static int ei_seq_show(struct seq_file *m, void *v)
+{
+ struct ei_entry *ent = list_entry(v, struct ei_entry, list);
+
+ seq_printf(m, "%pf\n", (void *)ent->start_addr);
+ return 0;
+}
+
+static const struct seq_operations ei_seq_ops = {
+ .start = ei_seq_start,
+ .next = ei_seq_next,
+ .stop = ei_seq_stop,
+ .show = ei_seq_show,
+};
+
+static int ei_open(struct inode *inode, struct file *filp)
+{
+ return seq_open(filp, &ei_seq_ops);
+}
+
+static const struct file_operations debugfs_ei_ops = {
+ .open = ei_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
+static int __init ei_debugfs_init(void)
+{
+ struct dentry *dir, *file;
+
+ dir = debugfs_create_dir("error_injection", NULL);
+ if (!dir)
+ return -ENOMEM;
+
+ file = debugfs_create_file("list", 0444, dir, NULL, &debugfs_ei_ops);
+ if (!file) {
+ debugfs_remove(dir);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static int __init init_error_injection(void)
+{
+ populate_kernel_ei_list();
+ if (!register_module_notifier(&ei_module_nb))
+ ei_debugfs_init();
+
+ return 0;
+}
+module_init(init_error_injection);

2017-12-26 07:48:52

by Masami Hiramatsu

Subject: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

Support the in-kernel fault-injection framework via debugfs.
This allows you to inject a conditional error into a specified
function using debugfs interfaces.
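
A hypothetical session might look like this (assuming open_ctree is
on the injectable list, and using the common fault_attr knobs that
fault_create_debugfs_attr() provides):

  # see which functions can be failed
  cat /sys/kernel/debug/fail_function/injectable
  # select the target function
  echo open_ctree > /sys/kernel/debug/fail_function/inject
  # error value the caller will see (-12 == -ENOMEM, on 64-bit)
  echo 0xfffffffffffffff4 > /sys/kernel/debug/fail_function/retval
  # fail every time, without limit
  echo 100 > /sys/kernel/debug/fail_function/probability
  echo -1 > /sys/kernel/debug/fail_function/times
  # now e.g. mounting a btrfs filesystem should fail with ENOMEM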

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Documentation/fault-injection/fault-injection.txt | 5 +
kernel/Makefile | 1
kernel/fail_function.c | 169 +++++++++++++++++++++
lib/Kconfig.debug | 10 +
4 files changed, 185 insertions(+)
create mode 100644 kernel/fail_function.c

diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 918972babcd8..6243a588dd71 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -30,6 +30,11 @@ o fail_mmc_request
injects MMC data errors on devices permitted by setting
debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

+o fail_function
+
+ injects error returns into specific functions by setting debugfs entries
+ under /sys/kernel/debug/fail_function. No boot option is supported.
+
Configure fault-injection capabilities behavior
-----------------------------------------------

diff --git a/kernel/Makefile b/kernel/Makefile
index 172d151d429c..f85ae5dfa474 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -81,6 +81,7 @@ obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
obj-$(CONFIG_GCOV_KERNEL) += gcov/
obj-$(CONFIG_KCOV) += kcov.o
obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_FAIL_FUNCTION) += fail_function.o
obj-$(CONFIG_KGDB) += debug/
obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o
diff --git a/kernel/fail_function.c b/kernel/fail_function.c
new file mode 100644
index 000000000000..203d3582487a
--- /dev/null
+++ b/kernel/fail_function.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fail_function.c: Function-based error injection
+ */
+#include <linux/error-injection.h>
+#include <linux/debugfs.h>
+#include <linux/fault-inject.h>
+#include <linux/kallsyms.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+static int fei_kprobe_handler(struct kprobe *kp, struct pt_regs *regs);
+
+static DEFINE_MUTEX(fei_lock);
+static struct {
+ struct kprobe kp;
+ unsigned long retval;
+ struct fault_attr attr;
+} fei_attr = {
+ .kp = { .pre_handler = fei_kprobe_handler, },
+ .retval = ~0UL, /* This indicates -1 in long/int return value */
+ .attr = FAULT_ATTR_INITIALIZER,
+};
+
+static int fei_kprobe_handler(struct kprobe *kp, struct pt_regs *regs)
+{
+ if (should_fail(&fei_attr.attr, 1)) {
+ regs_set_return_value(regs, fei_attr.retval);
+ override_function_with_return(regs);
+ /* Kprobe specific fixup */
+ reset_current_kprobe();
+ preempt_enable_no_resched();
+ return 1;
+ }
+
+ return 0;
+}
+NOKPROBE_SYMBOL(fei_kprobe_handler);
+
+static void *fei_seq_start(struct seq_file *m, loff_t *pos)
+{
+ mutex_lock(&fei_lock);
+ return *pos == 0 ? (void *)1 : NULL;
+}
+
+static void fei_seq_stop(struct seq_file *m, void *v)
+{
+ mutex_unlock(&fei_lock);
+}
+
+static void *fei_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ return NULL;
+}
+
+static int fei_seq_show(struct seq_file *m, void *v)
+{
+ if (fei_attr.kp.addr)
+ seq_printf(m, "%pf\n", fei_attr.kp.addr);
+ else
+ seq_puts(m, "# not specified\n");
+ return 0;
+}
+
+static const struct seq_operations fei_seq_ops = {
+ .start = fei_seq_start,
+ .next = fei_seq_next,
+ .stop = fei_seq_stop,
+ .show = fei_seq_show,
+};
+
+static int fei_open(struct inode *inode, struct file *file)
+{
+ return seq_open(file, &fei_seq_ops);
+}
+
+static ssize_t fei_write(struct file *file, const char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ unsigned long addr;
+ char *buf, *sym;
+ int ret;
+
+ /* cut off if it is too long */
+ if (count > KSYM_NAME_LEN)
+ count = KSYM_NAME_LEN;
+ buf = kmalloc(sizeof(char) * (count + 1), GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ if (copy_from_user(buf, buffer, count)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ buf[count] = '\0';
+ sym = strstrip(buf);
+
+ if (strlen(sym) == 0 || sym[0] == '0') {
+ if (fei_attr.kp.addr) {
+ unregister_kprobe(&fei_attr.kp);
+ fei_attr.kp.addr = NULL;
+ }
+ ret = count;
+ goto out;
+ }
+
+ addr = kallsyms_lookup_name(sym);
+ if (!addr) {
+ ret = -EINVAL;
+ goto out;
+ }
+ if (!within_error_injection_list(addr)) {
+ ret = -ERANGE;
+ goto out;
+ }
+
+ if (fei_attr.kp.addr) {
+ unregister_kprobe(&fei_attr.kp);
+ fei_attr.kp.addr = NULL;
+ }
+ fei_attr.kp.addr = (void *)addr;
+ ret = register_kprobe(&fei_attr.kp);
+ if (ret < 0)
+ fei_attr.kp.addr = NULL;
+ else
+ ret = count;
+out:
+ kfree(buf);
+ return ret;
+}
+
+static const struct file_operations fei_ops = {
+ .open = fei_open,
+ .read = seq_read,
+ .write = fei_write,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
+static int __init fei_debugfs_init(void)
+{
+ struct dentry *dir;
+
+ dir = fault_create_debugfs_attr("fail_function", NULL,
+ &fei_attr.attr);
+ if (IS_ERR(dir))
+ return PTR_ERR(dir);
+
+ /* the injectable attribute is just a symlink to error_injection/list */
+ if (!debugfs_create_symlink("injectable", dir,
+ "../error_injection/list"))
+ goto error;
+
+ if (!debugfs_create_file("inject", 0600, dir, NULL, &fei_ops))
+ goto error;
+
+ if (!debugfs_create_ulong("retval", 0600, dir, &fei_attr.retval))
+ goto error;
+
+ return 0;
+error:
+ debugfs_remove_recursive(dir);
+ return -ENOMEM;
+}
+
+late_initcall(fei_debugfs_init);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index fe88ac0f003c..15c6609a1bb8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1551,6 +1551,16 @@ config FAIL_FUTEX
help
Provide fault-injection capability for futexes.

+config FAIL_FUNCTION
+ bool "Fault-injection capability for functions"
+ depends on FAULT_INJECTION_DEBUG_FS && FUNCTION_ERROR_INJECTION
+ help
+ Provide function-based fault-injection capability.
+ This will allow you to override a specific function with a return
+ with a given return value. As a result, the function caller will
+ see an error value and have to handle it. This is useful for
+ testing the error handling in various subsystems.
+
config FAULT_INJECTION_DEBUG_FS
bool "Debugfs entries for fault-injection capabilities"
depends on FAULT_INJECTION && SYSFS && DEBUG_FS

2017-12-27 01:57:40

by Alexei Starovoitov

Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
> Check whether an error-injectable event is on function entry or not.
> Currently it checks whether the event is an ftrace-based kprobe,
> but that is wrong. It should check whether the event is on the entry
> of the target function. Since error injection overrides a function
> to just return with a modified return value, that operation must
> be done before the target function starts building its stack frame.
>
> As a side effect, bpf error injection no longer needs to depend on
> the function tracer. It can work with software-breakpoint based
> kprobe events too.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> kernel/trace/Kconfig | 2 --
> kernel/trace/bpf_trace.c | 6 +++---
> kernel/trace/trace_kprobe.c | 8 +++++---
> kernel/trace/trace_probe.h | 12 ++++++------
> 4 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index ae3a2d519e50..6400e1bf97c5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
> config BPF_KPROBE_OVERRIDE
> bool "Enable BPF programs to override a kprobed function"
> depends on BPF_EVENTS
> - depends on KPROBES_ON_FTRACE
> depends on HAVE_KPROBE_OVERRIDE
> - depends on DYNAMIC_FTRACE_WITH_REGS
> default n
> help
> Allows BPF to override the execution of a probed function and
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index f6d2327ecb59..d663660f8392 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> int ret = -EEXIST;
>
> /*
> - * Kprobe override only works for ftrace based kprobes, and only if they
> - * are on the opt-in list.
> + * Kprobe override only works if they are on the function entry,
> + * and only if they are on the opt-in list.
> */
> if (prog->kprobe_override &&
> - (!trace_kprobe_ftrace(event->tp_event) ||
> + (!trace_kprobe_on_func_entry(event->tp_event) ||
> !trace_kprobe_error_injectable(event->tp_event)))
> return -EINVAL;
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 91f4b57dab82..265e3e27e8dc 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
> return nhit;
> }
>
> -int trace_kprobe_ftrace(struct trace_event_call *call)
> +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> {
> struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> - return kprobe_ftrace(&tk->rp.kp);
> +
> + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
> + tk->rp.kp.offset);

That would be nice, but did you test this?
My understanding is that a kprobe will restore all regs, and
here we need to override the return ip _and_ the return value.
Could you add a patch with a test the way Josef did,
or describe the steps to test this new mode?

2017-12-27 02:00:53

by Alexei Starovoitov

Subject: Re: [RFC PATCH bpf-next v2 2/4] tracing/kprobe: bpf: Compare instruction pointer with original one

On Tue, Dec 26, 2017 at 04:47:26PM +0900, Masami Hiramatsu wrote:
> Compare the instruction pointer with the original one saved on
> entry, instead of using the per-cpu bpf_kprobe_override flag.
>
> This patch also consolidates the reset_current_kprobe() and
> preempt_enable_no_resched() blocks; those can be done
> in one place.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> kernel/trace/bpf_trace.c | 1 -
> kernel/trace/trace_kprobe.c | 21 +++++++--------------
> 2 files changed, 7 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index d663660f8392..cefa9b0e396c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -83,7 +83,6 @@ EXPORT_SYMBOL_GPL(trace_call_bpf);
> #ifdef CONFIG_BPF_KPROBE_OVERRIDE
> BPF_CALL_2(bpf_override_return, struct pt_regs *, regs, unsigned long, rc)
> {
> - __this_cpu_write(bpf_kprobe_override, 1);
> regs_set_return_value(regs, rc);
> arch_ftrace_kprobe_override_function(regs);
> return 0;
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 265e3e27e8dc..a7c7035963f2 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -42,8 +42,6 @@ struct trace_kprobe {
> (offsetof(struct trace_kprobe, tp.args) + \
> (sizeof(struct probe_arg) * (n)))
>
> -DEFINE_PER_CPU(int, bpf_kprobe_override);
> -
> static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
> {
> return tk->rp.handler != NULL;
> @@ -1204,6 +1202,7 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
> int rctx;
>
> if (bpf_prog_array_valid(call)) {
> + unsigned long orig_ip = instruction_pointer(regs);
> int ret;
>
> ret = trace_call_bpf(call, regs);
> @@ -1211,12 +1210,13 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
> /*
> * We need to check and see if we modified the pc of the
> * pt_regs, and if so clear the kprobe and return 1 so that we
> - * don't do the instruction skipping. Also reset our state so
> - * we are clean the next pass through.
> + * don't do the single stepping.
> + * The ftrace kprobe handler leaves it up to us to re-enable
> + * preemption here before returning if we've modified the ip.
> */
> - if (__this_cpu_read(bpf_kprobe_override)) {
> - __this_cpu_write(bpf_kprobe_override, 0);
> + if (orig_ip != instruction_pointer(regs)) {
> reset_current_kprobe();
> + preempt_enable_no_resched();

This is a great idea.
Acked-by: Alexei Starovoitov <[email protected]>

2017-12-27 02:07:11

by Alexei Starovoitov

Subject: Re: [RFC PATCH bpf-next v2 3/4] error-injection: Separate error-injection from kprobe

On Tue, Dec 26, 2017 at 04:47:55PM +0900, Masami Hiramatsu wrote:
> The error-injection framework is not limited to use by kprobes
> or bpf; other kernel subsystems can use it freely for checking
> the safety of error injection, e.g. livepatch, ftrace, etc.
> So this separates the error-injection framework from kprobes.
>
> Some differences have been made:
>
> - "kprobe" word is removed from any APIs/structures.
> - BPF_ALLOW_ERROR_INJECTION() is renamed to
> ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
> - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
> feature. It is automatically enabled if the arch supports
> error injection feature for kprobe or ftrace etc.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> Changes in v2:
> - Fix the override function name to override_function_with_return()
> - Show only the function name in the list; users don't have to care
> about its size, since function override only happens at the entry.

Looks like a nice cleanup.
Acked-by: Alexei Starovoitov <[email protected]>

2017-12-27 02:13:02

by Alexei Starovoitov

Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
> Support the in-kernel fault-injection framework via debugfs.
> This allows you to inject a conditional error into a specified
> function using debugfs interfaces.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> Documentation/fault-injection/fault-injection.txt | 5 +
> kernel/Makefile | 1
> kernel/fail_function.c | 169 +++++++++++++++++++++
> lib/Kconfig.debug | 10 +
> 4 files changed, 185 insertions(+)
> create mode 100644 kernel/fail_function.c
>
> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> index 918972babcd8..6243a588dd71 100644
> --- a/Documentation/fault-injection/fault-injection.txt
> +++ b/Documentation/fault-injection/fault-injection.txt
> @@ -30,6 +30,11 @@ o fail_mmc_request
> injects MMC data errors on devices permitted by setting
> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
>
> +o fail_function
> +
> + injects error returns into specific functions by setting debugfs entries
> + under /sys/kernel/debug/fail_function. No boot option is supported.

I like it.
Could you document it a bit better?
In particular retval is configurable, but without an example no one
will be able to figure out how to use it.

I think you can drop RFC tag from the next version of these patches.
Thanks!

2017-12-27 05:56:34

by Masami Hiramatsu

Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Tue, 26 Dec 2017 17:57:32 -0800
Alexei Starovoitov <[email protected]> wrote:

> On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
> > Check whether an error-injectable event is on function entry or not.
> > Currently it checks whether the event is an ftrace-based kprobe,
> > but that is wrong. It should check whether the event is on the entry
> > of the target function. Since error injection overrides a function
> > to just return with a modified return value, that operation must
> > be done before the target function starts building its stack frame.
> >
> > As a side effect, bpf error injection no longer needs to depend on
> > the function tracer. It can work with software-breakpoint based
> > kprobe events too.
> >
> > Signed-off-by: Masami Hiramatsu <[email protected]>
> > ---
> > kernel/trace/Kconfig | 2 --
> > kernel/trace/bpf_trace.c | 6 +++---
> > kernel/trace/trace_kprobe.c | 8 +++++---
> > kernel/trace/trace_probe.h | 12 ++++++------
> > 4 files changed, 14 insertions(+), 14 deletions(-)
> >
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index ae3a2d519e50..6400e1bf97c5 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
> > config BPF_KPROBE_OVERRIDE
> > bool "Enable BPF programs to override a kprobed function"
> > depends on BPF_EVENTS
> > - depends on KPROBES_ON_FTRACE
> > depends on HAVE_KPROBE_OVERRIDE
> > - depends on DYNAMIC_FTRACE_WITH_REGS
> > default n
> > help
> > Allows BPF to override the execution of a probed function and
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index f6d2327ecb59..d663660f8392 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> > int ret = -EEXIST;
> >
> > /*
> > - * Kprobe override only works for ftrace based kprobes, and only if they
> > - * are on the opt-in list.
> > + * Kprobe override only works if they are on the function entry,
> > + * and only if they are on the opt-in list.
> > */
> > if (prog->kprobe_override &&
> > - (!trace_kprobe_ftrace(event->tp_event) ||
> > + (!trace_kprobe_on_func_entry(event->tp_event) ||
> > !trace_kprobe_error_injectable(event->tp_event)))
> > return -EINVAL;
> >
> > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > index 91f4b57dab82..265e3e27e8dc 100644
> > --- a/kernel/trace/trace_kprobe.c
> > +++ b/kernel/trace/trace_kprobe.c
> > @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
> > return nhit;
> > }
> >
> > -int trace_kprobe_ftrace(struct trace_event_call *call)
> > +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> > {
> > struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> > - return kprobe_ftrace(&tk->rp.kp);
> > +
> > + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
> > + tk->rp.kp.offset);
>
> That would be nice, but did you test this?

Yes, because jprobes, which were the only official user of modifying the
execution path using kprobes, did the same check. (kretprobes also do it.)

> My understanding is that a kprobe will restore all regs, and
> here we need to override the return ip _and_ the return value.

Yes, no problem. kprobes restore all regs from pt_regs, including regs->ip.

> Could you add a patch with a test the way Josef did,
> or describe the steps to test this new mode?

Do you mean the patch below? If so, it should work without any change.

[PATCH v10 4/5] samples/bpf: add a test for bpf_override_return

Thank you,

--
Masami Hiramatsu <[email protected]>

2017-12-27 08:09:17

by Masami Hiramatsu

Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On Tue, 26 Dec 2017 18:12:56 -0800
Alexei Starovoitov <[email protected]> wrote:

> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
> > Support the in-kernel fault-injection framework via debugfs.
> > This allows you to inject a conditional error into a specified
> > function using debugfs interfaces.
> >
> > Signed-off-by: Masami Hiramatsu <[email protected]>
> > ---
> > Documentation/fault-injection/fault-injection.txt | 5 +
> > kernel/Makefile | 1
> > kernel/fail_function.c | 169 +++++++++++++++++++++
> > lib/Kconfig.debug | 10 +
> > 4 files changed, 185 insertions(+)
> > create mode 100644 kernel/fail_function.c
> >
> > diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> > index 918972babcd8..6243a588dd71 100644
> > --- a/Documentation/fault-injection/fault-injection.txt
> > +++ b/Documentation/fault-injection/fault-injection.txt
> > @@ -30,6 +30,11 @@ o fail_mmc_request
> > injects MMC data errors on devices permitted by setting
> > debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
> >
> > +o fail_function
> > +
> > + injects error returns into specific functions by setting debugfs entries
> > + under /sys/kernel/debug/fail_function. No boot option is supported.
>
> I like it.
> Could you document it a bit better?

Yes, I will do in next series.

> In particular retval is configurable, but without an example no one
> will be able to figure out how to use it.

Ah, right. BTW, as I pointed out in the cover mail, should we store the
expected error value range in the injectable list? e.g.

ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)

And provide APIs to check/get it.

const struct error_range *ei_get_error_range(unsigned long addr);
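
(where error_range might, hypothetically, be something like:)

struct error_range {
	long min_err;	/* e.g. -MAX_ERRNO */
	long max_err;	/* e.g. -1 */
};

/* returning NULL if addr is not on the injectable list */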


>
> I think you can drop RFC tag from the next version of these patches.
> Thanks!

Thank you, I'll fix some errors that came from the configurations, and resend it.


Thanks!


--
Masami Hiramatsu <[email protected]>

2017-12-27 22:47:04

by Alexei Starovoitov

Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On 12/26/17 9:56 PM, Masami Hiramatsu wrote:
> On Tue, 26 Dec 2017 17:57:32 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
>>> Check whether an error-injectable event is on function entry or not.
>>> Currently it checks whether the event is an ftrace-based kprobe,
>>> but that is wrong. It should check whether the event is on the entry
>>> of the target function. Since error injection overrides a function
>>> to just return with a modified return value, that operation must
>>> be done before the target function starts building its stack frame.
>>>
>>> As a side effect, bpf error injection no longer needs to depend on
>>> the function tracer. It can work with software-breakpoint based
>>> kprobe events too.
>>>
>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>> ---
>>> kernel/trace/Kconfig | 2 --
>>> kernel/trace/bpf_trace.c | 6 +++---
>>> kernel/trace/trace_kprobe.c | 8 +++++---
>>> kernel/trace/trace_probe.h | 12 ++++++------
>>> 4 files changed, 14 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>>> index ae3a2d519e50..6400e1bf97c5 100644
>>> --- a/kernel/trace/Kconfig
>>> +++ b/kernel/trace/Kconfig
>>> @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
>>> config BPF_KPROBE_OVERRIDE
>>> bool "Enable BPF programs to override a kprobed function"
>>> depends on BPF_EVENTS
>>> - depends on KPROBES_ON_FTRACE
>>> depends on HAVE_KPROBE_OVERRIDE
>>> - depends on DYNAMIC_FTRACE_WITH_REGS
>>> default n
>>> help
>>> Allows BPF to override the execution of a probed function and
>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>> index f6d2327ecb59..d663660f8392 100644
>>> --- a/kernel/trace/bpf_trace.c
>>> +++ b/kernel/trace/bpf_trace.c
>>> @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
>>> int ret = -EEXIST;
>>>
>>> /*
>>> - * Kprobe override only works for ftrace based kprobes, and only if they
>>> - * are on the opt-in list.
>>> + * Kprobe override only works if they are on the function entry,
>>> + * and only if they are on the opt-in list.
>>> */
>>> if (prog->kprobe_override &&
>>> - (!trace_kprobe_ftrace(event->tp_event) ||
>>> + (!trace_kprobe_on_func_entry(event->tp_event) ||
>>> !trace_kprobe_error_injectable(event->tp_event)))
>>> return -EINVAL;
>>>
>>> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
>>> index 91f4b57dab82..265e3e27e8dc 100644
>>> --- a/kernel/trace/trace_kprobe.c
>>> +++ b/kernel/trace/trace_kprobe.c
>>> @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
>>> return nhit;
>>> }
>>>
>>> -int trace_kprobe_ftrace(struct trace_event_call *call)
>>> +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
>>> {
>>> struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
>>> - return kprobe_ftrace(&tk->rp.kp);
>>> +
>>> + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
>>> + tk->rp.kp.offset);
>>
>> That would be nice, but did you test this?
>
> Yes, because the jprobe, which was only official user of modifying execution
> path using kprobe, did same way to check. (and kretprobe also does it)
>
>> My understanding that kprobe will restore all regs and
>> here we need to override return ip _and_ value.
>
> yes, no problem. kprobe restore all regs from pt_regs, including regs->ip.
>
>> Could you add a patch with the test the way Josef did
>> or describe the steps to test this new mode?
>
> Would you mean below patch? If so, it should work without any change.
>
> [PATCH v10 4/5] samples/bpf: add a test for bpf_override_return

yeah. I expect the bpf_override_return test to work as-is.
I'm asking for a test of the new functionality added by this patch,
in particular a kprobe on func entry without ftrace.
How did you test it?
And how can I repeat the test?
I'm still not sure that it works correctly.

2017-12-27 22:50:28

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On 12/27/17 12:09 AM, Masami Hiramatsu wrote:
> On Tue, 26 Dec 2017 18:12:56 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
>>> Support in-kernel fault-injection framework via debugfs.
>>> This allows you to inject a conditional error to specified
>>> function using debugfs interfaces.
>>>
>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>> ---
>>> Documentation/fault-injection/fault-injection.txt | 5 +
>>> kernel/Makefile | 1
>>> kernel/fail_function.c | 169 +++++++++++++++++++++
>>> lib/Kconfig.debug | 10 +
>>> 4 files changed, 185 insertions(+)
>>> create mode 100644 kernel/fail_function.c
>>>
>>> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
>>> index 918972babcd8..6243a588dd71 100644
>>> --- a/Documentation/fault-injection/fault-injection.txt
>>> +++ b/Documentation/fault-injection/fault-injection.txt
>>> @@ -30,6 +30,11 @@ o fail_mmc_request
>>> injects MMC data errors on devices permitted by setting
>>> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
>>>
>>> +o fail_function
>>> +
>>> + injects error return on specific functions by setting debugfs entries
>>> + under /sys/kernel/debug/fail_function. No boot option supported.
>>
>> I like it.
>> Could you document it a bit better?
>
> Yes, I will do in next series.
>
>> In particular retval is configurable, but without an example no one
>> will be able to figure out how to use it.
>
> Ah, right. BTW, as I pointed in the covermail, should we store the
> expected error value range into the injectable list? e.g.
>
> ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)
>
> And provide APIs to check/get it.

I'm afraid such a check would be too costly.
Right now we have only two functions marked, but I expect hundreds more
will be added in the near future as soon as developers realize the
potential of such error injection.
All ALLOW_ERROR_INJECTION marks add 8 bytes of overhead each to .data.
Multiply by 1k and we have 8k of data spent on marks.
If we add max/min range marks, that doubles it for very little use.
I think marking function only is enough.
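
(For reference, the 8 bytes per mark come from each annotation emitting
one pointer-sized entry into a dedicated section; roughly like the
sketch below, which is illustrative and not the exact header from this
series:)

	/* Illustrative only: each use records one 8-byte function
	 * address in a special section that gets collected into the
	 * injectable-function table. */
	#define ALLOW_ERROR_INJECTION(fname)				\
		static unsigned long __used				\
		__attribute__((__section__("_error_injection_whitelist"))) \
		_eil_addr_##fname = (unsigned long)fname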

2017-12-28 01:38:55

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On Wed, 27 Dec 2017 14:49:46 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/27/17 12:09 AM, Masami Hiramatsu wrote:
> > On Tue, 26 Dec 2017 18:12:56 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
> >>> Support in-kernel fault-injection framework via debugfs.
> >>> This allows you to inject a conditional error to specified
> >>> function using debugfs interfaces.
> >>>
> >>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>> ---
> >>> Documentation/fault-injection/fault-injection.txt | 5 +
> >>> kernel/Makefile | 1
> >>> kernel/fail_function.c | 169 +++++++++++++++++++++
> >>> lib/Kconfig.debug | 10 +
> >>> 4 files changed, 185 insertions(+)
> >>> create mode 100644 kernel/fail_function.c
> >>>
> >>> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> >>> index 918972babcd8..6243a588dd71 100644
> >>> --- a/Documentation/fault-injection/fault-injection.txt
> >>> +++ b/Documentation/fault-injection/fault-injection.txt
> >>> @@ -30,6 +30,11 @@ o fail_mmc_request
> >>> injects MMC data errors on devices permitted by setting
> >>> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
> >>>
> >>> +o fail_function
> >>> +
> >>> + injects error return on specific functions by setting debugfs entries
> >>> + under /sys/kernel/debug/fail_function. No boot option supported.
> >>
> >> I like it.
> >> Could you document it a bit better?
> >
> > Yes, I will do in next series.
> >
> >> In particular retval is configurable, but without an example no one
> >> will be able to figure out how to use it.
> >
> > Ah, right. BTW, as I pointed in the covermail, should we store the
> > expected error value range into the injectable list? e.g.
> >
> > ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)
> >
> > And provide APIs to check/get it.
>
> I'm afraid such check would be too costly.
> Right now we have only two functions marked but I expect hundreds more
> will be added in the near future as soon as developers realize the
> potential of such error injection.
> All of ALLOW_ERROR_INJECTION marks add 8 byte overhead each to .data.
> Multiple by 1k and we have 8k of data spent on marks.
> If we add max/min range marks that doubles it for very little use.
> I think marking function only is enough.

Sorry, I don't think so.
Even if it takes 16 bytes more for each point, I don't think that is
any real overhead for machines these days. Even if it were, we could
provide a kconfig option to reduce it.
I mean, we are living in the GB-order memory era, and it will only get
bigger in the future. Why do we have to worry about hundreds of 16-byte
memory pieces? It will take a few KB, and even if we mark thousands of
functions, it never reaches 1MB in a GB memory pool. :)

Of course, for many small-footprint embedded devices (say, with less
than 128MB of memory), this feature can be an overhead. But they can
cut the table out via kconfig.

Thank you,

--
Masami Hiramatsu <[email protected]>

2017-12-28 02:34:40

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Wed, 27 Dec 2017 14:46:24 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/26/17 9:56 PM, Masami Hiramatsu wrote:
> > On Tue, 26 Dec 2017 17:57:32 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
> >>> Check whether error injectable event is on function entry or not.
> >>> Currently it checks the event is ftrace-based kprobes or not,
> >>> but that is wrong. It should check if the event is on the entry
> >>> of target function. Since error injection will override a function
> >>> to just return with modified return value, that operation must
> >>> be done before the target function starts making stackframe.
> >>>
> >>> As a side effect, bpf error injection is no need to depend on
> >>> function-tracer. It can work with sw-breakpoint based kprobe
> >>> events too.
> >>>
> >>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>> ---
> >>> kernel/trace/Kconfig | 2 --
> >>> kernel/trace/bpf_trace.c | 6 +++---
> >>> kernel/trace/trace_kprobe.c | 8 +++++---
> >>> kernel/trace/trace_probe.h | 12 ++++++------
> >>> 4 files changed, 14 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> >>> index ae3a2d519e50..6400e1bf97c5 100644
> >>> --- a/kernel/trace/Kconfig
> >>> +++ b/kernel/trace/Kconfig
> >>> @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
> >>> config BPF_KPROBE_OVERRIDE
> >>> bool "Enable BPF programs to override a kprobed function"
> >>> depends on BPF_EVENTS
> >>> - depends on KPROBES_ON_FTRACE
> >>> depends on HAVE_KPROBE_OVERRIDE
> >>> - depends on DYNAMIC_FTRACE_WITH_REGS
> >>> default n
> >>> help
> >>> Allows BPF to override the execution of a probed function and
> >>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> >>> index f6d2327ecb59..d663660f8392 100644
> >>> --- a/kernel/trace/bpf_trace.c
> >>> +++ b/kernel/trace/bpf_trace.c
> >>> @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> >>> int ret = -EEXIST;
> >>>
> >>> /*
> >>> - * Kprobe override only works for ftrace based kprobes, and only if they
> >>> - * are on the opt-in list.
> >>> + * Kprobe override only works if they are on the function entry,
> >>> + * and only if they are on the opt-in list.
> >>> */
> >>> if (prog->kprobe_override &&
> >>> - (!trace_kprobe_ftrace(event->tp_event) ||
> >>> + (!trace_kprobe_on_func_entry(event->tp_event) ||
> >>> !trace_kprobe_error_injectable(event->tp_event)))
> >>> return -EINVAL;
> >>>
> >>> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> >>> index 91f4b57dab82..265e3e27e8dc 100644
> >>> --- a/kernel/trace/trace_kprobe.c
> >>> +++ b/kernel/trace/trace_kprobe.c
> >>> @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
> >>> return nhit;
> >>> }
> >>>
> >>> -int trace_kprobe_ftrace(struct trace_event_call *call)
> >>> +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> >>> {
> >>> struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> >>> - return kprobe_ftrace(&tk->rp.kp);
> >>> +
> >>> + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
> >>> + tk->rp.kp.offset);
> >>
> >> That would be nice, but did you test this?
> >
> > Yes, because the jprobe, which was only official user of modifying execution
> > path using kprobe, did same way to check. (and kretprobe also does it)
> >
> >> My understanding that kprobe will restore all regs and
> >> here we need to override return ip _and_ value.
> >
> > yes, no problem. kprobe restore all regs from pt_regs, including regs->ip.
> >
> >> Could you add a patch with the test the way Josef did
> >> or describe the steps to test this new mode?
> >
> > Would you mean below patch? If so, it should work without any change.
> >
> > [PATCH v10 4/5] samples/bpf: add a test for bpf_override_return
>
> yeah. I expect bpf_override_return test to work as-is.
> I'm asking for the test for new functionality added by this patch.
> In particular kprobe on func entry without ftrace.
> How did you test it?

This function is used by kretprobe and jprobe. Jprobe was the only
official user of "modifying the instruction pointer to another function"
in kprobes. If it doesn't work, jprobe doesn't work either, which means
you cannot modify the IP via kprobes anymore.
Anyway, until linux-4.13 that was well tested by the kprobe smoke test.

> and how I can repeat the test?
> I'm still not sure that it works correctly.

It works correctly because it checks, using kallsyms, that the given
address is on the entry point (the 1st instruction) of a function.
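
For example, the idea is roughly this (a sketch of the check only, not
the actual kprobe code):

	/* Sketch: an address qualifies iff kallsyms says it sits at
	 * offset 0 inside its function, i.e. on the 1st instruction. */
	static bool addr_on_func_entry(unsigned long addr)
	{
		unsigned long offset;

		if (!kallsyms_lookup_size_offset(addr, NULL, &offset))
			return false;	/* not inside any known symbol */
		return offset == 0;
	}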

The reason I made a separate flag for ftrace is that there are 2 modes
of ftrace dynamic instrumentation, fentry and mcount.
With the new fentry mode, ftrace is put on the first instruction
of the function, so it will work as you expected.
With traditional gcc mcount, ftrace is called after the call frame for
_mcount() has been made. This means that if you modify the ip there, it
will not work, or will cause trouble, because the _mcount call frame is
still on the stack.

So the current ftrace-based check doesn't always work; it depends on
the case. Of course, in most cases the kernel will be built with a new
gcc which supports fentry, but there is no guarantee.

Please follow what jprobe did if you want to change the invoked function
using kprobes. That has been well reviewed and discussed for more than
10 years.

Thank you,

--
Masami Hiramatsu <[email protected]>

2017-12-28 03:46:28

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On 12/27/17 6:34 PM, Masami Hiramatsu wrote:
> On Wed, 27 Dec 2017 14:46:24 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> On 12/26/17 9:56 PM, Masami Hiramatsu wrote:
>>> On Tue, 26 Dec 2017 17:57:32 -0800
>>> Alexei Starovoitov <[email protected]> wrote:
>>>
>>>> On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
>>>>> Check whether error injectable event is on function entry or not.
>>>>> Currently it checks the event is ftrace-based kprobes or not,
>>>>> but that is wrong. It should check if the event is on the entry
>>>>> of target function. Since error injection will override a function
>>>>> to just return with modified return value, that operation must
>>>>> be done before the target function starts making stackframe.
>>>>>
>>>>> As a side effect, bpf error injection is no need to depend on
>>>>> function-tracer. It can work with sw-breakpoint based kprobe
>>>>> events too.
>>>>>
>>>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>>>> ---
>>>>> kernel/trace/Kconfig | 2 --
>>>>> kernel/trace/bpf_trace.c | 6 +++---
>>>>> kernel/trace/trace_kprobe.c | 8 +++++---
>>>>> kernel/trace/trace_probe.h | 12 ++++++------
>>>>> 4 files changed, 14 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>>>>> index ae3a2d519e50..6400e1bf97c5 100644
>>>>> --- a/kernel/trace/Kconfig
>>>>> +++ b/kernel/trace/Kconfig
>>>>> @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
>>>>> config BPF_KPROBE_OVERRIDE
>>>>> bool "Enable BPF programs to override a kprobed function"
>>>>> depends on BPF_EVENTS
>>>>> - depends on KPROBES_ON_FTRACE
>>>>> depends on HAVE_KPROBE_OVERRIDE
>>>>> - depends on DYNAMIC_FTRACE_WITH_REGS
>>>>> default n
>>>>> help
>>>>> Allows BPF to override the execution of a probed function and
>>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>>>> index f6d2327ecb59..d663660f8392 100644
>>>>> --- a/kernel/trace/bpf_trace.c
>>>>> +++ b/kernel/trace/bpf_trace.c
>>>>> @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
>>>>> int ret = -EEXIST;
>>>>>
>>>>> /*
>>>>> - * Kprobe override only works for ftrace based kprobes, and only if they
>>>>> - * are on the opt-in list.
>>>>> + * Kprobe override only works if they are on the function entry,
>>>>> + * and only if they are on the opt-in list.
>>>>> */
>>>>> if (prog->kprobe_override &&
>>>>> - (!trace_kprobe_ftrace(event->tp_event) ||
>>>>> + (!trace_kprobe_on_func_entry(event->tp_event) ||
>>>>> !trace_kprobe_error_injectable(event->tp_event)))
>>>>> return -EINVAL;
>>>>>
>>>>> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
>>>>> index 91f4b57dab82..265e3e27e8dc 100644
>>>>> --- a/kernel/trace/trace_kprobe.c
>>>>> +++ b/kernel/trace/trace_kprobe.c
>>>>> @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
>>>>> return nhit;
>>>>> }
>>>>>
>>>>> -int trace_kprobe_ftrace(struct trace_event_call *call)
>>>>> +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
>>>>> {
>>>>> struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
>>>>> - return kprobe_ftrace(&tk->rp.kp);
>>>>> +
>>>>> + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
>>>>> + tk->rp.kp.offset);
>>>>
>>>> That would be nice, but did you test this?
>>>
>>> Yes, because the jprobe, which was only official user of modifying execution
>>> path using kprobe, did same way to check. (and kretprobe also does it)
>>>
>>>> My understanding that kprobe will restore all regs and
>>>> here we need to override return ip _and_ value.
>>>
>>> yes, no problem. kprobe restore all regs from pt_regs, including regs->ip.
>>>
>>>> Could you add a patch with the test the way Josef did
>>>> or describe the steps to test this new mode?
>>>
>>> Would you mean below patch? If so, it should work without any change.
>>>
>>> [PATCH v10 4/5] samples/bpf: add a test for bpf_override_return
>>
>> yeah. I expect bpf_override_return test to work as-is.
>> I'm asking for the test for new functionality added by this patch.
>> In particular kprobe on func entry without ftrace.
>> How did you test it?
>
> This function is used in kretprobe and jprobe. Jprobe was the user of
> "modifying instruction pointer to another function" in kprobes.
> If it doesn't work, jprobe also doesn't work, this means you can not
> modify IP by kprobes anymore.
> Anyway, until linux-4.13, that was well tested by kprobe smoke test.
>
>> and how I can repeat the test?
>> I'm still not sure that it works correctly.
>
> That works correctly because it checks given address is on the entry
> point (the 1st instruction) of a function, using kallsyms.
>
> The reason why I made another flag for ftrace was, there are 2 modes
> for ftrace dynamic instrumentation, fentry and mcount.
> With new fentry mode, ftrace will be put on the first instruction
> of the function, so it will work as you expected.
> With traditional gcc mcount, ftrace will be called after making call
> frame for _mcount(). This means if you modify ip, it will not work
> or cause a trouble because _mcount call frame is still on the stack.
>
> So, current ftrace-based checker doesn't work, it depends on the case.
> Of course, in most case, kernel will be build in new gcc which
> supports fentry, but there is no guarantee.

I don't think that's the case. My reading of the current
trace_kprobe_ftrace() -> arch_check_ftrace_location()
is that it will not be true for the old mcount case.

As for the rest of your arguments, it very much puzzles me that
you claim this patch is supposed to work based on historical
reasoning whereas you did NOT test it.

2017-12-28 03:50:08

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On 12/27/17 5:38 PM, Masami Hiramatsu wrote:
> On Wed, 27 Dec 2017 14:49:46 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> On 12/27/17 12:09 AM, Masami Hiramatsu wrote:
>>> On Tue, 26 Dec 2017 18:12:56 -0800
>>> Alexei Starovoitov <[email protected]> wrote:
>>>
>>>> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
>>>>> Support in-kernel fault-injection framework via debugfs.
>>>>> This allows you to inject a conditional error to specified
>>>>> function using debugfs interfaces.
>>>>>
>>>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>>>> ---
>>>>> Documentation/fault-injection/fault-injection.txt | 5 +
>>>>> kernel/Makefile | 1
>>>>> kernel/fail_function.c | 169 +++++++++++++++++++++
>>>>> lib/Kconfig.debug | 10 +
>>>>> 4 files changed, 185 insertions(+)
>>>>> create mode 100644 kernel/fail_function.c
>>>>>
>>>>> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
>>>>> index 918972babcd8..6243a588dd71 100644
>>>>> --- a/Documentation/fault-injection/fault-injection.txt
>>>>> +++ b/Documentation/fault-injection/fault-injection.txt
>>>>> @@ -30,6 +30,11 @@ o fail_mmc_request
>>>>> injects MMC data errors on devices permitted by setting
>>>>> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
>>>>>
>>>>> +o fail_function
>>>>> +
>>>>> + injects error return on specific functions by setting debugfs entries
>>>>> + under /sys/kernel/debug/fail_function. No boot option supported.
>>>>
>>>> I like it.
>>>> Could you document it a bit better?
>>>
>>> Yes, I will do in next series.
>>>
>>>> In particular retval is configurable, but without an example no one
>>>> will be able to figure out how to use it.
>>>
>>> Ah, right. BTW, as I pointed in the covermail, should we store the
>>> expected error value range into the injectable list? e.g.
>>>
>>> ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)
>>>
>>> And provide APIs to check/get it.
>>
>> I'm afraid such check would be too costly.
>> Right now we have only two functions marked but I expect hundreds more
>> will be added in the near future as soon as developers realize the
>> potential of such error injection.
>> All of ALLOW_ERROR_INJECTION marks add 8 byte overhead each to .data.
>> Multiple by 1k and we have 8k of data spent on marks.
>> If we add max/min range marks that doubles it for very little use.
>> I think marking function only is enough.
>
> Sorry, I don't think so.
> Even if it takes 16 bytes more for each points, I don't think it is
> any overhead for machines in these days. Even if so, we can provide
> a kconfig to reduce it.
> I mean, we are living in GB-order memory are, and it will be bigger
> in the future. Why we have to worry about hundreds of 16bytes memory
> pieces? It will take a few KB, and even if we mark thousands of
> functions, it never reaches 1MB, in GB memory pool. :)
>
> Of course, for many small-footprint embedded devices (like having
> less than 128MB memory), this feature can be a overhead. But they
> can cut off the table by kconfig.

I still disagree on wasting 16-byte * num_of_funcs of .data here.
The trade-off of usability vs memory just not worth it. Sorry.
Please focus on testing your changes instead.


2017-12-28 04:16:50

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Wed, 27 Dec 2017 19:45:42 -0800
Alexei Starovoitov <[email protected]> wrote:

> I don't think that's the case. My reading of current
> trace_kprobe_ftrace() -> arch_check_ftrace_location()
> is that it will not be true for old mcount case.

In the old mcount case, you can't use ftrace to return without calling
the function. That is, no modification of the return ip, unless you
created a trampoline that could handle arbitrary stack frames, and
removed them from the stack before returning to the function.

>
> As far as the rest of your arguments it very much puzzles me that
> you claim that this patch suppose to work based on historical
> reasoning whereas you did NOT test it.

I believe that Masami is saying that the modification of the IP from
kprobes has been very well tested. But I'm guessing that you still want
a test case for using kprobes in this particular instance. It's not the
implementation of modifying the IP that you are worried about, but the
implementation of BPF using it in this case. Right?

-- Steve

2017-12-28 04:32:50

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On 12/27/17 8:16 PM, Steven Rostedt wrote:
> On Wed, 27 Dec 2017 19:45:42 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> I don't think that's the case. My reading of current
>> trace_kprobe_ftrace() -> arch_check_ftrace_location()
>> is that it will not be true for old mcount case.
>
> In the old mcount case, you can't use ftrace to return without calling
> the function. That is, no modification of the return ip, unless you
> created a trampoline that could handle arbitrary stack frames, and
> remove them from the stack before returning back to the function.

correct. I was saying that trace_kprobe_ftrace() won't let us do
bpf_override_return with the old mcount.

>>
>> As far as the rest of your arguments it very much puzzles me that
>> you claim that this patch suppose to work based on historical
>> reasoning whereas you did NOT test it.
>
> I believe that Masami is saying that the modification of the IP from
> kprobes has been very well tested. But I'm guessing that you still want
> a test case for using kprobes in this particular instance. It's not the
> implementation of modifying the IP that you are worried about, but the
> implementation of BPF using it in this case. Right?

exactly. No doubt the old code works.
But that doesn't mean that bpf_override_return() will continue to
work with kprobes that are not ftrace based.
I suspect Josef's existing test case will cover this situation.
Probably only a special .config is needed to disable ftrace, so that
the "kprobe on entry but not ftrace" check will kick in.
But I didn't get the impression that this situation was tested.
Instead I see only logical reasoning that it's _supposed_ to work.
That's not enough.

2017-12-28 07:51:46

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On Wed, 27 Dec 2017 19:49:28 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/27/17 5:38 PM, Masami Hiramatsu wrote:
> > On Wed, 27 Dec 2017 14:49:46 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> On 12/27/17 12:09 AM, Masami Hiramatsu wrote:
> >>> On Tue, 26 Dec 2017 18:12:56 -0800
> >>> Alexei Starovoitov <[email protected]> wrote:
> >>>
> >>>> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
> >>>>> Support in-kernel fault-injection framework via debugfs.
> >>>>> This allows you to inject a conditional error to specified
> >>>>> function using debugfs interfaces.
> >>>>>
> >>>>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>>>> ---
> >>>>> Documentation/fault-injection/fault-injection.txt | 5 +
> >>>>> kernel/Makefile | 1
> >>>>> kernel/fail_function.c | 169 +++++++++++++++++++++
> >>>>> lib/Kconfig.debug | 10 +
> >>>>> 4 files changed, 185 insertions(+)
> >>>>> create mode 100644 kernel/fail_function.c
> >>>>>
> >>>>> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> >>>>> index 918972babcd8..6243a588dd71 100644
> >>>>> --- a/Documentation/fault-injection/fault-injection.txt
> >>>>> +++ b/Documentation/fault-injection/fault-injection.txt
> >>>>> @@ -30,6 +30,11 @@ o fail_mmc_request
> >>>>> injects MMC data errors on devices permitted by setting
> >>>>> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
> >>>>>
> >>>>> +o fail_function
> >>>>> +
> >>>>> + injects error return on specific functions by setting debugfs entries
> >>>>> + under /sys/kernel/debug/fail_function. No boot option supported.
> >>>>
> >>>> I like it.
> >>>> Could you document it a bit better?
> >>>
> >>> Yes, I will do in next series.
> >>>
> >>>> In particular retval is configurable, but without an example no one
> >>>> will be able to figure out how to use it.
> >>>
> >>> Ah, right. BTW, as I pointed in the covermail, should we store the
> >>> expected error value range into the injectable list? e.g.
> >>>
> >>> ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)
> >>>
> >>> And provide APIs to check/get it.
> >>
> >> I'm afraid such check would be too costly.
> >> Right now we have only two functions marked but I expect hundreds more
> >> will be added in the near future as soon as developers realize the
> >> potential of such error injection.
> >> All of ALLOW_ERROR_INJECTION marks add 8 byte overhead each to .data.
> >> Multiple by 1k and we have 8k of data spent on marks.
> >> If we add max/min range marks that doubles it for very little use.
> >> I think marking function only is enough.
> >
> > Sorry, I don't think so.
> > Even if it takes 16 bytes more for each points, I don't think it is
> > any overhead for machines in these days. Even if so, we can provide
> > a kconfig to reduce it.
> > I mean, we are living in GB-order memory are, and it will be bigger
> > in the future. Why we have to worry about hundreds of 16bytes memory
> > pieces? It will take a few KB, and even if we mark thousands of
> > functions, it never reaches 1MB, in GB memory pool. :)
> >
> > Of course, for many small-footprint embedded devices (like having
> > less than 128MB memory), this feature can be a overhead. But they
> > can cut off the table by kconfig.
>
> I still disagree on wasting 16-byte * num_of_funcs of .data here.
> The trade-off of usability vs memory just not worth it. Sorry.
> Please focus on testing your changes instead.

Then what happens if the user sets an invalid retval for those functions?
Even if we limit the injectable functions, it can cause a problem.

For example:

	obj = func_return_object();
	if (!obj) {
		handling_error...;
	}
	obj->field = x;

In this case, obviously func_return_object() must return NULL if there is
an error, not -ENOMEM. But without the correct retval information, how would
you check that the BPF code doesn't cause trouble?
Currently it seems you are expecting only functions which return an error
code:

	ret = func_return_state();
	if (ret < 0) {
		handling_error...;
	}

But how can we distinguish those?

If we have the error range for each function, we can ensure what the
*correct* error code is: NULL, an errno, or any other error number. :)

At least fail_function needs this feature, because it can check the
return value when setting it up.
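
For example, something like this sketch (assuming the error_range
record and the ei_get_error_range() API I proposed earlier; neither
exists yet):

	/* Sketch: reject a user-supplied retval at fail_function
	 * setup time if the callee could never return it. */
	static int check_retval(unsigned long addr, long retval)
	{
		const struct error_range *r = ei_get_error_range(addr);

		if (!r)
			return -EINVAL;	/* function is not marked */
		if (retval < r->min || retval > r->max)
			return -ERANGE;
		return 0;
	}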

Thank you,

--
Masami Hiramatsu <[email protected]>

2017-12-28 07:56:11

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Wed, 27 Dec 2017 19:45:42 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/27/17 6:34 PM, Masami Hiramatsu wrote:
> > On Wed, 27 Dec 2017 14:46:24 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> On 12/26/17 9:56 PM, Masami Hiramatsu wrote:
> >>> On Tue, 26 Dec 2017 17:57:32 -0800
> >>> Alexei Starovoitov <[email protected]> wrote:
> >>>
> >>>> On Tue, Dec 26, 2017 at 04:46:59PM +0900, Masami Hiramatsu wrote:
> >>>>> Check whether error injectable event is on function entry or not.
> >>>>> Currently it checks the event is ftrace-based kprobes or not,
> >>>>> but that is wrong. It should check if the event is on the entry
> >>>>> of target function. Since error injection will override a function
> >>>>> to just return with modified return value, that operation must
> >>>>> be done before the target function starts making stackframe.
> >>>>>
> >>>>> As a side effect, bpf error injection is no need to depend on
> >>>>> function-tracer. It can work with sw-breakpoint based kprobe
> >>>>> events too.
> >>>>>
> >>>>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>>>> ---
> >>>>> kernel/trace/Kconfig | 2 --
> >>>>> kernel/trace/bpf_trace.c | 6 +++---
> >>>>> kernel/trace/trace_kprobe.c | 8 +++++---
> >>>>> kernel/trace/trace_probe.h | 12 ++++++------
> >>>>> 4 files changed, 14 insertions(+), 14 deletions(-)
> >>>>>
> >>>>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> >>>>> index ae3a2d519e50..6400e1bf97c5 100644
> >>>>> --- a/kernel/trace/Kconfig
> >>>>> +++ b/kernel/trace/Kconfig
> >>>>> @@ -533,9 +533,7 @@ config FUNCTION_PROFILER
> >>>>> config BPF_KPROBE_OVERRIDE
> >>>>> bool "Enable BPF programs to override a kprobed function"
> >>>>> depends on BPF_EVENTS
> >>>>> - depends on KPROBES_ON_FTRACE
> >>>>> depends on HAVE_KPROBE_OVERRIDE
> >>>>> - depends on DYNAMIC_FTRACE_WITH_REGS
> >>>>> default n
> >>>>> help
> >>>>> Allows BPF to override the execution of a probed function and
> >>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> >>>>> index f6d2327ecb59..d663660f8392 100644
> >>>>> --- a/kernel/trace/bpf_trace.c
> >>>>> +++ b/kernel/trace/bpf_trace.c
> >>>>> @@ -800,11 +800,11 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> >>>>> int ret = -EEXIST;
> >>>>>
> >>>>> /*
> >>>>> - * Kprobe override only works for ftrace based kprobes, and only if they
> >>>>> - * are on the opt-in list.
> >>>>> + * Kprobe override only works if they are on the function entry,
> >>>>> + * and only if they are on the opt-in list.
> >>>>> */
> >>>>> if (prog->kprobe_override &&
> >>>>> - (!trace_kprobe_ftrace(event->tp_event) ||
> >>>>> + (!trace_kprobe_on_func_entry(event->tp_event) ||
> >>>>> !trace_kprobe_error_injectable(event->tp_event)))
> >>>>> return -EINVAL;
> >>>>>
> >>>>> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> >>>>> index 91f4b57dab82..265e3e27e8dc 100644
> >>>>> --- a/kernel/trace/trace_kprobe.c
> >>>>> +++ b/kernel/trace/trace_kprobe.c
> >>>>> @@ -88,13 +88,15 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
> >>>>> return nhit;
> >>>>> }
> >>>>>
> >>>>> -int trace_kprobe_ftrace(struct trace_event_call *call)
> >>>>> +bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> >>>>> {
> >>>>> struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> >>>>> - return kprobe_ftrace(&tk->rp.kp);
> >>>>> +
> >>>>> + return kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.symbol_name,
> >>>>> + tk->rp.kp.offset);
> >>>>
> >>>> That would be nice, but did you test this?
> >>>
> >>> Yes, because the jprobe, which was only official user of modifying execution
> >>> path using kprobe, did same way to check. (and kretprobe also does it)
> >>>
> >>>> My understanding that kprobe will restore all regs and
> >>>> here we need to override return ip _and_ value.
> >>>
> >>> yes, no problem. kprobe restore all regs from pt_regs, including regs->ip.
> >>>
> >>>> Could you add a patch with the test the way Josef did
> >>>> or describe the steps to test this new mode?
> >>>
> >>> Would you mean below patch? If so, it should work without any change.
> >>>
> >>> [PATCH v10 4/5] samples/bpf: add a test for bpf_override_return
> >>
> >> yeah. I expect bpf_override_return test to work as-is.
> >> I'm asking for the test for new functionality added by this patch.
> >> In particular kprobe on func entry without ftrace.
> >> How did you test it?
> >
> > This function is used in kretprobe and jprobe. Jprobe was the user of
> > "modifying instruction pointer to another function" in kprobes.
> > If it doesn't work, jprobe also doesn't work, this means you can not
> > modify IP by kprobes anymore.
> > Anyway, until linux-4.13, that was well tested by kprobe smoke test.
> >
> >> and how I can repeat the test?
> >> I'm still not sure that it works correctly.
> >
> > That works correctly because it checks given address is on the entry
> > point (the 1st instruction) of a function, using kallsyms.
> >
> > The reason why I made another flag for ftrace was, there are 2 modes
> > for ftrace dynamic instrumentation, fentry and mcount.
> > With new fentry mode, ftrace will be put on the first instruction
> > of the function, so it will work as you expected.
> > With traditional gcc mcount, ftrace will be called after making call
> > frame for _mcount(). This means if you modify ip, it will not work
> > or cause a trouble because _mcount call frame is still on the stack.
> >
> > So, current ftrace-based checker doesn't work, it depends on the case.
> > Of course, in most case, kernel will be build in new gcc which
> > supports fentry, but there is no guarantee.
>
> I don't think that's the case. My reading of current
> trace_kprobe_ftrace() -> arch_check_ftrace_location()
> is that it will not be true for old mcount case.
>
> As far as the rest of your arguments it very much puzzles me that
> you claim that this patch suppose to work based on historical
> reasoning whereas you did NOT test it.

No, I meant that even with this patch you can still test it with Josef's
testcase, since this is just a backend change. No frontend changes.

Thank you,

--
Masami Hiramatsu <[email protected]>

2017-12-28 08:20:33

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Wed, 27 Dec 2017 20:32:07 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/27/17 8:16 PM, Steven Rostedt wrote:
> > On Wed, 27 Dec 2017 19:45:42 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> I don't think that's the case. My reading of current
> >> trace_kprobe_ftrace() -> arch_check_ftrace_location()
> >> is that it will not be true for old mcount case.
> >
> > In the old mcount case, you can't use ftrace to return without calling
> > the function. That is, no modification of the return ip, unless you
> > created a trampoline that could handle arbitrary stack frames, and
> > remove them from the stack before returning back to the function.
>
> correct. I was saying that trace_kprobe_ftrace() won't let us do
> bpf_override_return with old mcount.

No, trace_kprobe_ftrace() just checks whether the given address will be
managed by ftrace; you can see arch_check_ftrace_location() in
kernel/kprobes.c.

FYI, CONFIG_KPROBES_ON_FTRACE depends on DYNAMIC_FTRACE_WITH_REGS, and
DYNAMIC_FTRACE_WITH_REGS doesn't depend on CC_USING_FENTRY.
This means that if you compile the kernel with an old gcc and enable
DYNAMIC_FTRACE, kprobes uses ftrace on the mcount address, which is NOT
the entry point of the target function.

On the other hand, the IP-changing feature was originally implemented
by kprobes with int3 (sw breakpoint). This means that if you put a
kprobe at the correct address (the entry address of the function), you
can hijack the function, as jprobe did.
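
Roughly, a jprobe-style pre-handler does the following (a sketch only,
not code from this series; my_replacement is a hypothetical function):

	/* Sketch: redirect a probed function from its entry on x86. */
	static int hijack_pre_handler(struct kprobe *p, struct pt_regs *regs)
	{
		regs->ip = (unsigned long)my_replacement;
		return 1;	/* nonzero: regs changed, skip single-step */
	}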

> >> As far as the rest of your arguments it very much puzzles me that
> >> you claim that this patch suppose to work based on historical
> >> reasoning whereas you did NOT test it.
> >
> > I believe that Masami is saying that the modification of the IP from
> > kprobes has been very well tested. But I'm guessing that you still want
> > a test case for using kprobes in this particular instance. It's not the
> > implementation of modifying the IP that you are worried about, but the
> > implementation of BPF using it in this case. Right?
>
> exactly. No doubt that old code works.
> But it doesn't mean that bpf_override_return() will continue to
> work in kprobes that are not ftrace based.
> I suspect Josef's existing test case will cover this situation.
> Probably only special .config is needed to disable ftrace, so
> "kprobe on entry but not ftrace" check will kick in.

Right. If you need to test it, you can run Josef's test case without
CONFIG_DYNAMIC_FTRACE.

> But I didn't get an impression that this situation was tested.
> Instead I see only logical reasoning that it's _supposed_ to work.
> That's not enough.

OK, so are you just asking me to run samples/bpf?

Thanks,

--
Masami Hiramatsu <[email protected]>

2017-12-29 01:04:09

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On 12/28/17 12:20 AM, Masami Hiramatsu wrote:
> On Wed, 27 Dec 2017 20:32:07 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
>> On 12/27/17 8:16 PM, Steven Rostedt wrote:
>>> On Wed, 27 Dec 2017 19:45:42 -0800
>>> Alexei Starovoitov <[email protected]> wrote:
>>>
>>>> I don't think that's the case. My reading of current
>>>> trace_kprobe_ftrace() -> arch_check_ftrace_location()
>>>> is that it will not be true for old mcount case.
>>>
>>> In the old mcount case, you can't use ftrace to return without calling
>>> the function. That is, no modification of the return ip, unless you
>>> created a trampoline that could handle arbitrary stack frames, and
>>> remove them from the stack before returning back to the function.
>>
>> correct. I was saying that trace_kprobe_ftrace() won't let us do
>> bpf_override_return with old mcount.
>
> No, trace_kprobe_ftrace() just checks the given address will be
> managed by ftrace. you can see arch_check_ftrace_location() in kernel/kprobes.c.
>
> FYI, CONFIG_KPROBES_ON_FTRACE depends on DYNAMIC_FTRACE_WITH_REGS, and
> DYNAMIC_FTRACE_WITH_REGS doesn't depend on CC_USING_FENTRY.
> This means if you compile kernel with old gcc and enable DYNAMIC_FTRACE,
> kprobes uses ftrace on mcount address which is NOT the entry point
> of target function.

ok. fair enough. I think we can gate the feature to !mcount only.

> On the other hand, changing IP feature has been implemented originaly
> by kprobes with int3 (sw breakpoint). This means you can use kprobes
> at correct address (the entry address of the function) you can hijack
> the function, as jprobe did.
>
>>>> As far as the rest of your arguments it very much puzzles me that
>>>> you claim that this patch suppose to work based on historical
>>>> reasoning whereas you did NOT test it.
>>>
>>> I believe that Masami is saying that the modification of the IP from
>>> kprobes has been very well tested. But I'm guessing that you still want
>>> a test case for using kprobes in this particular instance. It's not the
>>> implementation of modifying the IP that you are worried about, but the
>>> implementation of BPF using it in this case. Right?
>>
>> exactly. No doubt that old code works.
>> But it doesn't mean that bpf_override_return() will continue to
>> work in kprobes that are not ftrace based.
>> I suspect Josef's existing test case will cover this situation.
>> Probably only special .config is needed to disable ftrace, so
>> "kprobe on entry but not ftrace" check will kick in.
>
> Right. If you need to test it, you can run Josef's test case without
> CONFIG_DYNAMIC_FTRACE.

It should be obvious that the person who submits the patch
must run the tests.

>> But I didn't get an impression that this situation was tested.
>> Instead I see only logical reasoning that it's _supposed_ to work.
>> That's not enough.
>
> OK, so would you just ask me to run samples/bpf ?

Please run Josef's test in the !ftrace setup.

2017-12-29 01:12:08

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On 12/27/17 11:51 PM, Masami Hiramatsu wrote:
>
> Then what happen if the user set invalid retval to those functions?
> even if we limit the injectable functions, it can cause a problem,
>
> for example,
>
> obj = func_return_object();
> if (!obj) {
> handling_error...;
> }
> obj->field = x;
>
> In this case, obviously func_return_object() must return NULL if there is
> an error, not -ENOMEM. But without the correct retval information, how would
> you check the BPF code doesn't cause a trouble?
> Currently it seems you are expecting only the functions which return error code.
>
> ret = func_return_state();
> if (ret < 0) {
> handling_error...;
> }
>
> But how we can distinguish those?
>
> If we have the error range for each function, we can ensure what is
> *correct* error code, NULL or errno, or any other error numbers. :)

messing up return values may cause problems, and a range check is
not going to magically help.
The caller may handle only a certain set of errors, or interpret
some of them, like EBUSY, as a signal to retry.
It's plain impossible to make sure that the kernel will be functional
after an error injection has been made.
For example, kmalloc() unconditionally returning NULL will be deadly
for the kernel, hence this patch 4/4 has very limited practical
use. The bpf program needs to make intelligent decisions about when
to return an error and what kind of error to return.
Doing a blanket range check adds a false sense of additional safety.
More so, it wastes kilobytes of memory to do this check, hence nack.

2017-12-29 07:34:41

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

On Thu, 28 Dec 2017 17:11:31 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/27/17 11:51 PM, Masami Hiramatsu wrote:
> >
> > Then what happen if the user set invalid retval to those functions?
> > even if we limit the injectable functions, it can cause a problem,
> >
> > for example,
> >
> > obj = func_return_object();
> > if (!obj) {
> > handling_error...;
> > }
> > obj->field = x;
> >
> > In this case, obviously func_return_object() must return NULL if there is
> > an error, not -ENOMEM. But without the correct retval information, how would
> > you check the BPF code doesn't cause a trouble?
> > Currently it seems you are expecting only the functions which return error code.
> >
> > ret = func_return_state();
> > if (ret < 0) {
> > handling_error...;
> > }
> >
> > But how we can distinguish those?
> >
> > If we have the error range for each function, we can ensure what is
> > *correct* error code, NULL or errno, or any other error numbers. :)
>
> messing up return values may cause problems and range check is
> not going to magically help.
> The caller may handle only a certain set of errors or interpret
> some of them like EBUSY as a signal to retry.
> It's plain impossible to make sure that kernel will be functional
> after error injection has been made.

Hmm, if so, why do we need this injectable table at all?
If we cannot ensure the safety of the error injection (and of course we
cannot), why do we need to limit the error injection to such a limited
set of functions?
I think we don't need it anymore. Any function can be injectable, with
no need to ensure safety.

Thank you,

> Like kmalloc() unconditionally returning NULL will be deadly
> for the kernel, hence this patch 4/4 has very limited practical
> use. The bpf program need to make intelligent decisions when
> to return an error and what kind of error to return.
> Doing blank range check adds a false sense of additional safety.
> More so it wastes kilobytes of memory to do this check, hence nack.
>


--
Masami Hiramatsu <[email protected]>

2017-12-29 08:20:35

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Thu, 28 Dec 2017 17:03:24 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/28/17 12:20 AM, Masami Hiramatsu wrote:
> > On Wed, 27 Dec 2017 20:32:07 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> >> On 12/27/17 8:16 PM, Steven Rostedt wrote:
> >>> On Wed, 27 Dec 2017 19:45:42 -0800
> >>> Alexei Starovoitov <[email protected]> wrote:
> >>>
> >>>> I don't think that's the case. My reading of current
> >>>> trace_kprobe_ftrace() -> arch_check_ftrace_location()
> >>>> is that it will not be true for old mcount case.
> >>>
> >>> In the old mcount case, you can't use ftrace to return without calling
> >>> the function. That is, no modification of the return ip, unless you
> >>> created a trampoline that could handle arbitrary stack frames, and
> >>> remove them from the stack before returning back to the function.
> >>
> >> correct. I was saying that trace_kprobe_ftrace() won't let us do
> >> bpf_override_return with old mcount.
> >
> > No, trace_kprobe_ftrace() just checks the given address will be
> > managed by ftrace. you can see arch_check_ftrace_location() in kernel/kprobes.c.
> >
> > FYI, CONFIG_KPROBES_ON_FTRACE depends on DYNAMIC_FTRACE_WITH_REGS, and
> > DYNAMIC_FTRACE_WITH_REGS doesn't depend on CC_USING_FENTRY.
> > This means if you compile kernel with old gcc and enable DYNAMIC_FTRACE,
> > kprobes uses ftrace on mcount address which is NOT the entry point
> > of target function.
>
> ok. fair enough. I think we can gate the feature to !mcount only.
>
> > On the other hand, changing IP feature has been implemented originaly
> > by kprobes with int3 (sw breakpoint). This means you can use kprobes
> > at correct address (the entry address of the function) you can hijack
> > the function, as jprobe did.
> >
> >>>> As far as the rest of your arguments it very much puzzles me that
> >>>> you claim that this patch suppose to work based on historical
> >>>> reasoning whereas you did NOT test it.
> >>>
> >>> I believe that Masami is saying that the modification of the IP from
> >>> kprobes has been very well tested. But I'm guessing that you still want
> >>> a test case for using kprobes in this particular instance. It's not the
> >>> implementation of modifying the IP that you are worried about, but the
> >>> implementation of BPF using it in this case. Right?
> >>
> >> exactly. No doubt that old code works.
> >> But it doesn't mean that bpf_override_return() will continue to
> >> work in kprobes that are not ftrace based.
> >> I suspect Josef's existing test case will cover this situation.
> >> Probably only special .config is needed to disable ftrace, so
> >> "kprobe on entry but not ftrace" check will kick in.
> >
> > Right. If you need to test it, you can run Josef's test case without
> > CONFIG_DYNAMIC_FTRACE.
>
> It should be obvious that the person who submits the patch
> must run the tests.
>
> >> But I didn't get an impression that this situation was tested.
> >> Instead I see only logical reasoning that it's _supposed_ to work.
> >> That's not enough.
> >
> > OK, so would you just ask me to run samples/bpf ?
>
> Please run Josef's test in the !ftrace setup.

Yes, I'll add the result of the test case.

Thank you,


--
Masami Hiramatsu <[email protected]>

2018-01-04 16:07:23

by Josef Bacik

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 0/4] Separate error injection table from kprobes

On Tue, Dec 26, 2017 at 04:46:28PM +0900, Masami Hiramatsu wrote:
> Hi Josef and Alexei,
>
> Here are the 2nd version of patches to moving error injection
> table from kprobes. In this series I did a small fixes and
> add function-based fault injection.
>
> Here is the previous version:
>
> https://lkml.org/lkml/2017/12/22/554
>
> There are 2 main reasons why I separate it from kprobes.
>
> - kprobes users can modify execution path not only at
> error-injection whitelist functions but also other
> functions. I don't like to suggest user that such
> limitation is from kprobes itself.
>
> - This error injection information is also useful for
> ftrace (function-hook) and livepatch. It should not
> be limited by CONFIG_KPROBES.
>
> So I introduced CONFIG_FUNCTION_ERROR_INJECTION for this feature.
> Also CONFIG_FAIL_FUNCTION is added, which provides function-based
> error injection interface via debugfs following fault-injection
> framework. See [4/4].
>
> Any thoughts?

Sorry Masami, I've been on vacation for the last two weeks. This approach is
fine by me; if we want to allow mechanisms other than bpf to use this
functionality, then hooray. I'll do a proper review when you post v3, I just
wanted to let you know I wasn't ignoring you. Thanks,

Josef

2018-01-08 03:02:56

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On 12/29/17 12:20 AM, Masami Hiramatsu wrote:
>> Please run Josef's test in the !ftrace setup.
> Yes, I'll add the result of the test case.

if Josef's test is passing in a !ftrace config,
please resend your patches.
I think 2 and 3 were nice simplifications,
and patch 1 is good too if it passes the test.
Would be great to get them in for this merge window.

Thanks!

2018-01-08 13:21:21

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry

On Sun, 7 Jan 2018 19:01:57 -0800
Alexei Starovoitov <[email protected]> wrote:

> On 12/29/17 12:20 AM, Masami Hiramatsu wrote:
> >> Please run Josef's test in the !ftrace setup.
> > Yes, I'll add the result of the test case.
>
> if Josef's test is passing in !ftrace config,
> please resend your patches.
> I think 2 and 3 were nice simplifications.
> and patch 1 is good too if it's passes the test.
> Would be great to get them in for this merge window.

OK, I'll try to do that.

Thanks!

>
> Thanks!
>


--
Masami Hiramatsu <[email protected]>

2018-01-09 01:17:50

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v2 0/4] Separate error injection table from kprobes

On Thu, 4 Jan 2018 11:07:16 -0500
Josef Bacik <[email protected]> wrote:

> On Tue, Dec 26, 2017 at 04:46:28PM +0900, Masami Hiramatsu wrote:
> > Hi Josef and Alexei,
> >
> > Here are the 2nd version of patches to moving error injection
> > table from kprobes. In this series I did a small fixes and
> > add function-based fault injection.
> >
> > Here is the previous version:
> >
> > https://lkml.org/lkml/2017/12/22/554
> >
> > There are 2 main reasons why I separate it from kprobes.
> >
> > - kprobes users can modify execution path not only at
> > error-injection whitelist functions but also other
> > functions. I don't like to suggest user that such
> > limitation is from kprobes itself.
> >
> > - This error injection information is also useful for
> > ftrace (function-hook) and livepatch. It should not
> > be limited by CONFIG_KPROBES.
> >
> > So I introduced CONFIG_FUNCTION_ERROR_INJECTION for this feature.
> > Also CONFIG_FAIL_FUNCTION is added, which provides function-based
> > error injection interface via debugfs following fault-injection
> > framework. See [4/4].
> >
> > Any thoughts?
>
> Sorry Masami, I've been on vacation for the last two weeks. This approach is
> fine by me, if we want to allow other mechanisms other than bpf to use this
> functionality then hooray. I'll do a proper review when you post v3, just
> wanted to let you know I wasn't ignoring you. Thanks,

Yeah, thank you for the kind notice ;)

BTW, could you tell me how I can run your test case?

When I tried to build the tests (samples/bpf) I got the error below and stopped.

[mhiramat@devbox bpf]$ LANG=C make
make -C ../../ /home/mhiramat/ksrc/linux/samples/bpf/
make[1]: Entering directory '/home/mhiramat/ksrc/linux'
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CHK include/generated/bounds.h
CHK include/generated/timeconst.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK scripts/mod/devicetable-offsets.h
HOSTCC /home/mhiramat/ksrc/linux/samples/bpf/test_lru_dist.o
/home/mhiramat/ksrc/linux/samples/bpf/test_lru_dist.c:39:8: error: redefinition of 'struct list_head'
struct list_head {
^~~~~~~~~
In file included from /home/mhiramat/ksrc/linux/samples/bpf/test_lru_dist.c:9:0:
./tools/include/linux/types.h:69:8: note: originally defined here
struct list_head {
^~~~~~~~~
make[2]: *** [scripts/Makefile.host:107: /home/mhiramat/ksrc/linux/samples/bpf/test_lru_dist.o] Error 1
make[1]: *** [Makefile:1675: /home/mhiramat/ksrc/linux/samples/bpf/] Error 2
make[1]: Leaving directory '/home/mhiramat/ksrc/linux'
make: *** [Makefile:204: all] Error 2
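
(It looks like the sample's private struct list_head now clashes with
the one provided by tools/include/linux/types.h. A purely local,
untested workaround might be to rename the private type, e.g.:)

	/* Hypothetical workaround in test_lru_dist.c: rename the
	 * sample's private list type so it no longer redefines the
	 * struct list_head from tools/include/linux/types.h. */
	struct dist_list_head {
		struct dist_list_head *next, *prev;
	};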


Thank you,

--
Masami Hiramatsu <[email protected]>