2013-10-29 06:53:47

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

Hello,

This patchset implements memory (address), stack[N], deference,
bitfield and retval (it needs uretprobe tho) fetch methods for
uprobes. It's based on the previous work [1] done by Hyeoncheol Lee.

Now kprobes and uprobes have their own fetch_type_tables and, in turn,
memory and stack access methods. Other fetch methods are shared.

For the dereference method, I added a new argument to fetch functions.
It's because for uprobes it needs to know whether the given address is
a file offset or a virtual address in an user process. For instance,
in case of fetching from a memory directly (like @offset) it should
convert the address (offset) to a virtual address of the process, but
if it's a dereferencing, the given address already has the virtual
address.

To determine this in a fetch function, I passed a pointer to
trace_uprobe for direct fetch, and passed NULL for dereference.

The patch 1-2 are bug fixes and can be applied independently.

Please look at patch 10 that uses per-cpu buffer for accessing user
memory as suggested by Steven. While I tried hard not to mess things
up there might be a chance I did something horrible. It'd be great if
you guys take a look and give comments.


* v6 changes:
- add more Ack's from Masami
- fix ref count of uprobe_cpu_buffer (thanks to Jovi)

* v5 changes:
- use user_stack_pointer() instead of GET_USP()
- fix a bug in 'stack' fetch method of uprobes

* v4 changes:
- add Ack's from Masami
- rearrange patches to make it easy for simple fixes to be applied
- update documentation
- use per-cpu buffer for storing args (thanks to Steve!)


[1] https://lkml.org/lkml/2012/11/14/84

A simple example:

# cat foo.c
int glob = -1;
char str[] = "hello uprobe.";

struct foo {
unsigned int unused: 2;
unsigned int foo: 20;
unsigned int bar: 10;
} foo = {
.foo = 5,
};

int main(int argc, char *argv[])
{
long local = 0x1234;

return 127;
}

# gcc -o foo -g foo.c

# objdump -d foo | grep -A9 -F '<main>'
00000000004004b0 <main>:
4004b0: 55 push %rbp
4004b1: 48 89 e5 mov %rsp,%rbp
4004b4: 89 7d ec mov %edi,-0x14(%rbp)
4004b7: 48 89 75 e0 mov %rsi,-0x20(%rbp)
4004bb: 48 c7 45 f8 34 12 00 movq $0x1234,-0x8(%rbp)
4004c2: 00
4004c3: b8 7f 00 00 00 mov $0x7f,%eax
4004c8: 5d pop %rbp
4004c9: c3 retq

# nm foo | grep -e glob$ -e str -e foo
00000000006008bc D foo
00000000006008a8 D glob
00000000006008ac D str

# perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
> str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
Added new event:
probe_foo:foo (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string
bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))

You can now use it in all perf tools, such as:

perf record -e probe_foo:foo -aR sleep 1

# perf record -e probe_foo:foo ./foo
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]

# perf script | grep -v ^#
foo 2008 [002 2199.867154: probe_foo:foo (4004c3)
glob=-1 str="hello uprobe." bit=5 argc=1 local=1234


This patchset is based on the current for-next branch of the Steven
Rostedt's linux-trace tree. I also put this on my 'uprobe/fetch-v6'
branch in my tree:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Hemant Kumar <[email protected]>


Hyeoncheol Lee (2):
tracing/kprobes: Move fetch functions to trace_kprobe.c
tracing/kprobes: Add fetch{,_size} member into deref fetch method

Namhyung Kim (11):
tracing/uprobes: Fix documentation of uprobe registration syntax
tracing/probes: Fix basic print type functions
tracing/kprobes: Staticize stack and memory fetch functions
tracing/kprobes: Factor out struct trace_probe
tracing/uprobes: Convert to struct trace_probe
tracing/kprobes: Move common functions to trace_probe.h
tracing/kprobes: Integrate duplicate set_print_fmt()
tracing/uprobes: Fetch args before reserving a ring buffer
tracing/kprobes: Add priv argument to fetch functions
tracing/uprobes: Add more fetch functions
tracing/uprobes: Add support for full argument access methods

Documentation/trace/uprobetracer.txt | 35 +-
kernel/trace/trace_kprobe.c | 642 +++++++++++++++++++----------------
kernel/trace/trace_probe.c | 453 +++++++++---------------
kernel/trace/trace_probe.h | 202 ++++++++++-
kernel/trace/trace_uprobe.c | 458 +++++++++++++++++--------
5 files changed, 1063 insertions(+), 727 deletions(-)

--
1.7.11.7


2013-10-29 06:53:50

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 02/13] tracing/probes: Fix basic print type functions

From: Namhyung Kim <[email protected]>

The print format of s32 type was "ld" and it's casted to "long". So
it turned out to print 4294967295 for "-1" on 64-bit systems. Not
sure whether it worked well on 32-bit systems.

Anyway, it'd be better if we have exact format and type cast for each
types on both of 32- and 64-bit systems. In fact, the only difference
is on s64/u64 types.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_probe.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 412e959709b4..b571e4de0769 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -49,14 +49,19 @@ static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
} \
static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;

-DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "%x", unsigned char)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned short)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%x", unsigned int)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", signed char)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", short)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d", int)
+#if BITS_PER_LONG == 32
DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
+#else /* BITS_PER_LONG == 64 */
+DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%lx", unsigned long)
+DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%ld", long)
+#endif

static inline void *get_rloc_data(u32 *dl)
{
--
1.7.11.7

2013-10-29 06:54:18

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 13/13] tracing/uprobes: Add support for full argument access methods

From: Namhyung Kim <[email protected]>

Enable to fetch other types of argument for the uprobes. IOW, we can
access stack, memory, deref, bitfield and retval from uprobes now.

The format for the argument types are same as kprobes (but @SYMBOL
type is not supported for uprobes), i.e:

@ADDR : Fetch memory at ADDR
$stackN : Fetch Nth entry of stack (N >= 0)
$stack : Fetch stack address
$retval : Fetch return value
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address

Note that the retval only can be used with uretprobes.

Original-patch-by: Hyeoncheol Lee <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
Documentation/trace/uprobetracer.txt | 25 +++++++++++++++++++++++++
kernel/trace/trace_probe.c | 36 +++++++++++++++++++++++-------------
2 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index 8f1a8b8956fc..6e5cff263e2b 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -31,6 +31,31 @@ Synopsis of uprobe_tracer

FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
+ @ADDR : Fetch memory at ADDR (ADDR should be in userspace)
+ $stackN : Fetch Nth entry of stack (N >= 0)
+ $stack : Fetch stack address.
+ $retval : Fetch return value.(*)
+ +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
+ NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
+ FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
+ (u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
+ are supported.
+
+ (*) only for return probe.
+ (**) this is useful for fetching a field of data structures.
+
+Types
+-----
+Several types are supported for fetch-args. Uprobe tracer will access memory
+by given type. Prefix 's' and 'u' means those types are signed and unsigned
+respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
+String type is a special type, which fetches a "null-terminated" string from
+user space.
+Bitfield is another special type, which takes 3 parameters, bit-width, bit-
+offset, and container-size (usually 32). The syntax is;
+
+ b<bit-width>@<bit-offset>/<container-size>
+

Event Profiling
---------------
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 70cd3bfde5a6..8c77825e87e6 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -253,12 +253,18 @@ fail:
}

/* Special function : only accept unsigned long */
-static __kprobes void fetch_stack_address(struct pt_regs *regs,
+static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
void *dummy, void *dest, void *priv)
{
*(unsigned long *)dest = kernel_stack_pointer(regs);
}

+static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
+ void *dummy, void *dest, void *priv)
+{
+ *(unsigned long *)dest = user_stack_pointer(regs);
+}
+
static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn,
const struct fetch_type *ttbl)
@@ -303,7 +309,8 @@ int traceprobe_split_symbol_offset(char *symbol, unsigned long *offset)
#define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))

static int parse_probe_vars(char *arg, const struct fetch_type *t,
- struct fetch_param *f, bool is_return)
+ struct fetch_param *f, bool is_return,
+ bool is_kprobe)
{
int ret = 0;
unsigned long param;
@@ -315,13 +322,16 @@ static int parse_probe_vars(char *arg, const struct fetch_type *t,
ret = -EINVAL;
} else if (strncmp(arg, "stack", 5) == 0) {
if (arg[5] == '\0') {
- if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR) == 0)
- f->fn = fetch_stack_address;
+ if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR))
+ return -EINVAL;
+
+ if (is_kprobe)
+ f->fn = fetch_kernel_stack_address;
else
- ret = -EINVAL;
+ f->fn = fetch_user_stack_address;
} else if (isdigit(arg[5])) {
ret = kstrtoul(arg + 5, 10, &param);
- if (ret || param > PARAM_MAX_STACK)
+ if (ret || (is_kprobe && param > PARAM_MAX_STACK))
ret = -EINVAL;
else {
f->fn = t->fetch[FETCH_MTD_stack];
@@ -345,17 +355,13 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
int ret;
const struct fetch_type *ttbl;

- ttbl = kprobes_fetch_type_table;
+ ttbl = is_kprobe ? kprobes_fetch_type_table : uprobes_fetch_type_table;

ret = 0;

- /* Until uprobe_events supports only reg arguments */
- if (!is_kprobe && arg[0] != '%')
- return -EINVAL;
-
switch (arg[0]) {
case '$':
- ret = parse_probe_vars(arg + 1, t, f, is_return);
+ ret = parse_probe_vars(arg + 1, t, f, is_return, is_kprobe);
break;

case '%': /* named register */
@@ -376,6 +382,10 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
f->fn = t->fetch[FETCH_MTD_memory];
f->data = (void *)param;
} else {
+ /* uprobes don't support symbols */
+ if (!is_kprobe)
+ return -EINVAL;
+
ret = traceprobe_split_symbol_offset(arg + 1, &offset);
if (ret)
break;
@@ -482,7 +492,7 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
int ret;
const struct fetch_type *ttbl;

- ttbl = kprobes_fetch_type_table;
+ ttbl = is_kprobe ? kprobes_fetch_type_table : uprobes_fetch_type_table;

if (strlen(arg) > MAX_ARGSTR_LEN) {
pr_info("Argument is too long.: %s\n", arg);
--
1.7.11.7

2013-10-29 06:54:51

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 11/13] tracing/kprobes: Add priv argument to fetch functions

From: Namhyung Kim <[email protected]>

This argument is for passing private data structure to each fetch
function and will be used by uprobes.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 32 ++++++++++++++++++--------------
kernel/trace/trace_probe.c | 24 ++++++++++++------------
kernel/trace/trace_probe.h | 19 ++++++++++---------
kernel/trace/trace_uprobe.c | 8 ++++----
4 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3159b114f215..c0f4c2dbdbb1 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -745,7 +745,7 @@ static const struct file_operations kprobe_profile_ops = {
*/
#define DEFINE_FETCH_stack(type) \
static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+ void *offset, void *dest, void *priv) \
{ \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
(unsigned int)((unsigned long)offset)); \
@@ -757,7 +757,7 @@ DEFINE_BASIC_FETCH_FUNCS(stack)

#define DEFINE_FETCH_memory(type) \
static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
- void *addr, void *dest) \
+ void *addr, void *dest, void *priv) \
{ \
type retval; \
if (probe_kernel_address(addr, retval)) \
@@ -771,7 +771,7 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
* length and relative data location.
*/
static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+ void *addr, void *dest, void *priv)
{
long ret;
int maxlen = get_rloc_len(*(u32 *)dest);
@@ -808,7 +808,7 @@ static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,

/* Return the length of string -- including null terminal byte */
static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
+ void *addr, void *dest, void *priv)
{
mm_segment_t old_fs;
int ret, len = 0;
@@ -879,11 +879,11 @@ struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)

#define DEFINE_FETCH_symbol(type) \
__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+ void *data, void *dest, void *priv) \
{ \
struct symbol_cache *sc = data; \
if (sc->addr) \
- fetch_memory_##type(regs, (void *)sc->addr, dest); \
+ fetch_memory_##type(regs, (void *)sc->addr, dest, priv);\
else \
*(type *)dest = 0; \
}
@@ -929,7 +929,7 @@ __kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs,
local_save_flags(irq_flags);
pc = preempt_count();

- dsize = __get_data_size(&tp->p, regs);
+ dsize = __get_data_size(&tp->p, regs, NULL);
size = sizeof(*entry) + tp->p.size + dsize;

event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
@@ -940,7 +940,8 @@ __kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs,

entry = ring_buffer_event_data(event);
entry->ip = (unsigned long)tp->rp.kp.addr;
- store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize,
+ NULL);

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit_regs(buffer, event,
@@ -977,7 +978,7 @@ __kretprobe_trace_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
local_save_flags(irq_flags);
pc = preempt_count();

- dsize = __get_data_size(&tp->p, regs);
+ dsize = __get_data_size(&tp->p, regs, NULL);
size = sizeof(*entry) + tp->p.size + dsize;

event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
@@ -989,7 +990,8 @@ __kretprobe_trace_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
entry = ring_buffer_event_data(event);
entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize,
+ NULL);

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit_regs(buffer, event,
@@ -1149,7 +1151,7 @@ kprobe_perf_func(struct trace_kprobe *tp, struct pt_regs *regs)
if (hlist_empty(head))
return;

- dsize = __get_data_size(&tp->p, regs);
+ dsize = __get_data_size(&tp->p, regs, NULL);
__size = sizeof(*entry) + tp->p.size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);
@@ -1160,7 +1162,8 @@ kprobe_perf_func(struct trace_kprobe *tp, struct pt_regs *regs)

entry->ip = (unsigned long)tp->rp.kp.addr;
memset(&entry[1], 0, dsize);
- store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize,
+ NULL);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}

@@ -1179,7 +1182,7 @@ kretprobe_perf_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
if (hlist_empty(head))
return;

- dsize = __get_data_size(&tp->p, regs);
+ dsize = __get_data_size(&tp->p, regs, NULL);
__size = sizeof(*entry) + tp->p.size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);
@@ -1190,7 +1193,8 @@ kretprobe_perf_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,

entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize,
+ NULL);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
#endif /* CONFIG_PERF_EVENTS */
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 1ab83d4c7775..eaee44d5d9d1 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -78,7 +78,7 @@ const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
- void *offset, void *dest) \
+ void *offset, void *dest, void *priv) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
(unsigned int)((unsigned long)offset)); \
@@ -87,7 +87,7 @@ DEFINE_BASIC_FETCH_FUNCS(reg)

#define DEFINE_FETCH_retval(type) \
__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
- void *dummy, void *dest) \
+ void *dummy, void *dest, void *priv) \
{ \
*(type *)dest = (type)regs_return_value(regs); \
}
@@ -103,14 +103,14 @@ struct deref_fetch_param {

#define DEFINE_FETCH_deref(type) \
__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+ void *data, void *dest, void *priv) \
{ \
struct deref_fetch_param *dprm = data; \
unsigned long addr; \
- call_fetch(&dprm->orig, regs, &addr); \
+ call_fetch(&dprm->orig, regs, &addr, priv); \
if (addr) { \
addr += dprm->offset; \
- dprm->fetch(regs, (void *)addr, dest); \
+ dprm->fetch(regs, (void *)addr, dest, priv); \
} else \
*(type *)dest = 0; \
}
@@ -118,15 +118,15 @@ DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)

__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
- void *data, void *dest)
+ void *data, void *dest, void *priv)
{
struct deref_fetch_param *dprm = data;
unsigned long addr;

- call_fetch(&dprm->orig, regs, &addr);
+ call_fetch(&dprm->orig, regs, &addr, priv);
if (addr && dprm->fetch_size) {
addr += dprm->offset;
- dprm->fetch_size(regs, (void *)addr, dest);
+ dprm->fetch_size(regs, (void *)addr, dest, priv);
} else
*(string_size *)dest = 0;
}
@@ -156,12 +156,12 @@ struct bitfield_fetch_param {
};

#define DEFINE_FETCH_bitfield(type) \
-__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs,\
- void *data, void *dest) \
+__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
+ void *data, void *dest, void *priv) \
{ \
struct bitfield_fetch_param *bprm = data; \
type buf = 0; \
- call_fetch(&bprm->orig, regs, &buf); \
+ call_fetch(&bprm->orig, regs, &buf, priv); \
if (buf) { \
buf <<= bprm->hi_shift; \
buf >>= bprm->low_shift; \
@@ -249,7 +249,7 @@ fail:

/* Special function : only accept unsigned long */
static __kprobes void fetch_stack_address(struct pt_regs *regs,
- void *dummy, void *dest)
+ void *dummy, void *dest, void *priv)
{
*(unsigned long *)dest = kernel_stack_pointer(regs);
}
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 325989f24dbf..fc7edf3749ef 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -98,7 +98,7 @@ typedef u32 string;
typedef u32 string_size;

/* Data fetch function type */
-typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
+typedef void (*fetch_func_t)(struct pt_regs *, void *, void *, void *);
/* Printing function type */
typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, void *);

@@ -182,7 +182,8 @@ static inline bool trace_probe_is_registered(struct trace_probe *tp)
#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type

#define DECLARE_FETCH_FUNC(method, type) \
-extern void FETCH_FUNC_NAME(method, type)(struct pt_regs *, void *, void *)
+extern void FETCH_FUNC_NAME(method, type)(struct pt_regs *, void *, \
+ void *, void *)

#define DECLARE_BASIC_FETCH_FUNCS(method) \
DECLARE_FETCH_FUNC(method, u8); \
@@ -264,9 +265,9 @@ ASSIGN_FETCH_FUNC(bitfield, ftype), \
extern const struct fetch_type kprobes_fetch_type_table[];

static inline __kprobes void call_fetch(struct fetch_param *fprm,
- struct pt_regs *regs, void *dest)
+ struct pt_regs *regs, void *dest, void *priv)
{
- return fprm->fn(regs, fprm->data, dest);
+ return fprm->fn(regs, fprm->data, dest, priv);
}

struct symbol_cache;
@@ -305,14 +306,14 @@ extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));

/* Sum up total data length for dynamic arraies (strings) */
static inline __kprobes int
-__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+__get_data_size(struct trace_probe *tp, struct pt_regs *regs, void *priv)
{
int i, ret = 0;
u32 len;

for (i = 0; i < tp->nr_args; i++)
if (unlikely(tp->args[i].fetch_size.fn)) {
- call_fetch(&tp->args[i].fetch_size, regs, &len);
+ call_fetch(&tp->args[i].fetch_size, regs, &len, priv);
ret += len;
}

@@ -322,7 +323,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
/* Store the value of each argument */
static inline __kprobes void
store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
- u8 *data, int maxlen)
+ u8 *data, int maxlen, void *priv)
{
int i;
u32 end = tp->size;
@@ -337,7 +338,7 @@ store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
dl = (u32 *)(data + tp->args[i].offset);
*dl = make_data_rloc(maxlen, end - tp->args[i].offset);
/* Then try to fetch string or dynamic array data */
- call_fetch(&tp->args[i].fetch, regs, dl);
+ call_fetch(&tp->args[i].fetch, regs, dl, priv);
/* Reduce maximum length */
end += get_rloc_len(*dl);
maxlen -= get_rloc_len(*dl);
@@ -347,7 +348,7 @@ store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
} else
/* Just fetching data normally */
call_fetch(&tp->args[i].fetch, regs,
- data + tp->args[i].offset);
+ data + tp->args[i].offset, priv);
}
}

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c32f8f2ddc11..b41e11621aed 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -546,7 +546,7 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
int cpu;
struct ftrace_event_call *call = &tu->p.call;

- dsize = __get_data_size(&tu->p, regs);
+ dsize = __get_data_size(&tu->p, regs, NULL);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));

if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->p.size + dsize > PAGE_SIZE))
@@ -561,7 +561,7 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
* so the mutex makes sure we have sole access to it.
*/
mutex_lock(mutex);
- store_trace_args(esize, &tu->p, regs, arg_buf, dsize);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize, NULL);

size = esize + tu->p.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
@@ -823,7 +823,7 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
int cpu;
int rctx;

- dsize = __get_data_size(&tu->p, regs);
+ dsize = __get_data_size(&tu->p, regs, NULL);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));

if (WARN_ON_ONCE(!uprobe_cpu_buffer))
@@ -843,7 +843,7 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
* so the mutex makes sure we have sole access to it.
*/
mutex_lock(mutex);
- store_trace_args(esize, &tu->p, regs, arg_buf, dsize);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize, NULL);

preempt_disable();
head = this_cpu_ptr(call->perf_events);
--
1.7.11.7

2013-10-29 06:54:50

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 12/13] tracing/uprobes: Add more fetch functions

From: Namhyung Kim <[email protected]>

Implement uprobe-specific stack and memory fetch functions and add
them to the uprobes_fetch_type_table. Other fetch fucntions will be
shared with kprobes.

Original-patch-by: Hyeoncheol Lee <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_probe.c | 9 ++-
kernel/trace/trace_probe.h | 1 +
kernel/trace/trace_uprobe.c | 188 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 192 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index eaee44d5d9d1..70cd3bfde5a6 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -101,6 +101,10 @@ struct deref_fetch_param {
fetch_func_t fetch_size;
};

+/*
+ * For uprobes, it'll get a vaddr from first call_fetch() so pass NULL
+ * as a priv on the second dprm->fetch() not to translate it to vaddr again.
+ */
#define DEFINE_FETCH_deref(type) \
__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
void *data, void *dest, void *priv) \
@@ -110,13 +114,14 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
call_fetch(&dprm->orig, regs, &addr, priv); \
if (addr) { \
addr += dprm->offset; \
- dprm->fetch(regs, (void *)addr, dest, priv); \
+ dprm->fetch(regs, (void *)addr, dest, NULL); \
} else \
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)

+/* Same as above */
__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
void *data, void *dest, void *priv)
{
@@ -126,7 +131,7 @@ __kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
call_fetch(&dprm->orig, regs, &addr, priv);
if (addr && dprm->fetch_size) {
addr += dprm->offset;
- dprm->fetch_size(regs, (void *)addr, dest, priv);
+ dprm->fetch_size(regs, (void *)addr, dest, NULL);
} else
*(string_size *)dest = 0;
}
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index fc7edf3749ef..b1e7d722c354 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -263,6 +263,7 @@ ASSIGN_FETCH_FUNC(bitfield, ftype), \
#define NR_FETCH_TYPES 10

extern const struct fetch_type kprobes_fetch_type_table[];
+extern const struct fetch_type uprobes_fetch_type_table[];

static inline __kprobes void call_fetch(struct fetch_param *fprm,
struct pt_regs *regs, void *dest, void *priv)
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index b41e11621aed..b7664f6eb356 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -530,6 +530,186 @@ static const struct file_operations uprobe_profile_ops = {
.release = seq_release,
};

+#ifdef CONFIG_STACK_GROWSUP
+static unsigned long adjust_stack_addr(unsigned long addr, unsigned n)
+{
+ return addr - (n * sizeof(long));
+}
+
+static bool within_user_stack(struct vm_area_struct *vma, unsigned long addr,
+ unsigned int n)
+{
+ return vma->vm_start <= adjust_stack_addr(addr, n);
+}
+#else
+static unsigned long adjust_stack_addr(unsigned long addr, unsigned n)
+{
+ return addr + (n * sizeof(long));
+}
+
+static bool within_user_stack(struct vm_area_struct *vma, unsigned long addr,
+ unsigned int n)
+{
+ return vma->vm_end >= adjust_stack_addr(addr, n);
+}
+#endif
+
+static unsigned long get_user_stack_nth(struct pt_regs *regs, unsigned int n)
+{
+ struct vm_area_struct *vma;
+ unsigned long addr = user_stack_pointer(regs);
+ bool valid = false;
+ unsigned long ret = 0;
+
+ down_read(&current->mm->mmap_sem);
+ vma = find_vma(current->mm, addr);
+ if (vma && vma->vm_start <= addr) {
+ if (within_user_stack(vma, addr, n))
+ valid = true;
+ }
+ up_read(&current->mm->mmap_sem);
+
+ addr = adjust_stack_addr(addr, n);
+
+ if (valid && copy_from_user(&ret, (void __force __user *)addr,
+ sizeof(ret)) == 0)
+ return ret;
+ return 0;
+}
+
+static unsigned long offset_to_vaddr(struct vm_area_struct *vma,
+ unsigned long offset)
+{
+ return vma->vm_start + offset - ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
+}
+
+static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe *tu)
+{
+ unsigned long pgoff = addr >> PAGE_SHIFT;
+ struct vm_area_struct *vma;
+ struct address_space *mapping;
+ unsigned long vaddr = 0;
+
+ if (tu == NULL) {
+ /* A NULL tu means that we already got the vaddr */
+ return (void __force __user *) addr;
+ }
+
+ mapping = tu->inode->i_mapping;
+
+ mutex_lock(&mapping->i_mmap_mutex);
+ vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+ if (vma->vm_mm != current->mm)
+ continue;
+ if (!(vma->vm_flags & VM_READ))
+ continue;
+
+ vaddr = offset_to_vaddr(vma, addr);
+ break;
+ }
+ mutex_unlock(&mapping->i_mmap_mutex);
+
+ WARN_ON_ONCE(vaddr == 0);
+ return (void __force __user *) vaddr;
+}
+
+/*
+ * uprobes-specific fetch functions
+ */
+#define DEFINE_FETCH_stack(type) \
+static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+ void *offset, void *dest, void *priv) \
+{ \
+ *(type *)dest = (type)get_user_stack_nth(regs, \
+ (unsigned int)((unsigned long)offset)); \
+}
+DEFINE_BASIC_FETCH_FUNCS(stack)
+/* No string on the stack entry */
+#define fetch_stack_string NULL
+#define fetch_stack_string_size NULL
+
+#define DEFINE_FETCH_memory(type) \
+static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+ void *addr, void *dest, void *priv) \
+{ \
+ type retval; \
+ void __user *uaddr = get_user_vaddr((unsigned long)addr, priv); \
+ \
+ if (copy_from_user(&retval, uaddr, sizeof(type))) \
+ *(type *)dest = 0; \
+ else \
+ *(type *)dest = retval; \
+}
+DEFINE_BASIC_FETCH_FUNCS(memory)
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
+ * length and relative data location.
+ */
+static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest, void *priv)
+{
+ long ret;
+ u32 rloc = *(u32 *)dest;
+ int maxlen = get_rloc_len(rloc);
+ u8 *dst = get_rloc_data(dest);
+ void __user *vaddr = get_user_vaddr((unsigned long)addr, priv);
+ void __user *src = vaddr;
+
+ if (!maxlen)
+ return;
+
+ do {
+ ret = copy_from_user(dst, src, sizeof(*dst));
+ dst++;
+ src++;
+ } while (dst[-1] && ret == 0 && (src - vaddr) < maxlen);
+
+ if (ret < 0) { /* Failed to fetch string */
+ ((u8 *)get_rloc_data(dest))[0] = '\0';
+ *(u32 *)dest = make_data_rloc(0, get_rloc_offs(rloc));
+ } else {
+ *(u32 *)dest = make_data_rloc(src - vaddr,
+ get_rloc_offs(rloc));
+ }
+}
+
+/* Return the length of string -- including null terminal byte */
+static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest, void *priv)
+{
+ int ret, len = 0;
+ u8 c;
+ void __user *vaddr = get_user_vaddr((unsigned long)addr, priv);
+
+ do {
+ ret = __copy_from_user_inatomic(&c, vaddr + len, 1);
+ len++;
+ } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+ if (ret < 0) /* Failed to check the length */
+ *(u32 *)dest = 0;
+ else
+ *(u32 *)dest = len;
+}
+
+/* Fetch type information table */
+const struct fetch_type uprobes_fetch_type_table[] = {
+ /* Special types */
+ [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
+ sizeof(u32), 1, "__data_loc char[]"),
+ [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
+ string_size, sizeof(u32), 0, "u32"),
+ /* Basic types */
+ ASSIGN_FETCH_TYPE(u8, u8, 0),
+ ASSIGN_FETCH_TYPE(u16, u16, 0),
+ ASSIGN_FETCH_TYPE(u32, u32, 0),
+ ASSIGN_FETCH_TYPE(u64, u64, 0),
+ ASSIGN_FETCH_TYPE(s8, u8, 1),
+ ASSIGN_FETCH_TYPE(s16, u16, 1),
+ ASSIGN_FETCH_TYPE(s32, u32, 1),
+ ASSIGN_FETCH_TYPE(s64, u64, 1),
+};
+
static atomic_t uprobe_buffer_ref = ATOMIC_INIT(0);
static void __percpu *uprobe_cpu_buffer;
static DEFINE_PER_CPU(struct mutex, uprobe_cpu_mutex);
@@ -546,7 +726,7 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
int cpu;
struct ftrace_event_call *call = &tu->p.call;

- dsize = __get_data_size(&tu->p, regs, NULL);
+ dsize = __get_data_size(&tu->p, regs, tu);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));

if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->p.size + dsize > PAGE_SIZE))
@@ -561,7 +741,7 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
* so the mutex makes sure we have sole access to it.
*/
mutex_lock(mutex);
- store_trace_args(esize, &tu->p, regs, arg_buf, dsize, NULL);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize, tu);

size = esize + tu->p.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
@@ -823,7 +1003,7 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
int cpu;
int rctx;

- dsize = __get_data_size(&tu->p, regs, NULL);
+ dsize = __get_data_size(&tu->p, regs, tu);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));

if (WARN_ON_ONCE(!uprobe_cpu_buffer))
@@ -843,7 +1023,7 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
* so the mutex makes sure we have sole access to it.
*/
mutex_lock(mutex);
- store_trace_args(esize, &tu->p, regs, arg_buf, dsize, NULL);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize, tu);

preempt_disable();
head = this_cpu_ptr(call->perf_events);
--
1.7.11.7

2013-10-29 06:55:32

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 08/13] tracing/kprobes: Move common functions to trace_probe.h

From: Namhyung Kim <[email protected]>

The __get_data_size() and store_trace_args() will be used by uprobes
too. Move them to a common location.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 48 ---------------------------------------------
kernel/trace/trace_probe.h | 48 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 6d33cfee9448..2a668516f0e4 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -909,54 +909,6 @@ const struct fetch_type kprobes_fetch_type_table[] = {
ASSIGN_FETCH_TYPE(s64, u64, 1),
};

-/* Sum up total data length for dynamic arraies (strings) */
-static __kprobes int __get_data_size(struct trace_probe *tp,
- struct pt_regs *regs)
-{
- int i, ret = 0;
- u32 len;
-
- for (i = 0; i < tp->nr_args; i++)
- if (unlikely(tp->args[i].fetch_size.fn)) {
- call_fetch(&tp->args[i].fetch_size, regs, &len);
- ret += len;
- }
-
- return ret;
-}
-
-/* Store the value of each argument */
-static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
- struct pt_regs *regs,
- u8 *data, int maxlen)
-{
- int i;
- u32 end = tp->size;
- u32 *dl; /* Data (relative) location */
-
- for (i = 0; i < tp->nr_args; i++) {
- if (unlikely(tp->args[i].fetch_size.fn)) {
- /*
- * First, we set the relative location and
- * maximum data length to *dl
- */
- dl = (u32 *)(data + tp->args[i].offset);
- *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
- /* Then try to fetch string or dynamic array data */
- call_fetch(&tp->args[i].fetch, regs, dl);
- /* Reduce maximum length */
- end += get_rloc_len(*dl);
- maxlen -= get_rloc_len(*dl);
- /* Trick here, convert data_rloc to data_loc */
- *dl = convert_rloc_to_loc(*dl,
- ent_size + tp->args[i].offset);
- } else
- /* Just fetching data normally */
- call_fetch(&tp->args[i].fetch, regs,
- data + tp->args[i].offset);
- }
-}
-
/* Kprobe handler */
static __kprobes void
__kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs,
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 63e5da4e3073..189a40baea98 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -302,3 +302,51 @@ extern ssize_t traceprobe_probes_write(struct file *file,
int (*createfn)(int, char**));

extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));
+
+/* Sum up total data length for dynamic arraies (strings) */
+static inline __kprobes int
+__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+{
+ int i, ret = 0;
+ u32 len;
+
+ for (i = 0; i < tp->nr_args; i++)
+ if (unlikely(tp->args[i].fetch_size.fn)) {
+ call_fetch(&tp->args[i].fetch_size, regs, &len);
+ ret += len;
+ }
+
+ return ret;
+}
+
+/* Store the value of each argument */
+static inline __kprobes void
+store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
+ u8 *data, int maxlen)
+{
+ int i;
+ u32 end = tp->size;
+ u32 *dl; /* Data (relative) location */
+
+ for (i = 0; i < tp->nr_args; i++) {
+ if (unlikely(tp->args[i].fetch_size.fn)) {
+ /*
+ * First, we set the relative location and
+ * maximum data length to *dl
+ */
+ dl = (u32 *)(data + tp->args[i].offset);
+ *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
+ /* Then try to fetch string or dynamic array data */
+ call_fetch(&tp->args[i].fetch, regs, dl);
+ /* Reduce maximum length */
+ end += get_rloc_len(*dl);
+ maxlen -= get_rloc_len(*dl);
+ /* Trick here, convert data_rloc to data_loc */
+ *dl = convert_rloc_to_loc(*dl,
+ ent_size + tp->args[i].offset);
+ } else
+ /* Just fetching data normally */
+ call_fetch(&tp->args[i].fetch, regs,
+ data + tp->args[i].offset);
+ }
+}
--
1.7.11.7

2013-10-29 06:55:30

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 09/13] tracing/kprobes: Integrate duplicate set_print_fmt()

From: Namhyung Kim <[email protected]>

The set_print_fmt() functions are implemented almost same for
[ku]probes. Move it to a common place and get rid of the duplication.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 63 +--------------------------------------------
kernel/trace/trace_probe.c | 62 ++++++++++++++++++++++++++++++++++++++++++++
kernel/trace/trace_probe.h | 2 ++
kernel/trace/trace_uprobe.c | 55 +--------------------------------------
4 files changed, 66 insertions(+), 116 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 2a668516f0e4..3159b114f215 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1133,67 +1133,6 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
return 0;
}

-static int __set_print_fmt(struct trace_kprobe *tp, char *buf, int len)
-{
- int i;
- int pos = 0;
-
- const char *fmt, *arg;
-
- if (!trace_kprobe_is_return(tp)) {
- fmt = "(%lx)";
- arg = "REC->" FIELD_STRING_IP;
- } else {
- fmt = "(%lx <- %lx)";
- arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
- }
-
- /* When len=0, we just calculate the needed length */
-#define LEN_OR_ZERO (len ? len - pos : 0)
-
- pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
-
- for (i = 0; i < tp->p.nr_args; i++) {
- pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
- tp->p.args[i].name, tp->p.args[i].type->fmt);
- }
-
- pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
-
- for (i = 0; i < tp->p.nr_args; i++) {
- if (strcmp(tp->p.args[i].type->name, "string") == 0)
- pos += snprintf(buf + pos, LEN_OR_ZERO,
- ", __get_str(%s)",
- tp->p.args[i].name);
- else
- pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
- tp->p.args[i].name);
- }
-
-#undef LEN_OR_ZERO
-
- /* return the length of print_fmt */
- return pos;
-}
-
-static int set_print_fmt(struct trace_kprobe *tp)
-{
- int len;
- char *print_fmt;
-
- /* First: called with 0 length to calculate the needed length */
- len = __set_print_fmt(tp, NULL, 0);
- print_fmt = kmalloc(len + 1, GFP_KERNEL);
- if (!print_fmt)
- return -ENOMEM;
-
- /* Second: actually write the @print_fmt */
- __set_print_fmt(tp, print_fmt, len + 1);
- tp->p.call.print_fmt = print_fmt;
-
- return 0;
-}
-
#ifdef CONFIG_PERF_EVENTS

/* Kprobe profile handler */
@@ -1344,7 +1283,7 @@ static int register_kprobe_event(struct trace_kprobe *tp)
call->event.funcs = &kprobe_funcs;
call->class->define_fields = kprobe_event_define_fields;
}
- if (set_print_fmt(tp) < 0)
+ if (set_print_fmt(&tp->p, trace_kprobe_is_return(tp)) < 0)
return -ENOMEM;
ret = register_ftrace_event(&call->event);
if (!ret) {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index b7b8bda02d6e..1ab83d4c7775 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -629,3 +629,65 @@ out:

return ret;
}
+
+static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
+ bool is_return)
+{
+ int i;
+ int pos = 0;
+
+ const char *fmt, *arg;
+
+ if (!is_return) {
+ fmt = "(%lx)";
+ arg = "REC->" FIELD_STRING_IP;
+ } else {
+ fmt = "(%lx <- %lx)";
+ arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
+ }
+
+ /* When len=0, we just calculate the needed length */
+#define LEN_OR_ZERO (len ? len - pos : 0)
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
+
+ for (i = 0; i < tp->nr_args; i++) {
+ pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
+ tp->args[i].name, tp->args[i].type->fmt);
+ }
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
+
+ for (i = 0; i < tp->nr_args; i++) {
+ if (strcmp(tp->args[i].type->name, "string") == 0)
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ ", __get_str(%s)",
+ tp->args[i].name);
+ else
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
+ tp->args[i].name);
+ }
+
+#undef LEN_OR_ZERO
+
+ /* return the length of print_fmt */
+ return pos;
+}
+
+int set_print_fmt(struct trace_probe *tp, bool is_return)
+{
+ int len;
+ char *print_fmt;
+
+ /* First: called with 0 length to calculate the needed length */
+ len = __set_print_fmt(tp, NULL, 0, is_return);
+ print_fmt = kmalloc(len + 1, GFP_KERNEL);
+ if (!print_fmt)
+ return -ENOMEM;
+
+ /* Second: actually write the @print_fmt */
+ __set_print_fmt(tp, print_fmt, len + 1, is_return);
+ tp->call.print_fmt = print_fmt;
+
+ return 0;
+}
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 189a40baea98..325989f24dbf 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -350,3 +350,5 @@ store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
data + tp->args[i].offset);
}
}
+
+extern int set_print_fmt(struct trace_probe *tp, bool is_return);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index abb95529d851..9f2d12d2311d 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -681,59 +681,6 @@ static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
return 0;
}

-#define LEN_OR_ZERO (len ? len - pos : 0)
-static int __set_print_fmt(struct trace_uprobe *tu, char *buf, int len)
-{
- const char *fmt, *arg;
- int i;
- int pos = 0;
-
- if (is_ret_probe(tu)) {
- fmt = "(%lx <- %lx)";
- arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
- } else {
- fmt = "(%lx)";
- arg = "REC->" FIELD_STRING_IP;
- }
-
- /* When len=0, we just calculate the needed length */
-
- pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
-
- for (i = 0; i < tu->p.nr_args; i++) {
- pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
- tu->p.args[i].name, tu->p.args[i].type->fmt);
- }
-
- pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
-
- for (i = 0; i < tu->p.nr_args; i++) {
- pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
- tu->p.args[i].name);
- }
-
- return pos; /* return the length of print_fmt */
-}
-#undef LEN_OR_ZERO
-
-static int set_print_fmt(struct trace_uprobe *tu)
-{
- char *print_fmt;
- int len;
-
- /* First: called with 0 length to calculate the needed length */
- len = __set_print_fmt(tu, NULL, 0);
- print_fmt = kmalloc(len + 1, GFP_KERNEL);
- if (!print_fmt)
- return -ENOMEM;
-
- /* Second: actually write the @print_fmt */
- __set_print_fmt(tu, print_fmt, len + 1);
- tu->p.call.print_fmt = print_fmt;
-
- return 0;
-}
-
#ifdef CONFIG_PERF_EVENTS
static bool
__uprobe_perf_filter(struct trace_uprobe_filter *filter, struct mm_struct *mm)
@@ -965,7 +912,7 @@ static int register_uprobe_event(struct trace_uprobe *tu)
call->event.funcs = &uprobe_funcs;
call->class->define_fields = uprobe_event_define_fields;

- if (set_print_fmt(tu) < 0)
+ if (set_print_fmt(&tu->p, is_ret_probe(tu)) < 0)
return -ENOMEM;

ret = register_ftrace_event(&call->event);
--
1.7.11.7

2013-10-29 06:55:28

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

From: Namhyung Kim <[email protected]>

Fetching from user space should be done in a non-atomic context. So
use a per-cpu buffer and copy its content to the ring buffer
atomically. Note that we can migrate during accessing user memory
thus use a per-cpu mutex to protect concurrent accesses.

This is needed since we'll be able to fetch args from an user memory
which can be swapped out. Before that uprobes could fetch args from
registers only which saved in a kernel space.

While at it, use __get_data_size() and store_trace_args() to reduce
code duplication.

Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_uprobe.c | 98 +++++++++++++++++++++++++++++++++++++--------
1 file changed, 82 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 9f2d12d2311d..c32f8f2ddc11 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -530,21 +530,44 @@ static const struct file_operations uprobe_profile_ops = {
.release = seq_release,
};

+static atomic_t uprobe_buffer_ref = ATOMIC_INIT(0);
+static void __percpu *uprobe_cpu_buffer;
+static DEFINE_PER_CPU(struct mutex, uprobe_cpu_mutex);
+
static void uprobe_trace_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
{
struct uprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
- void *data;
- int size, i;
+ struct mutex *mutex;
+ void *data, *arg_buf;
+ int size, dsize, esize;
+ int cpu;
struct ftrace_event_call *call = &tu->p.call;

- size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+ dsize = __get_data_size(&tu->p, regs);
+ esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+
+ if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->p.size + dsize > PAGE_SIZE))
+ return;
+
+ cpu = raw_smp_processor_id();
+ mutex = &per_cpu(uprobe_cpu_mutex, cpu);
+ arg_buf = per_cpu_ptr(uprobe_cpu_buffer, cpu);
+
+ /*
+ * Use per-cpu buffers for fastest access, but we might migrate
+ * so the mutex makes sure we have sole access to it.
+ */
+ mutex_lock(mutex);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize);
+
+ size = esize + tu->p.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
- size + tu->p.size, 0, 0);
+ size, 0, 0);
if (!event)
- return;
+ goto out;

entry = ring_buffer_event_data(event);
if (is_ret_probe(tu)) {
@@ -556,13 +579,13 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}

- for (i = 0; i < tu->p.nr_args; i++) {
- call_fetch(&tu->p.args[i].fetch, regs,
- data + tu->p.args[i].offset);
- }
+ memcpy(data, arg_buf, tu->p.size + dsize);

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit(buffer, event, 0, 0);
+
+out:
+ mutex_unlock(mutex);
}

/* uprobe handler */
@@ -630,6 +653,19 @@ probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
if (trace_probe_is_enabled(&tu->p))
return -EINTR;

+ if (atomic_inc_return(&uprobe_buffer_ref) == 1) {
+ int cpu;
+
+ uprobe_cpu_buffer = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
+ if (uprobe_cpu_buffer == NULL) {
+ atomic_dec(&uprobe_buffer_ref);
+ return -ENOMEM;
+ }
+
+ for_each_possible_cpu(cpu)
+ mutex_init(&per_cpu(uprobe_cpu_mutex, cpu));
+ }
+
WARN_ON(!uprobe_filter_is_empty(&tu->filter));

tu->p.flags |= flag;
@@ -646,6 +682,11 @@ static void probe_event_disable(struct trace_uprobe *tu, int flag)
if (!trace_probe_is_enabled(&tu->p))
return;

+ if (atomic_dec_and_test(&uprobe_buffer_ref)) {
+ free_percpu(uprobe_cpu_buffer);
+ uprobe_cpu_buffer = NULL;
+ }
+
WARN_ON(!uprobe_filter_is_empty(&tu->filter));

uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
@@ -776,11 +817,33 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
struct ftrace_event_call *call = &tu->p.call;
struct uprobe_trace_entry_head *entry;
struct hlist_head *head;
- void *data;
- int size, rctx, i;
+ struct mutex *mutex;
+ void *data, *arg_buf;
+ int size, dsize, esize;
+ int cpu;
+ int rctx;

- size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
- size = ALIGN(size + tu->p.size + sizeof(u32), sizeof(u64)) - sizeof(u32);
+ dsize = __get_data_size(&tu->p, regs);
+ esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+
+ if (WARN_ON_ONCE(!uprobe_cpu_buffer))
+ return;
+
+ size = esize + tu->p.size + dsize;
+ size = ALIGN(size + sizeof(u32), sizeof(u64)) - sizeof(u32);
+ if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE, "profile buffer not large enough"))
+ return;
+
+ cpu = raw_smp_processor_id();
+ mutex = &per_cpu(uprobe_cpu_mutex, cpu);
+ arg_buf = per_cpu_ptr(uprobe_cpu_buffer, cpu);
+
+ /*
+ * Use per-cpu buffers for fastest access, but we might migrate
+ * so the mutex makes sure we have sole access to it.
+ */
+ mutex_lock(mutex);
+ store_trace_args(esize, &tu->p, regs, arg_buf, dsize);

preempt_disable();
head = this_cpu_ptr(call->perf_events);
@@ -800,15 +863,18 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}

- for (i = 0; i < tu->p.nr_args; i++) {
- struct probe_arg *parg = &tu->p.args[i];
+ memcpy(data, arg_buf, tu->p.size + dsize);
+
+ if (size - esize > tu->p.size + dsize) {
+ int len = tu->p.size + dsize;

- call_fetch(&parg->fetch, regs, data + parg->offset);
+ memset(data + len, 0, size - esize - len);
}

perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
out:
preempt_enable();
+ mutex_unlock(mutex);
}

/* uprobe profile handler */
--
1.7.11.7

2013-10-29 06:56:31

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 07/13] tracing/uprobes: Convert to struct trace_probe

From: Namhyung Kim <[email protected]>

Convert struct trace_uprobe to make use of the common trace_probe
structure.

Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_uprobe.c | 151 ++++++++++++++++++++++----------------------
1 file changed, 75 insertions(+), 76 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index a415c5867ec5..abb95529d851 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -51,22 +51,17 @@ struct trace_uprobe_filter {
*/
struct trace_uprobe {
struct list_head list;
- struct ftrace_event_class class;
- struct ftrace_event_call call;
struct trace_uprobe_filter filter;
struct uprobe_consumer consumer;
struct inode *inode;
char *filename;
unsigned long offset;
unsigned long nhit;
- unsigned int flags; /* For TP_FLAG_* */
- ssize_t size; /* trace entry size */
- unsigned int nr_args;
- struct probe_arg args[];
+ struct trace_probe p;
};

-#define SIZEOF_TRACE_UPROBE(n) \
- (offsetof(struct trace_uprobe, args) + \
+#define SIZEOF_TRACE_UPROBE(n) \
+ (offsetof(struct trace_uprobe, p.args) + \
(sizeof(struct probe_arg) * (n)))

static int register_uprobe_event(struct trace_uprobe *tu);
@@ -114,13 +109,13 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
if (!tu)
return ERR_PTR(-ENOMEM);

- tu->call.class = &tu->class;
- tu->call.name = kstrdup(event, GFP_KERNEL);
- if (!tu->call.name)
+ tu->p.call.class = &tu->p.class;
+ tu->p.call.name = kstrdup(event, GFP_KERNEL);
+ if (!tu->p.call.name)
goto error;

- tu->class.system = kstrdup(group, GFP_KERNEL);
- if (!tu->class.system)
+ tu->p.class.system = kstrdup(group, GFP_KERNEL);
+ if (!tu->p.class.system)
goto error;

INIT_LIST_HEAD(&tu->list);
@@ -131,7 +126,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
return tu;

error:
- kfree(tu->call.name);
+ kfree(tu->p.call.name);
kfree(tu);

return ERR_PTR(-ENOMEM);
@@ -141,12 +136,12 @@ static void free_trace_uprobe(struct trace_uprobe *tu)
{
int i;

- for (i = 0; i < tu->nr_args; i++)
- traceprobe_free_probe_arg(&tu->args[i]);
+ for (i = 0; i < tu->p.nr_args; i++)
+ traceprobe_free_probe_arg(&tu->p.args[i]);

iput(tu->inode);
- kfree(tu->call.class->system);
- kfree(tu->call.name);
+ kfree(tu->p.call.class->system);
+ kfree(tu->p.call.name);
kfree(tu->filename);
kfree(tu);
}
@@ -156,8 +151,8 @@ static struct trace_uprobe *find_probe_event(const char *event, const char *grou
struct trace_uprobe *tu;

list_for_each_entry(tu, &uprobe_list, list)
- if (strcmp(tu->call.name, event) == 0 &&
- strcmp(tu->call.class->system, group) == 0)
+ if (strcmp(tu->p.call.name, event) == 0 &&
+ strcmp(tu->p.call.class->system, group) == 0)
return tu;

return NULL;
@@ -186,7 +181,7 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
mutex_lock(&uprobe_lock);

/* register as an event */
- old_tp = find_probe_event(tu->call.name, tu->call.class->system);
+ old_tp = find_probe_event(tu->p.call.name, tu->p.call.class->system);
if (old_tp) {
/* delete old event */
ret = unregister_trace_uprobe(old_tp);
@@ -359,34 +354,36 @@ static int create_trace_uprobe(int argc, char **argv)
/* parse arguments */
ret = 0;
for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+ struct probe_arg *parg = &tu->p.args[i];
+
/* Increment count for freeing args in error case */
- tu->nr_args++;
+ tu->p.nr_args++;

/* Parse argument name */
arg = strchr(argv[i], '=');
if (arg) {
*arg++ = '\0';
- tu->args[i].name = kstrdup(argv[i], GFP_KERNEL);
+ parg->name = kstrdup(argv[i], GFP_KERNEL);
} else {
arg = argv[i];
/* If argument name is omitted, set "argN" */
snprintf(buf, MAX_EVENT_NAME_LEN, "arg%d", i + 1);
- tu->args[i].name = kstrdup(buf, GFP_KERNEL);
+ parg->name = kstrdup(buf, GFP_KERNEL);
}

- if (!tu->args[i].name) {
+ if (!parg->name) {
pr_info("Failed to allocate argument[%d] name.\n", i);
ret = -ENOMEM;
goto error;
}

- if (!is_good_name(tu->args[i].name)) {
- pr_info("Invalid argument[%d] name: %s\n", i, tu->args[i].name);
+ if (!is_good_name(parg->name)) {
+ pr_info("Invalid argument[%d] name: %s\n", i, parg->name);
ret = -EINVAL;
goto error;
}

- if (traceprobe_conflict_field_name(tu->args[i].name, tu->args, i)) {
+ if (traceprobe_conflict_field_name(parg->name, tu->p.args, i)) {
pr_info("Argument[%d] name '%s' conflicts with "
"another field.\n", i, argv[i]);
ret = -EINVAL;
@@ -394,7 +391,8 @@ static int create_trace_uprobe(int argc, char **argv)
}

/* Parse fetch argument */
- ret = traceprobe_parse_probe_arg(arg, &tu->size, &tu->args[i], false, false);
+ ret = traceprobe_parse_probe_arg(arg, &tu->p.size, parg,
+ false, false);
if (ret) {
pr_info("Parse error at argument[%d]. (%d)\n", i, ret);
goto error;
@@ -458,11 +456,11 @@ static int probes_seq_show(struct seq_file *m, void *v)
char c = is_ret_probe(tu) ? 'r' : 'p';
int i;

- seq_printf(m, "%c:%s/%s", c, tu->call.class->system, tu->call.name);
+ seq_printf(m, "%c:%s/%s", c, tu->p.call.class->system, tu->p.call.name);
seq_printf(m, " %s:0x%p", tu->filename, (void *)tu->offset);

- for (i = 0; i < tu->nr_args; i++)
- seq_printf(m, " %s=%s", tu->args[i].name, tu->args[i].comm);
+ for (i = 0; i < tu->p.nr_args; i++)
+ seq_printf(m, " %s=%s", tu->p.args[i].name, tu->p.args[i].comm);

seq_printf(m, "\n");
return 0;
@@ -508,7 +506,7 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct trace_uprobe *tu = v;

- seq_printf(m, " %s %-44s %15lu\n", tu->filename, tu->call.name, tu->nhit);
+ seq_printf(m, " %s %-44s %15lu\n", tu->filename, tu->p.call.name, tu->nhit);
return 0;
}

@@ -540,11 +538,11 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
struct ring_buffer *buffer;
void *data;
int size, i;
- struct ftrace_event_call *call = &tu->call;
+ struct ftrace_event_call *call = &tu->p.call;

size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
- size + tu->size, 0, 0);
+ size + tu->p.size, 0, 0);
if (!event)
return;

@@ -558,8 +556,10 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}

- for (i = 0; i < tu->nr_args; i++)
- call_fetch(&tu->args[i].fetch, regs, data + tu->args[i].offset);
+ for (i = 0; i < tu->p.nr_args; i++) {
+ call_fetch(&tu->p.args[i].fetch, regs,
+ data + tu->p.args[i].offset);
+ }

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit(buffer, event, 0, 0);
@@ -590,23 +590,24 @@ print_uprobe_event(struct trace_iterator *iter, int flags, struct trace_event *e
int i;

entry = (struct uprobe_trace_entry_head *)iter->ent;
- tu = container_of(event, struct trace_uprobe, call.event);
+ tu = container_of(event, struct trace_uprobe, p.call.event);

if (is_ret_probe(tu)) {
- if (!trace_seq_printf(s, "%s: (0x%lx <- 0x%lx)", tu->call.name,
+ if (!trace_seq_printf(s, "%s: (0x%lx <- 0x%lx)", tu->p.call.name,
entry->vaddr[1], entry->vaddr[0]))
goto partial;
data = DATAOF_TRACE_ENTRY(entry, true);
} else {
- if (!trace_seq_printf(s, "%s: (0x%lx)", tu->call.name,
+ if (!trace_seq_printf(s, "%s: (0x%lx)", tu->p.call.name,
entry->vaddr[0]))
goto partial;
data = DATAOF_TRACE_ENTRY(entry, false);
}

- for (i = 0; i < tu->nr_args; i++) {
- if (!tu->args[i].type->print(s, tu->args[i].name,
- data + tu->args[i].offset, entry))
+ for (i = 0; i < tu->p.nr_args; i++) {
+ struct probe_arg *parg = &tu->p.args[i];
+
+ if (!parg->type->print(s, parg->name, data + parg->offset, entry))
goto partial;
}

@@ -617,11 +618,6 @@ partial:
return TRACE_TYPE_PARTIAL_LINE;
}

-static inline bool is_trace_uprobe_enabled(struct trace_uprobe *tu)
-{
- return tu->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE);
-}
-
typedef bool (*filter_func_t)(struct uprobe_consumer *self,
enum uprobe_filter_ctx ctx,
struct mm_struct *mm);
@@ -631,29 +627,29 @@ probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
{
int ret = 0;

- if (is_trace_uprobe_enabled(tu))
+ if (trace_probe_is_enabled(&tu->p))
return -EINTR;

WARN_ON(!uprobe_filter_is_empty(&tu->filter));

- tu->flags |= flag;
+ tu->p.flags |= flag;
tu->consumer.filter = filter;
ret = uprobe_register(tu->inode, tu->offset, &tu->consumer);
if (ret)
- tu->flags &= ~flag;
+ tu->p.flags &= ~flag;

return ret;
}

static void probe_event_disable(struct trace_uprobe *tu, int flag)
{
- if (!is_trace_uprobe_enabled(tu))
+ if (!trace_probe_is_enabled(&tu->p))
return;

WARN_ON(!uprobe_filter_is_empty(&tu->filter));

uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
- tu->flags &= ~flag;
+ tu->p.flags &= ~flag;
}

static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
@@ -671,12 +667,12 @@ static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
size = SIZEOF_TRACE_ENTRY(false);
}
/* Set argument names as fields */
- for (i = 0; i < tu->nr_args; i++) {
- ret = trace_define_field(event_call, tu->args[i].type->fmttype,
- tu->args[i].name,
- size + tu->args[i].offset,
- tu->args[i].type->size,
- tu->args[i].type->is_signed,
+ for (i = 0; i < tu->p.nr_args; i++) {
+ struct probe_arg *parg = &tu->p.args[i];
+
+ ret = trace_define_field(event_call, parg->type->fmttype,
+ parg->name, size + parg->offset,
+ parg->type->size, parg->type->is_signed,
FILTER_OTHER);

if (ret)
@@ -704,16 +700,16 @@ static int __set_print_fmt(struct trace_uprobe *tu, char *buf, int len)

pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);

- for (i = 0; i < tu->nr_args; i++) {
+ for (i = 0; i < tu->p.nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
- tu->args[i].name, tu->args[i].type->fmt);
+ tu->p.args[i].name, tu->p.args[i].type->fmt);
}

pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);

- for (i = 0; i < tu->nr_args; i++) {
+ for (i = 0; i < tu->p.nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
- tu->args[i].name);
+ tu->p.args[i].name);
}

return pos; /* return the length of print_fmt */
@@ -733,7 +729,7 @@ static int set_print_fmt(struct trace_uprobe *tu)

/* Second: actually write the @print_fmt */
__set_print_fmt(tu, print_fmt, len + 1);
- tu->call.print_fmt = print_fmt;
+ tu->p.call.print_fmt = print_fmt;

return 0;
}
@@ -830,14 +826,14 @@ static bool uprobe_perf_filter(struct uprobe_consumer *uc,
static void uprobe_perf_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
{
- struct ftrace_event_call *call = &tu->call;
+ struct ftrace_event_call *call = &tu->p.call;
struct uprobe_trace_entry_head *entry;
struct hlist_head *head;
void *data;
int size, rctx, i;

size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
- size = ALIGN(size + tu->size + sizeof(u32), sizeof(u64)) - sizeof(u32);
+ size = ALIGN(size + tu->p.size + sizeof(u32), sizeof(u64)) - sizeof(u32);

preempt_disable();
head = this_cpu_ptr(call->perf_events);
@@ -857,8 +853,11 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}

- for (i = 0; i < tu->nr_args; i++)
- call_fetch(&tu->args[i].fetch, regs, data + tu->args[i].offset);
+ for (i = 0; i < tu->p.nr_args; i++) {
+ struct probe_arg *parg = &tu->p.args[i];
+
+ call_fetch(&parg->fetch, regs, data + parg->offset);
+ }

perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
out:
@@ -925,11 +924,11 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
tu = container_of(con, struct trace_uprobe, consumer);
tu->nhit++;

- if (tu->flags & TP_FLAG_TRACE)
+ if (tu->p.flags & TP_FLAG_TRACE)
ret |= uprobe_trace_func(tu, regs);

#ifdef CONFIG_PERF_EVENTS
- if (tu->flags & TP_FLAG_PROFILE)
+ if (tu->p.flags & TP_FLAG_PROFILE)
ret |= uprobe_perf_func(tu, regs);
#endif
return ret;
@@ -942,11 +941,11 @@ static int uretprobe_dispatcher(struct uprobe_consumer *con,

tu = container_of(con, struct trace_uprobe, consumer);

- if (tu->flags & TP_FLAG_TRACE)
+ if (tu->p.flags & TP_FLAG_TRACE)
uretprobe_trace_func(tu, func, regs);

#ifdef CONFIG_PERF_EVENTS
- if (tu->flags & TP_FLAG_PROFILE)
+ if (tu->p.flags & TP_FLAG_PROFILE)
uretprobe_perf_func(tu, func, regs);
#endif
return 0;
@@ -958,7 +957,7 @@ static struct trace_event_functions uprobe_funcs = {

static int register_uprobe_event(struct trace_uprobe *tu)
{
- struct ftrace_event_call *call = &tu->call;
+ struct ftrace_event_call *call = &tu->p.call;
int ret;

/* Initialize ftrace_event_call */
@@ -993,11 +992,11 @@ static int unregister_uprobe_event(struct trace_uprobe *tu)
int ret;

/* tu->event is unregistered in trace_remove_event_call() */
- ret = trace_remove_event_call(&tu->call);
+ ret = trace_remove_event_call(&tu->p.call);
if (ret)
return ret;
- kfree(tu->call.print_fmt);
- tu->call.print_fmt = NULL;
+ kfree(tu->p.call.print_fmt);
+ tu->p.call.print_fmt = NULL;
return 0;
}

--
1.7.11.7

2013-10-29 06:56:29

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 06/13] tracing/kprobes: Factor out struct trace_probe

From: Namhyung Kim <[email protected]>

There are functions that can be shared to both of kprobes and uprobes.
Separate common data structure to struct trace_probe and use it from
the shared functions.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 396 +++++++++++++++++++++-----------------------
kernel/trace/trace_probe.h | 20 +++
2 files changed, 213 insertions(+), 203 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index fdb6dec11592..6d33cfee9448 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -27,18 +27,12 @@
/**
* Kprobe event core functions
*/
-struct trace_probe {
+struct trace_kprobe {
struct list_head list;
struct kretprobe rp; /* Use rp.kp for kprobe use */
unsigned long nhit;
- unsigned int flags; /* For TP_FLAG_* */
const char *symbol; /* symbol name */
- struct ftrace_event_class class;
- struct ftrace_event_call call;
- struct list_head files;
- ssize_t size; /* trace entry size */
- unsigned int nr_args;
- struct probe_arg args[];
+ struct trace_probe p;
};

struct event_file_link {
@@ -46,56 +40,46 @@ struct event_file_link {
struct list_head list;
};

-#define SIZEOF_TRACE_PROBE(n) \
- (offsetof(struct trace_probe, args) + \
+#define SIZEOF_TRACE_PROBE(n) \
+ (offsetof(struct trace_kprobe, p.args) + \
(sizeof(struct probe_arg) * (n)))


-static __kprobes bool trace_probe_is_return(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
{
- return tp->rp.handler != NULL;
+ return tk->rp.handler != NULL;
}

-static __kprobes const char *trace_probe_symbol(struct trace_probe *tp)
+static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
{
- return tp->symbol ? tp->symbol : "unknown";
+ return tk->symbol ? tk->symbol : "unknown";
}

-static __kprobes unsigned long trace_probe_offset(struct trace_probe *tp)
+static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
{
- return tp->rp.kp.offset;
+ return tk->rp.kp.offset;
}

-static __kprobes bool trace_probe_is_enabled(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
{
- return !!(tp->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE));
+ return !!(kprobe_gone(&tk->rp.kp));
}

-static __kprobes bool trace_probe_is_registered(struct trace_probe *tp)
-{
- return !!(tp->flags & TP_FLAG_REGISTERED);
-}
-
-static __kprobes bool trace_probe_has_gone(struct trace_probe *tp)
-{
- return !!(kprobe_gone(&tp->rp.kp));
-}
-
-static __kprobes bool trace_probe_within_module(struct trace_probe *tp,
- struct module *mod)
+static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+ struct module *mod)
{
int len = strlen(mod->name);
- const char *name = trace_probe_symbol(tp);
+ const char *name = trace_kprobe_symbol(tk);
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
}

-static __kprobes bool trace_probe_is_on_module(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
{
- return !!strchr(trace_probe_symbol(tp), ':');
+ return !!strchr(trace_kprobe_symbol(tk), ':');
}

-static int register_probe_event(struct trace_probe *tp);
-static int unregister_probe_event(struct trace_probe *tp);
+static int register_kprobe_event(struct trace_kprobe *tk);
+static int unregister_kprobe_event(struct trace_kprobe *tk);

static DEFINE_MUTEX(probe_lock);
static LIST_HEAD(probe_list);
@@ -107,14 +91,14 @@ static int kretprobe_dispatcher(struct kretprobe_instance *ri,
/*
* Allocate new trace_probe and initialize it (including kprobes).
*/
-static struct trace_probe *alloc_trace_probe(const char *group,
+static struct trace_kprobe *alloc_trace_kprobe(const char *group,
const char *event,
void *addr,
const char *symbol,
unsigned long offs,
int nargs, bool is_return)
{
- struct trace_probe *tp;
+ struct trace_kprobe *tp;
int ret = -ENOMEM;

tp = kzalloc(SIZEOF_TRACE_PROBE(nargs), GFP_KERNEL);
@@ -140,9 +124,9 @@ static struct trace_probe *alloc_trace_probe(const char *group,
goto error;
}

- tp->call.class = &tp->class;
- tp->call.name = kstrdup(event, GFP_KERNEL);
- if (!tp->call.name)
+ tp->p.call.class = &tp->p.class;
+ tp->p.call.name = kstrdup(event, GFP_KERNEL);
+ if (!tp->p.call.name)
goto error;

if (!group || !is_good_name(group)) {
@@ -150,41 +134,41 @@ static struct trace_probe *alloc_trace_probe(const char *group,
goto error;
}

- tp->class.system = kstrdup(group, GFP_KERNEL);
- if (!tp->class.system)
+ tp->p.class.system = kstrdup(group, GFP_KERNEL);
+ if (!tp->p.class.system)
goto error;

INIT_LIST_HEAD(&tp->list);
- INIT_LIST_HEAD(&tp->files);
+ INIT_LIST_HEAD(&tp->p.files);
return tp;
error:
- kfree(tp->call.name);
+ kfree(tp->p.call.name);
kfree(tp->symbol);
kfree(tp);
return ERR_PTR(ret);
}

-static void free_trace_probe(struct trace_probe *tp)
+static void free_trace_kprobe(struct trace_kprobe *tp)
{
int i;

- for (i = 0; i < tp->nr_args; i++)
- traceprobe_free_probe_arg(&tp->args[i]);
+ for (i = 0; i < tp->p.nr_args; i++)
+ traceprobe_free_probe_arg(&tp->p.args[i]);

- kfree(tp->call.class->system);
- kfree(tp->call.name);
+ kfree(tp->p.call.class->system);
+ kfree(tp->p.call.name);
kfree(tp->symbol);
kfree(tp);
}

-static struct trace_probe *find_trace_probe(const char *event,
- const char *group)
+static struct trace_kprobe *find_trace_kprobe(const char *event,
+ const char *group)
{
- struct trace_probe *tp;
+ struct trace_kprobe *tp;

list_for_each_entry(tp, &probe_list, list)
- if (strcmp(tp->call.name, event) == 0 &&
- strcmp(tp->call.class->system, group) == 0)
+ if (strcmp(tp->p.call.name, event) == 0 &&
+ strcmp(tp->p.call.class->system, group) == 0)
return tp;
return NULL;
}
@@ -194,7 +178,7 @@ static struct trace_probe *find_trace_probe(const char *event,
* if the file is NULL, enable "perf" handler, or enable "trace" handler.
*/
static int
-enable_trace_probe(struct trace_probe *tp, struct ftrace_event_file *file)
+enable_trace_kprobe(struct trace_kprobe *tp, struct ftrace_event_file *file)
{
int ret = 0;

@@ -208,14 +192,14 @@ enable_trace_probe(struct trace_probe *tp, struct ftrace_event_file *file)
}

link->file = file;
- list_add_tail_rcu(&link->list, &tp->files);
+ list_add_tail_rcu(&link->list, &tp->p.files);

- tp->flags |= TP_FLAG_TRACE;
+ tp->p.flags |= TP_FLAG_TRACE;
} else
- tp->flags |= TP_FLAG_PROFILE;
+ tp->p.flags |= TP_FLAG_PROFILE;

- if (trace_probe_is_registered(tp) && !trace_probe_has_gone(tp)) {
- if (trace_probe_is_return(tp))
+ if (trace_probe_is_registered(&tp->p) && !trace_kprobe_has_gone(tp)) {
+ if (trace_kprobe_is_return(tp))
ret = enable_kretprobe(&tp->rp);
else
ret = enable_kprobe(&tp->rp.kp);
@@ -241,14 +225,14 @@ find_event_file_link(struct trace_probe *tp, struct ftrace_event_file *file)
* if the file is NULL, disable "perf" handler, or disable "trace" handler.
*/
static int
-disable_trace_probe(struct trace_probe *tp, struct ftrace_event_file *file)
+disable_trace_kprobe(struct trace_kprobe *tp, struct ftrace_event_file *file)
{
struct event_file_link *link = NULL;
int wait = 0;
int ret = 0;

if (file) {
- link = find_event_file_link(tp, file);
+ link = find_event_file_link(&tp->p, file);
if (!link) {
ret = -EINVAL;
goto out;
@@ -256,15 +240,15 @@ disable_trace_probe(struct trace_probe *tp, struct ftrace_event_file *file)

list_del_rcu(&link->list);
wait = 1;
- if (!list_empty(&tp->files))
+ if (!list_empty(&tp->p.files))
goto out;

- tp->flags &= ~TP_FLAG_TRACE;
+ tp->p.flags &= ~TP_FLAG_TRACE;
} else
- tp->flags &= ~TP_FLAG_PROFILE;
+ tp->p.flags &= ~TP_FLAG_PROFILE;

- if (!trace_probe_is_enabled(tp) && trace_probe_is_registered(tp)) {
- if (trace_probe_is_return(tp))
+ if (!trace_probe_is_enabled(&tp->p) && trace_probe_is_registered(&tp->p)) {
+ if (trace_kprobe_is_return(tp))
disable_kretprobe(&tp->rp);
else
disable_kprobe(&tp->rp.kp);
@@ -288,33 +272,33 @@ disable_trace_probe(struct trace_probe *tp, struct ftrace_event_file *file)
}

/* Internal register function - just handle k*probes and flags */
-static int __register_trace_probe(struct trace_probe *tp)
+static int __register_trace_kprobe(struct trace_kprobe *tp)
{
int i, ret;

- if (trace_probe_is_registered(tp))
+ if (trace_probe_is_registered(&tp->p))
return -EINVAL;

- for (i = 0; i < tp->nr_args; i++)
- traceprobe_update_arg(&tp->args[i]);
+ for (i = 0; i < tp->p.nr_args; i++)
+ traceprobe_update_arg(&tp->p.args[i]);

/* Set/clear disabled flag according to tp->flag */
- if (trace_probe_is_enabled(tp))
+ if (trace_probe_is_enabled(&tp->p))
tp->rp.kp.flags &= ~KPROBE_FLAG_DISABLED;
else
tp->rp.kp.flags |= KPROBE_FLAG_DISABLED;

- if (trace_probe_is_return(tp))
+ if (trace_kprobe_is_return(tp))
ret = register_kretprobe(&tp->rp);
else
ret = register_kprobe(&tp->rp.kp);

if (ret == 0)
- tp->flags |= TP_FLAG_REGISTERED;
+ tp->p.flags |= TP_FLAG_REGISTERED;
else {
pr_warning("Could not insert probe at %s+%lu: %d\n",
- trace_probe_symbol(tp), trace_probe_offset(tp), ret);
- if (ret == -ENOENT && trace_probe_is_on_module(tp)) {
+ trace_kprobe_symbol(tp), trace_kprobe_offset(tp), ret);
+ if (ret == -ENOENT && trace_kprobe_is_on_module(tp)) {
pr_warning("This probe might be able to register after"
"target module is loaded. Continue.\n");
ret = 0;
@@ -330,14 +314,14 @@ static int __register_trace_probe(struct trace_probe *tp)
}

/* Internal unregister function - just handle k*probes and flags */
-static void __unregister_trace_probe(struct trace_probe *tp)
+static void __unregister_trace_kprobe(struct trace_kprobe *tp)
{
- if (trace_probe_is_registered(tp)) {
- if (trace_probe_is_return(tp))
+ if (trace_probe_is_registered(&tp->p)) {
+ if (trace_kprobe_is_return(tp))
unregister_kretprobe(&tp->rp);
else
unregister_kprobe(&tp->rp.kp);
- tp->flags &= ~TP_FLAG_REGISTERED;
+ tp->p.flags &= ~TP_FLAG_REGISTERED;
/* Cleanup kprobe for reuse */
if (tp->rp.kp.symbol_name)
tp->rp.kp.addr = NULL;
@@ -345,50 +329,50 @@ static void __unregister_trace_probe(struct trace_probe *tp)
}

/* Unregister a trace_probe and probe_event: call with locking probe_lock */
-static int unregister_trace_probe(struct trace_probe *tp)
+static int unregister_trace_kprobe(struct trace_kprobe *tp)
{
/* Enabled event can not be unregistered */
- if (trace_probe_is_enabled(tp))
+ if (trace_probe_is_enabled(&tp->p))
return -EBUSY;

/* Will fail if probe is being used by ftrace or perf */
- if (unregister_probe_event(tp))
+ if (unregister_kprobe_event(tp))
return -EBUSY;

- __unregister_trace_probe(tp);
+ __unregister_trace_kprobe(tp);
list_del(&tp->list);

return 0;
}

/* Register a trace_probe and probe_event */
-static int register_trace_probe(struct trace_probe *tp)
+static int register_trace_kprobe(struct trace_kprobe *tp)
{
- struct trace_probe *old_tp;
+ struct trace_kprobe *old_tp;
int ret;

mutex_lock(&probe_lock);

/* Delete old (same name) event if exist */
- old_tp = find_trace_probe(tp->call.name, tp->call.class->system);
+ old_tp = find_trace_kprobe(tp->p.call.name, tp->p.call.class->system);
if (old_tp) {
- ret = unregister_trace_probe(old_tp);
+ ret = unregister_trace_kprobe(old_tp);
if (ret < 0)
goto end;
- free_trace_probe(old_tp);
+ free_trace_kprobe(old_tp);
}

/* Register new event */
- ret = register_probe_event(tp);
+ ret = register_kprobe_event(tp);
if (ret) {
pr_warning("Failed to register probe event(%d)\n", ret);
goto end;
}

/* Register k*probe */
- ret = __register_trace_probe(tp);
+ ret = __register_trace_kprobe(tp);
if (ret < 0)
- unregister_probe_event(tp);
+ unregister_kprobe_event(tp);
else
list_add_tail(&tp->list, &probe_list);

@@ -398,11 +382,11 @@ end:
}

/* Module notifier call back, checking event on the module */
-static int trace_probe_module_callback(struct notifier_block *nb,
+static int trace_kprobe_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
{
struct module *mod = data;
- struct trace_probe *tp;
+ struct trace_kprobe *tp;
int ret;

if (val != MODULE_STATE_COMING)
@@ -411,14 +395,14 @@ static int trace_probe_module_callback(struct notifier_block *nb,
/* Update probes on coming module */
mutex_lock(&probe_lock);
list_for_each_entry(tp, &probe_list, list) {
- if (trace_probe_within_module(tp, mod)) {
+ if (trace_kprobe_within_module(tp, mod)) {
/* Don't need to check busy - this should have gone. */
- __unregister_trace_probe(tp);
- ret = __register_trace_probe(tp);
+ __unregister_trace_kprobe(tp);
+ ret = __register_trace_kprobe(tp);
if (ret)
pr_warning("Failed to re-register probe %s on"
"%s: %d\n",
- tp->call.name, mod->name, ret);
+ tp->p.call.name, mod->name, ret);
}
}
mutex_unlock(&probe_lock);
@@ -426,12 +410,12 @@ static int trace_probe_module_callback(struct notifier_block *nb,
return NOTIFY_DONE;
}

-static struct notifier_block trace_probe_module_nb = {
- .notifier_call = trace_probe_module_callback,
+static struct notifier_block trace_kprobe_module_nb = {
+ .notifier_call = trace_kprobe_module_callback,
.priority = 1 /* Invoked after kprobe module callback */
};

-static int create_trace_probe(int argc, char **argv)
+static int create_trace_kprobe(int argc, char **argv)
{
/*
* Argument syntax:
@@ -451,7 +435,7 @@ static int create_trace_probe(int argc, char **argv)
* Type of args:
* FETCHARG:TYPE : use TYPE instead of unsigned long.
*/
- struct trace_probe *tp;
+ struct trace_kprobe *tp;
int i, ret = 0;
bool is_return = false, is_delete = false;
char *symbol = NULL, *event = NULL, *group = NULL;
@@ -498,16 +482,16 @@ static int create_trace_probe(int argc, char **argv)
return -EINVAL;
}
mutex_lock(&probe_lock);
- tp = find_trace_probe(event, group);
+ tp = find_trace_kprobe(event, group);
if (!tp) {
mutex_unlock(&probe_lock);
pr_info("Event %s/%s doesn't exist.\n", group, event);
return -ENOENT;
}
/* delete an event */
- ret = unregister_trace_probe(tp);
+ ret = unregister_trace_kprobe(tp);
if (ret == 0)
- free_trace_probe(tp);
+ free_trace_kprobe(tp);
mutex_unlock(&probe_lock);
return ret;
}
@@ -554,7 +538,7 @@ static int create_trace_probe(int argc, char **argv)
is_return ? 'r' : 'p', addr);
event = buf;
}
- tp = alloc_trace_probe(group, event, addr, symbol, offset, argc,
+ tp = alloc_trace_kprobe(group, event, addr, symbol, offset, argc,
is_return);
if (IS_ERR(tp)) {
pr_info("Failed to allocate trace_probe.(%d)\n",
@@ -565,36 +549,38 @@ static int create_trace_probe(int argc, char **argv)
/* parse arguments */
ret = 0;
for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+ struct probe_arg *parg = &tp->p.args[i];
+
/* Increment count for freeing args in error case */
- tp->nr_args++;
+ tp->p.nr_args++;

/* Parse argument name */
arg = strchr(argv[i], '=');
if (arg) {
*arg++ = '\0';
- tp->args[i].name = kstrdup(argv[i], GFP_KERNEL);
+ parg->name = kstrdup(argv[i], GFP_KERNEL);
} else {
arg = argv[i];
/* If argument name is omitted, set "argN" */
snprintf(buf, MAX_EVENT_NAME_LEN, "arg%d", i + 1);
- tp->args[i].name = kstrdup(buf, GFP_KERNEL);
+ parg->name = kstrdup(buf, GFP_KERNEL);
}

- if (!tp->args[i].name) {
+ if (!parg->name) {
pr_info("Failed to allocate argument[%d] name.\n", i);
ret = -ENOMEM;
goto error;
}

- if (!is_good_name(tp->args[i].name)) {
+ if (!is_good_name(parg->name)) {
pr_info("Invalid argument[%d] name: %s\n",
- i, tp->args[i].name);
+ i, parg->name);
ret = -EINVAL;
goto error;
}

- if (traceprobe_conflict_field_name(tp->args[i].name,
- tp->args, i)) {
+ if (traceprobe_conflict_field_name(parg->name,
+ tp->p.args, i)) {
pr_info("Argument[%d] name '%s' conflicts with "
"another field.\n", i, argv[i]);
ret = -EINVAL;
@@ -602,7 +588,7 @@ static int create_trace_probe(int argc, char **argv)
}

/* Parse fetch argument */
- ret = traceprobe_parse_probe_arg(arg, &tp->size, &tp->args[i],
+ ret = traceprobe_parse_probe_arg(arg, &tp->p.size, parg,
is_return, true);
if (ret) {
pr_info("Parse error at argument[%d]. (%d)\n", i, ret);
@@ -610,35 +596,35 @@ static int create_trace_probe(int argc, char **argv)
}
}

- ret = register_trace_probe(tp);
+ ret = register_trace_kprobe(tp);
if (ret)
goto error;
return 0;

error:
- free_trace_probe(tp);
+ free_trace_kprobe(tp);
return ret;
}

-static int release_all_trace_probes(void)
+static int release_all_trace_kprobes(void)
{
- struct trace_probe *tp;
+ struct trace_kprobe *tp;
int ret = 0;

mutex_lock(&probe_lock);
/* Ensure no probe is in use. */
list_for_each_entry(tp, &probe_list, list)
- if (trace_probe_is_enabled(tp)) {
+ if (trace_probe_is_enabled(&tp->p)) {
ret = -EBUSY;
goto end;
}
/* TODO: Use batch unregistration */
while (!list_empty(&probe_list)) {
- tp = list_entry(probe_list.next, struct trace_probe, list);
- ret = unregister_trace_probe(tp);
+ tp = list_entry(probe_list.next, struct trace_kprobe, list);
+ ret = unregister_trace_kprobe(tp);
if (ret)
goto end;
- free_trace_probe(tp);
+ free_trace_kprobe(tp);
}

end:
@@ -666,22 +652,22 @@ static void probes_seq_stop(struct seq_file *m, void *v)

static int probes_seq_show(struct seq_file *m, void *v)
{
- struct trace_probe *tp = v;
+ struct trace_kprobe *tp = v;
int i;

- seq_printf(m, "%c", trace_probe_is_return(tp) ? 'r' : 'p');
- seq_printf(m, ":%s/%s", tp->call.class->system, tp->call.name);
+ seq_printf(m, "%c", trace_kprobe_is_return(tp) ? 'r' : 'p');
+ seq_printf(m, ":%s/%s", tp->p.call.class->system, tp->p.call.name);

if (!tp->symbol)
seq_printf(m, " 0x%p", tp->rp.kp.addr);
else if (tp->rp.kp.offset)
- seq_printf(m, " %s+%u", trace_probe_symbol(tp),
+ seq_printf(m, " %s+%u", trace_kprobe_symbol(tp),
tp->rp.kp.offset);
else
- seq_printf(m, " %s", trace_probe_symbol(tp));
+ seq_printf(m, " %s", trace_kprobe_symbol(tp));

- for (i = 0; i < tp->nr_args; i++)
- seq_printf(m, " %s=%s", tp->args[i].name, tp->args[i].comm);
+ for (i = 0; i < tp->p.nr_args; i++)
+ seq_printf(m, " %s=%s", tp->p.args[i].name, tp->p.args[i].comm);
seq_printf(m, "\n");

return 0;
@@ -699,7 +685,7 @@ static int probes_open(struct inode *inode, struct file *file)
int ret;

if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
- ret = release_all_trace_probes();
+ ret = release_all_trace_kprobes();
if (ret < 0)
return ret;
}
@@ -711,7 +697,7 @@ static ssize_t probes_write(struct file *file, const char __user *buffer,
size_t count, loff_t *ppos)
{
return traceprobe_probes_write(file, buffer, count, ppos,
- create_trace_probe);
+ create_trace_kprobe);
}

static const struct file_operations kprobe_events_ops = {
@@ -726,9 +712,9 @@ static const struct file_operations kprobe_events_ops = {
/* Probes profiling interfaces */
static int probes_profile_seq_show(struct seq_file *m, void *v)
{
- struct trace_probe *tp = v;
+ struct trace_kprobe *tp = v;

- seq_printf(m, " %-44s %15lu %15lu\n", tp->call.name, tp->nhit,
+ seq_printf(m, " %-44s %15lu %15lu\n", tp->p.call.name, tp->nhit,
tp->rp.kp.nmissed);

return 0;
@@ -973,7 +959,7 @@ static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,

/* Kprobe handler */
static __kprobes void
-__kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
+__kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
{
struct kprobe_trace_entry_head *entry;
@@ -981,7 +967,7 @@ __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
struct ring_buffer *buffer;
int size, dsize, pc;
unsigned long irq_flags;
- struct ftrace_event_call *call = &tp->call;
+ struct ftrace_event_call *call = &tp->p.call;

WARN_ON(call != ftrace_file->event_call);

@@ -991,8 +977,8 @@ __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
local_save_flags(irq_flags);
pc = preempt_count();

- dsize = __get_data_size(tp, regs);
- size = sizeof(*entry) + tp->size + dsize;
+ dsize = __get_data_size(&tp->p, regs);
+ size = sizeof(*entry) + tp->p.size + dsize;

event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
call->event.type,
@@ -1002,7 +988,7 @@ __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,

entry = ring_buffer_event_data(event);
entry->ip = (unsigned long)tp->rp.kp.addr;
- store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit_regs(buffer, event,
@@ -1010,17 +996,17 @@ __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
}

static __kprobes void
-kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs)
+kprobe_trace_func(struct trace_kprobe *tp, struct pt_regs *regs)
{
struct event_file_link *link;

- list_for_each_entry_rcu(link, &tp->files, list)
+ list_for_each_entry_rcu(link, &tp->p.files, list)
__kprobe_trace_func(tp, regs, link->file);
}

/* Kretprobe handler */
static __kprobes void
-__kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
+__kretprobe_trace_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
{
@@ -1029,7 +1015,7 @@ __kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
struct ring_buffer *buffer;
int size, pc, dsize;
unsigned long irq_flags;
- struct ftrace_event_call *call = &tp->call;
+ struct ftrace_event_call *call = &tp->p.call;

WARN_ON(call != ftrace_file->event_call);

@@ -1039,8 +1025,8 @@ __kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
local_save_flags(irq_flags);
pc = preempt_count();

- dsize = __get_data_size(tp, regs);
- size = sizeof(*entry) + tp->size + dsize;
+ dsize = __get_data_size(&tp->p, regs);
+ size = sizeof(*entry) + tp->p.size + dsize;

event = trace_event_buffer_lock_reserve(&buffer, ftrace_file,
call->event.type,
@@ -1051,7 +1037,7 @@ __kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
entry = ring_buffer_event_data(event);
entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);

if (!filter_current_check_discard(buffer, call, entry, event))
trace_buffer_unlock_commit_regs(buffer, event,
@@ -1059,12 +1045,12 @@ __kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
}

static __kprobes void
-kretprobe_trace_func(struct trace_probe *tp, struct kretprobe_instance *ri,
+kretprobe_trace_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
struct event_file_link *link;

- list_for_each_entry_rcu(link, &tp->files, list)
+ list_for_each_entry_rcu(link, &tp->p.files, list)
__kretprobe_trace_func(tp, ri, regs, link->file);
}

@@ -1152,16 +1138,18 @@ static int kprobe_event_define_fields(struct ftrace_event_call *event_call)
{
int ret, i;
struct kprobe_trace_entry_head field;
- struct trace_probe *tp = (struct trace_probe *)event_call->data;
+ struct trace_kprobe *tp = (struct trace_kprobe *)event_call->data;

DEFINE_FIELD(unsigned long, ip, FIELD_STRING_IP, 0);
/* Set argument names as fields */
- for (i = 0; i < tp->nr_args; i++) {
- ret = trace_define_field(event_call, tp->args[i].type->fmttype,
- tp->args[i].name,
- sizeof(field) + tp->args[i].offset,
- tp->args[i].type->size,
- tp->args[i].type->is_signed,
+ for (i = 0; i < tp->p.nr_args; i++) {
+ struct probe_arg *parg = &tp->p.args[i];
+
+ ret = trace_define_field(event_call, parg->type->fmttype,
+ parg->name,
+ sizeof(field) + parg->offset,
+ parg->type->size,
+ parg->type->is_signed,
FILTER_OTHER);
if (ret)
return ret;
@@ -1173,17 +1161,19 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
{
int ret, i;
struct kretprobe_trace_entry_head field;
- struct trace_probe *tp = (struct trace_probe *)event_call->data;
+ struct trace_kprobe *tp = (struct trace_kprobe *)event_call->data;

DEFINE_FIELD(unsigned long, func, FIELD_STRING_FUNC, 0);
DEFINE_FIELD(unsigned long, ret_ip, FIELD_STRING_RETIP, 0);
/* Set argument names as fields */
- for (i = 0; i < tp->nr_args; i++) {
- ret = trace_define_field(event_call, tp->args[i].type->fmttype,
- tp->args[i].name,
- sizeof(field) + tp->args[i].offset,
- tp->args[i].type->size,
- tp->args[i].type->is_signed,
+ for (i = 0; i < tp->p.nr_args; i++) {
+ struct probe_arg *parg = &tp->p.args[i];
+
+ ret = trace_define_field(event_call, parg->type->fmttype,
+ parg->name,
+ sizeof(field) + parg->offset,
+ parg->type->size,
+ parg->type->is_signed,
FILTER_OTHER);
if (ret)
return ret;
@@ -1191,14 +1181,14 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
return 0;
}

-static int __set_print_fmt(struct trace_probe *tp, char *buf, int len)
+static int __set_print_fmt(struct trace_kprobe *tp, char *buf, int len)
{
int i;
int pos = 0;

const char *fmt, *arg;

- if (!trace_probe_is_return(tp)) {
+ if (!trace_kprobe_is_return(tp)) {
fmt = "(%lx)";
arg = "REC->" FIELD_STRING_IP;
} else {
@@ -1211,21 +1201,21 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len)

pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);

- for (i = 0; i < tp->nr_args; i++) {
+ for (i = 0; i < tp->p.nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
- tp->args[i].name, tp->args[i].type->fmt);
+ tp->p.args[i].name, tp->p.args[i].type->fmt);
}

pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);

- for (i = 0; i < tp->nr_args; i++) {
- if (strcmp(tp->args[i].type->name, "string") == 0)
+ for (i = 0; i < tp->p.nr_args; i++) {
+ if (strcmp(tp->p.args[i].type->name, "string") == 0)
pos += snprintf(buf + pos, LEN_OR_ZERO,
", __get_str(%s)",
- tp->args[i].name);
+ tp->p.args[i].name);
else
pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
- tp->args[i].name);
+ tp->p.args[i].name);
}

#undef LEN_OR_ZERO
@@ -1234,7 +1224,7 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len)
return pos;
}

-static int set_print_fmt(struct trace_probe *tp)
+static int set_print_fmt(struct trace_kprobe *tp)
{
int len;
char *print_fmt;
@@ -1247,7 +1237,7 @@ static int set_print_fmt(struct trace_probe *tp)

/* Second: actually write the @print_fmt */
__set_print_fmt(tp, print_fmt, len + 1);
- tp->call.print_fmt = print_fmt;
+ tp->p.call.print_fmt = print_fmt;

return 0;
}
@@ -1256,9 +1246,9 @@ static int set_print_fmt(struct trace_probe *tp)

/* Kprobe profile handler */
static __kprobes void
-kprobe_perf_func(struct trace_probe *tp, struct pt_regs *regs)
+kprobe_perf_func(struct trace_kprobe *tp, struct pt_regs *regs)
{
- struct ftrace_event_call *call = &tp->call;
+ struct ftrace_event_call *call = &tp->p.call;
struct kprobe_trace_entry_head *entry;
struct hlist_head *head;
int size, __size, dsize;
@@ -1268,8 +1258,8 @@ kprobe_perf_func(struct trace_probe *tp, struct pt_regs *regs)
if (hlist_empty(head))
return;

- dsize = __get_data_size(tp, regs);
- __size = sizeof(*entry) + tp->size + dsize;
+ dsize = __get_data_size(&tp->p, regs);
+ __size = sizeof(*entry) + tp->p.size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);

@@ -1279,16 +1269,16 @@ kprobe_perf_func(struct trace_probe *tp, struct pt_regs *regs)

entry->ip = (unsigned long)tp->rp.kp.addr;
memset(&entry[1], 0, dsize);
- store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}

/* Kretprobe profile handler */
static __kprobes void
-kretprobe_perf_func(struct trace_probe *tp, struct kretprobe_instance *ri,
+kretprobe_perf_func(struct trace_kprobe *tp, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
- struct ftrace_event_call *call = &tp->call;
+ struct ftrace_event_call *call = &tp->p.call;
struct kretprobe_trace_entry_head *entry;
struct hlist_head *head;
int size, __size, dsize;
@@ -1298,8 +1288,8 @@ kretprobe_perf_func(struct trace_probe *tp, struct kretprobe_instance *ri,
if (hlist_empty(head))
return;

- dsize = __get_data_size(tp, regs);
- __size = sizeof(*entry) + tp->size + dsize;
+ dsize = __get_data_size(&tp->p, regs);
+ __size = sizeof(*entry) + tp->p.size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);

@@ -1309,7 +1299,7 @@ kretprobe_perf_func(struct trace_probe *tp, struct kretprobe_instance *ri,

entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
+ store_trace_args(sizeof(*entry), &tp->p, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
#endif /* CONFIG_PERF_EVENTS */
@@ -1324,20 +1314,20 @@ static __kprobes
int kprobe_register(struct ftrace_event_call *event,
enum trace_reg type, void *data)
{
- struct trace_probe *tp = (struct trace_probe *)event->data;
+ struct trace_kprobe *tp = (struct trace_kprobe *)event->data;
struct ftrace_event_file *file = data;

switch (type) {
case TRACE_REG_REGISTER:
- return enable_trace_probe(tp, file);
+ return enable_trace_kprobe(tp, file);
case TRACE_REG_UNREGISTER:
- return disable_trace_probe(tp, file);
+ return disable_trace_kprobe(tp, file);

#ifdef CONFIG_PERF_EVENTS
case TRACE_REG_PERF_REGISTER:
- return enable_trace_probe(tp, NULL);
+ return enable_trace_kprobe(tp, NULL);
case TRACE_REG_PERF_UNREGISTER:
- return disable_trace_probe(tp, NULL);
+ return disable_trace_kprobe(tp, NULL);
case TRACE_REG_PERF_OPEN:
case TRACE_REG_PERF_CLOSE:
case TRACE_REG_PERF_ADD:
@@ -1351,14 +1341,14 @@ int kprobe_register(struct ftrace_event_call *event,
static __kprobes
int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
{
- struct trace_probe *tp = container_of(kp, struct trace_probe, rp.kp);
+ struct trace_kprobe *tp = container_of(kp, struct trace_kprobe, rp.kp);

tp->nhit++;

- if (tp->flags & TP_FLAG_TRACE)
+ if (tp->p.flags & TP_FLAG_TRACE)
kprobe_trace_func(tp, regs);
#ifdef CONFIG_PERF_EVENTS
- if (tp->flags & TP_FLAG_PROFILE)
+ if (tp->p.flags & TP_FLAG_PROFILE)
kprobe_perf_func(tp, regs);
#endif
return 0; /* We don't tweek kernel, so just return 0 */
@@ -1367,14 +1357,14 @@ int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
static __kprobes
int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
- struct trace_probe *tp = container_of(ri->rp, struct trace_probe, rp);
+ struct trace_kprobe *tp = container_of(ri->rp, struct trace_kprobe, rp);

tp->nhit++;

- if (tp->flags & TP_FLAG_TRACE)
+ if (tp->p.flags & TP_FLAG_TRACE)
kretprobe_trace_func(tp, ri, regs);
#ifdef CONFIG_PERF_EVENTS
- if (tp->flags & TP_FLAG_PROFILE)
+ if (tp->p.flags & TP_FLAG_PROFILE)
kretprobe_perf_func(tp, ri, regs);
#endif
return 0; /* We don't tweek kernel, so just return 0 */
@@ -1388,14 +1378,14 @@ static struct trace_event_functions kprobe_funcs = {
.trace = print_kprobe_event
};

-static int register_probe_event(struct trace_probe *tp)
+static int register_kprobe_event(struct trace_kprobe *tp)
{
- struct ftrace_event_call *call = &tp->call;
+ struct ftrace_event_call *call = &tp->p.call;
int ret;

/* Initialize ftrace_event_call */
INIT_LIST_HEAD(&call->class->fields);
- if (trace_probe_is_return(tp)) {
+ if (trace_kprobe_is_return(tp)) {
call->event.funcs = &kretprobe_funcs;
call->class->define_fields = kretprobe_event_define_fields;
} else {
@@ -1421,14 +1411,14 @@ static int register_probe_event(struct trace_probe *tp)
return ret;
}

-static int unregister_probe_event(struct trace_probe *tp)
+static int unregister_kprobe_event(struct trace_kprobe *tp)
{
int ret;

/* tp->event is unregistered in trace_remove_event_call() */
- ret = trace_remove_event_call(&tp->call);
+ ret = trace_remove_event_call(&tp->p.call);
if (!ret)
- kfree(tp->call.print_fmt);
+ kfree(tp->p.call.print_fmt);
return ret;
}

@@ -1438,7 +1428,7 @@ static __init int init_kprobe_trace(void)
struct dentry *d_tracer;
struct dentry *entry;

- if (register_module_notifier(&trace_probe_module_nb))
+ if (register_module_notifier(&trace_kprobe_module_nb))
return -EINVAL;

d_tracer = tracing_init_dentry();
@@ -1478,12 +1468,12 @@ static __used int kprobe_trace_selftest_target(int a1, int a2, int a3,
}

static struct ftrace_event_file *
-find_trace_probe_file(struct trace_probe *tp, struct trace_array *tr)
+find_trace_probe_file(struct trace_kprobe *tp, struct trace_array *tr)
{
struct ftrace_event_file *file;

list_for_each_entry(file, &tr->events, list)
- if (file->event_call == &tp->call)
+ if (file->event_call == &tp->p.call)
return file;

return NULL;
@@ -1497,7 +1487,7 @@ static __init int kprobe_trace_self_tests_init(void)
{
int ret, warn = 0;
int (*target)(int, int, int, int, int, int);
- struct trace_probe *tp;
+ struct trace_kprobe *tp;
struct ftrace_event_file *file;

target = kprobe_trace_selftest_target;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 9ac7bdf607cc..63e5da4e3073 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -159,6 +159,26 @@ struct probe_arg {
const struct fetch_type *type; /* Type of this argument */
};

+struct trace_probe {
+ unsigned int flags; /* For TP_FLAG_* */
+ struct ftrace_event_class class;
+ struct ftrace_event_call call;
+ struct list_head files;
+ ssize_t size; /* trace entry size */
+ unsigned int nr_args;
+ struct probe_arg args[];
+};
+
+static inline bool trace_probe_is_enabled(struct trace_probe *tp)
+{
+ return !!(tp->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE));
+}
+
+static inline bool trace_probe_is_registered(struct trace_probe *tp)
+{
+ return !!(tp->flags & TP_FLAG_REGISTERED);
+}
+
#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type

#define DECLARE_FETCH_FUNC(method, type) \
--
1.7.11.7

2013-10-29 06:53:46

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 01/13] tracing/uprobes: Fix documentation of uprobe registration syntax

From: Namhyung Kim <[email protected]>

The uprobe syntax requires an offset after a file path not a symbol.

Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
Documentation/trace/uprobetracer.txt | 10 +++++-----
kernel/trace/trace_uprobe.c | 2 +-
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index d9c3e682312c..8f1a8b8956fc 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -19,15 +19,15 @@ user to calculate the offset of the probepoint in the object.

Synopsis of uprobe_tracer
-------------------------
- p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
- r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
- -:[GRP/]EVENT : Clear uprobe or uretprobe event
+ p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
+ r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
+ -:[GRP/]EVENT : Clear uprobe or uretprobe event

GRP : Group name. If omitted, "uprobes" is the default value.
EVENT : Event name. If omitted, the event name is generated based
- on SYMBOL+offs.
+ on PATH+OFFSET.
PATH : Path to an executable or a library.
- SYMBOL[+offs] : Symbol+offset where the probe is inserted.
+ OFFSET : Offset where the probe is inserted.

FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 272261b5f94f..a415c5867ec5 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -210,7 +210,7 @@ end:

/*
* Argument syntax:
- * - Add uprobe: p|r[:[GRP/]EVENT] PATH:SYMBOL [FETCHARGS]
+ * - Add uprobe: p|r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS]
*
* - Remove uprobe: -:[GRP/]EVENT
*/
--
1.7.11.7

2013-10-29 06:57:03

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 03/13] tracing/kprobes: Move fetch functions to trace_kprobe.c

From: Hyeoncheol Lee <[email protected]>

Move kprobes-specific fetch functions to the trace_kprobe.c file.
Also define kprobes_fetch_type_table in the .c file. This table is
shared with uprobes for now, but the uprobes will get its own table
in the later patch.

This is a preparation for supporting more fetch functions to uprobes
and no functional changes are intended.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Hyeoncheol Lee <[email protected]>
[[email protected]: Split original patch into pieces as requested]
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 169 +++++++++++++++++++++++++
kernel/trace/trace_probe.c | 299 +++++---------------------------------------
kernel/trace/trace_probe.h | 132 +++++++++++++++++++
3 files changed, 335 insertions(+), 265 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 243f6834d026..1eff166990c2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -754,6 +754,175 @@ static const struct file_operations kprobe_profile_ops = {
.release = seq_release,
};

+/*
+ * kprobes-specific fetch functions
+ */
+#define DEFINE_FETCH_stack(type) \
+__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
+ void *offset, void *dest) \
+{ \
+ *(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
+ (unsigned int)((unsigned long)offset)); \
+}
+DEFINE_BASIC_FETCH_FUNCS(stack)
+/* No string on the stack entry */
+#define fetch_stack_string NULL
+#define fetch_stack_string_size NULL
+
+#define DEFINE_FETCH_memory(type) \
+__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+ void *addr, void *dest) \
+{ \
+ type retval; \
+ if (probe_kernel_address(addr, retval)) \
+ *(type *)dest = 0; \
+ else \
+ *(type *)dest = retval; \
+}
+DEFINE_BASIC_FETCH_FUNCS(memory)
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
+ * length and relative data location.
+ */
+__kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
+{
+ long ret;
+ int maxlen = get_rloc_len(*(u32 *)dest);
+ u8 *dst = get_rloc_data(dest);
+ u8 *src = addr;
+ mm_segment_t old_fs = get_fs();
+
+ if (!maxlen)
+ return;
+
+ /*
+ * Try to get string again, since the string can be changed while
+ * probing.
+ */
+ set_fs(KERNEL_DS);
+ pagefault_disable();
+
+ do
+ ret = __copy_from_user_inatomic(dst++, src++, 1);
+ while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
+
+ dst[-1] = '\0';
+ pagefault_enable();
+ set_fs(old_fs);
+
+ if (ret < 0) { /* Failed to fetch string */
+ ((u8 *)get_rloc_data(dest))[0] = '\0';
+ *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
+ } else {
+ *(u32 *)dest = make_data_rloc(src - (u8 *)addr,
+ get_rloc_offs(*(u32 *)dest));
+ }
+}
+
+/* Return the length of string -- including null terminal byte */
+__kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
+{
+ mm_segment_t old_fs;
+ int ret, len = 0;
+ u8 c;
+
+ old_fs = get_fs();
+ set_fs(KERNEL_DS);
+ pagefault_disable();
+
+ do {
+ ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
+ len++;
+ } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+ pagefault_enable();
+ set_fs(old_fs);
+
+ if (ret < 0) /* Failed to check the length */
+ *(u32 *)dest = 0;
+ else
+ *(u32 *)dest = len;
+}
+
+/* Memory fetching by symbol */
+struct symbol_cache {
+ char *symbol;
+ long offset;
+ unsigned long addr;
+};
+
+unsigned long update_symbol_cache(struct symbol_cache *sc)
+{
+ sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
+
+ if (sc->addr)
+ sc->addr += sc->offset;
+
+ return sc->addr;
+}
+
+void free_symbol_cache(struct symbol_cache *sc)
+{
+ kfree(sc->symbol);
+ kfree(sc);
+}
+
+struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
+{
+ struct symbol_cache *sc;
+
+ if (!sym || strlen(sym) == 0)
+ return NULL;
+
+ sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
+ if (!sc)
+ return NULL;
+
+ sc->symbol = kstrdup(sym, GFP_KERNEL);
+ if (!sc->symbol) {
+ kfree(sc);
+ return NULL;
+ }
+ sc->offset = offset;
+ update_symbol_cache(sc);
+
+ return sc;
+}
+
+#define DEFINE_FETCH_symbol(type) \
+__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
+ void *data, void *dest) \
+{ \
+ struct symbol_cache *sc = data; \
+ if (sc->addr) \
+ fetch_memory_##type(regs, (void *)sc->addr, dest); \
+ else \
+ *(type *)dest = 0; \
+}
+DEFINE_BASIC_FETCH_FUNCS(symbol)
+DEFINE_FETCH_symbol(string)
+DEFINE_FETCH_symbol(string_size)
+
+/* Fetch type information table */
+const struct fetch_type kprobes_fetch_type_table[] = {
+ /* Special types */
+ [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
+ sizeof(u32), 1, "__data_loc char[]"),
+ [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
+ string_size, sizeof(u32), 0, "u32"),
+ /* Basic types */
+ ASSIGN_FETCH_TYPE(u8, u8, 0),
+ ASSIGN_FETCH_TYPE(u16, u16, 0),
+ ASSIGN_FETCH_TYPE(u32, u32, 0),
+ ASSIGN_FETCH_TYPE(u64, u64, 0),
+ ASSIGN_FETCH_TYPE(s8, u8, 1),
+ ASSIGN_FETCH_TYPE(s16, u16, 1),
+ ASSIGN_FETCH_TYPE(s32, u32, 1),
+ ASSIGN_FETCH_TYPE(s64, u64, 1),
+};
+
/* Sum up total data length for dynamic arraies (strings) */
static __kprobes int __get_data_size(struct trace_probe *tp,
struct pt_regs *regs)
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index b571e4de0769..41f654d24cd9 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -35,19 +35,15 @@ const char *reserved_field_names[] = {
FIELD_STRING_FUNC,
};

-/* Printing function type */
-#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
-#define PRINT_TYPE_FMT_NAME(type) print_type_format_##type
-
/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt, cast) \
-static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
+__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
const char *name, \
- void *data, void *ent)\
+ void *data, void *ent) \
{ \
return trace_seq_printf(s, " %s=" fmt, name, (cast)*(type *)data);\
} \
-static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
+const char PRINT_TYPE_FMT_NAME(type)[] = fmt;

DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "%x", unsigned char)
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned short)
@@ -63,25 +59,10 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%lx", unsigned long)
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%ld", long)
#endif

-static inline void *get_rloc_data(u32 *dl)
-{
- return (u8 *)dl + get_rloc_offs(*dl);
-}
-
-/* For data_loc conversion */
-static inline void *get_loc_data(u32 *dl, void *ent)
-{
- return (u8 *)ent + get_rloc_offs(*dl);
-}
-
-/* For defining macros, define string/string_size types */
-typedef u32 string;
-typedef u32 string_size;
-
/* Print type function for string type */
-static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
- const char *name,
- void *data, void *ent)
+__kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
+ const char *name,
+ void *data, void *ent)
{
int len = *(u32 *)data >> 16;

@@ -92,199 +73,25 @@ static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
(const char *)get_loc_data(data, ent));
}

-static const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
-
-#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
-/*
- * Define macro for basic types - we don't need to define s* types, because
- * we have to care only about bitwidth at recording time.
- */
-#define DEFINE_BASIC_FETCH_FUNCS(method) \
-DEFINE_FETCH_##method(u8) \
-DEFINE_FETCH_##method(u16) \
-DEFINE_FETCH_##method(u32) \
-DEFINE_FETCH_##method(u64)
-
-#define CHECK_FETCH_FUNCS(method, fn) \
- (((FETCH_FUNC_NAME(method, u8) == fn) || \
- (FETCH_FUNC_NAME(method, u16) == fn) || \
- (FETCH_FUNC_NAME(method, u32) == fn) || \
- (FETCH_FUNC_NAME(method, u64) == fn) || \
- (FETCH_FUNC_NAME(method, string) == fn) || \
- (FETCH_FUNC_NAME(method, string_size) == fn)) \
- && (fn != NULL))
+const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";

/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
-static __kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
+__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
(unsigned int)((unsigned long)offset)); \
}
DEFINE_BASIC_FETCH_FUNCS(reg)
-/* No string on the register */
-#define fetch_reg_string NULL
-#define fetch_reg_string_size NULL
-
-#define DEFINE_FETCH_stack(type) \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
-{ \
- *(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
- (unsigned int)((unsigned long)offset)); \
-}
-DEFINE_BASIC_FETCH_FUNCS(stack)
-/* No string on the stack entry */
-#define fetch_stack_string NULL
-#define fetch_stack_string_size NULL

#define DEFINE_FETCH_retval(type) \
-static __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
void *dummy, void *dest) \
{ \
*(type *)dest = (type)regs_return_value(regs); \
}
DEFINE_BASIC_FETCH_FUNCS(retval)
-/* No string on the retval */
-#define fetch_retval_string NULL
-#define fetch_retval_string_size NULL
-
-#define DEFINE_FETCH_memory(type) \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
- void *addr, void *dest) \
-{ \
- type retval; \
- if (probe_kernel_address(addr, retval)) \
- *(type *)dest = 0; \
- else \
- *(type *)dest = retval; \
-}
-DEFINE_BASIC_FETCH_FUNCS(memory)
-/*
- * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
- * length and relative data location.
- */
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
-{
- long ret;
- int maxlen = get_rloc_len(*(u32 *)dest);
- u8 *dst = get_rloc_data(dest);
- u8 *src = addr;
- mm_segment_t old_fs = get_fs();
-
- if (!maxlen)
- return;
-
- /*
- * Try to get string again, since the string can be changed while
- * probing.
- */
- set_fs(KERNEL_DS);
- pagefault_disable();
-
- do
- ret = __copy_from_user_inatomic(dst++, src++, 1);
- while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
-
- dst[-1] = '\0';
- pagefault_enable();
- set_fs(old_fs);
-
- if (ret < 0) { /* Failed to fetch string */
- ((u8 *)get_rloc_data(dest))[0] = '\0';
- *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
- } else {
- *(u32 *)dest = make_data_rloc(src - (u8 *)addr,
- get_rloc_offs(*(u32 *)dest));
- }
-}
-
-/* Return the length of string -- including null terminal byte */
-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
-{
- mm_segment_t old_fs;
- int ret, len = 0;
- u8 c;
-
- old_fs = get_fs();
- set_fs(KERNEL_DS);
- pagefault_disable();
-
- do {
- ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
- len++;
- } while (c && ret == 0 && len < MAX_STRING_SIZE);
-
- pagefault_enable();
- set_fs(old_fs);
-
- if (ret < 0) /* Failed to check the length */
- *(u32 *)dest = 0;
- else
- *(u32 *)dest = len;
-}
-
-/* Memory fetching by symbol */
-struct symbol_cache {
- char *symbol;
- long offset;
- unsigned long addr;
-};
-
-static unsigned long update_symbol_cache(struct symbol_cache *sc)
-{
- sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
-
- if (sc->addr)
- sc->addr += sc->offset;
-
- return sc->addr;
-}
-
-static void free_symbol_cache(struct symbol_cache *sc)
-{
- kfree(sc->symbol);
- kfree(sc);
-}
-
-static struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
-{
- struct symbol_cache *sc;
-
- if (!sym || strlen(sym) == 0)
- return NULL;
-
- sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
- if (!sc)
- return NULL;
-
- sc->symbol = kstrdup(sym, GFP_KERNEL);
- if (!sc->symbol) {
- kfree(sc);
- return NULL;
- }
- sc->offset = offset;
- update_symbol_cache(sc);
-
- return sc;
-}
-
-#define DEFINE_FETCH_symbol(type) \
-static __kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs,\
- void *data, void *dest) \
-{ \
- struct symbol_cache *sc = data; \
- if (sc->addr) \
- fetch_memory_##type(regs, (void *)sc->addr, dest); \
- else \
- *(type *)dest = 0; \
-}
-DEFINE_BASIC_FETCH_FUNCS(symbol)
-DEFINE_FETCH_symbol(string)
-DEFINE_FETCH_symbol(string_size)

/* Dereference memory access function */
struct deref_fetch_param {
@@ -293,7 +100,7 @@ struct deref_fetch_param {
};

#define DEFINE_FETCH_deref(type) \
-static __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
void *data, void *dest) \
{ \
struct deref_fetch_param *dprm = data; \
@@ -334,7 +141,7 @@ struct bitfield_fetch_param {
};

#define DEFINE_FETCH_bitfield(type) \
-static __kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs,\
void *data, void *dest) \
{ \
struct bitfield_fetch_param *bprm = data; \
@@ -348,8 +155,6 @@ static __kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs,\
}

DEFINE_BASIC_FETCH_FUNCS(bitfield)
-#define fetch_bitfield_string NULL
-#define fetch_bitfield_string_size NULL

static __kprobes void
update_bitfield_fetch_param(struct bitfield_fetch_param *data)
@@ -385,52 +190,8 @@ free_bitfield_fetch_param(struct bitfield_fetch_param *data)
#define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG)
#define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE)

-#define ASSIGN_FETCH_FUNC(method, type) \
- [FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type)
-
-#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \
- {.name = _name, \
- .size = _size, \
- .is_signed = sign, \
- .print = PRINT_TYPE_FUNC_NAME(ptype), \
- .fmt = PRINT_TYPE_FMT_NAME(ptype), \
- .fmttype = _fmttype, \
- .fetch = { \
-ASSIGN_FETCH_FUNC(reg, ftype), \
-ASSIGN_FETCH_FUNC(stack, ftype), \
-ASSIGN_FETCH_FUNC(retval, ftype), \
-ASSIGN_FETCH_FUNC(memory, ftype), \
-ASSIGN_FETCH_FUNC(symbol, ftype), \
-ASSIGN_FETCH_FUNC(deref, ftype), \
-ASSIGN_FETCH_FUNC(bitfield, ftype), \
- } \
- }
-
-#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
- __ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
-
-#define FETCH_TYPE_STRING 0
-#define FETCH_TYPE_STRSIZE 1
-
-/* Fetch type information table */
-static const struct fetch_type fetch_type_table[] = {
- /* Special types */
- [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
- sizeof(u32), 1, "__data_loc char[]"),
- [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
- string_size, sizeof(u32), 0, "u32"),
- /* Basic types */
- ASSIGN_FETCH_TYPE(u8, u8, 0),
- ASSIGN_FETCH_TYPE(u16, u16, 0),
- ASSIGN_FETCH_TYPE(u32, u32, 0),
- ASSIGN_FETCH_TYPE(u64, u64, 0),
- ASSIGN_FETCH_TYPE(s8, u8, 1),
- ASSIGN_FETCH_TYPE(s16, u16, 1),
- ASSIGN_FETCH_TYPE(s32, u32, 1),
- ASSIGN_FETCH_TYPE(s64, u64, 1),
-};
-
-static const struct fetch_type *find_fetch_type(const char *type)
+static const struct fetch_type *find_fetch_type(const char *type,
+ const struct fetch_type *ttbl)
{
int i;

@@ -451,21 +212,21 @@ static const struct fetch_type *find_fetch_type(const char *type)

switch (bs) {
case 8:
- return find_fetch_type("u8");
+ return find_fetch_type("u8", ttbl);
case 16:
- return find_fetch_type("u16");
+ return find_fetch_type("u16", ttbl);
case 32:
- return find_fetch_type("u32");
+ return find_fetch_type("u32", ttbl);
case 64:
- return find_fetch_type("u64");
+ return find_fetch_type("u64", ttbl);
default:
goto fail;
}
}

- for (i = 0; i < ARRAY_SIZE(fetch_type_table); i++)
- if (strcmp(type, fetch_type_table[i].name) == 0)
- return &fetch_type_table[i];
+ for (i = 0; i < NR_FETCH_TYPES; i++)
+ if (strcmp(type, ttbl[i].name) == 0)
+ return &ttbl[i];

fail:
return NULL;
@@ -479,16 +240,17 @@ static __kprobes void fetch_stack_address(struct pt_regs *regs,
}

static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
- fetch_func_t orig_fn)
+ fetch_func_t orig_fn,
+ const struct fetch_type *ttbl)
{
int i;

- if (type != &fetch_type_table[FETCH_TYPE_STRING])
+ if (type != &ttbl[FETCH_TYPE_STRING])
return NULL; /* Only string type needs size function */

for (i = 0; i < FETCH_MTD_END; i++)
if (type->fetch[i] == orig_fn)
- return fetch_type_table[FETCH_TYPE_STRSIZE].fetch[i];
+ return ttbl[FETCH_TYPE_STRSIZE].fetch[i];

WARN_ON(1); /* This should not happen */

@@ -561,6 +323,9 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
long offset;
char *tmp;
int ret;
+ const struct fetch_type *ttbl;
+
+ ttbl = kprobes_fetch_type_table;

ret = 0;

@@ -621,7 +386,7 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
struct deref_fetch_param *dprm;
const struct fetch_type *t2;

- t2 = find_fetch_type(NULL);
+ t2 = find_fetch_type(NULL, ttbl);
*tmp = '\0';
dprm = kzalloc(sizeof(struct deref_fetch_param), GFP_KERNEL);

@@ -692,6 +457,9 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
{
const char *t;
int ret;
+ const struct fetch_type *ttbl;
+
+ ttbl = kprobes_fetch_type_table;

if (strlen(arg) > MAX_ARGSTR_LEN) {
pr_info("Argument is too long.: %s\n", arg);
@@ -707,7 +475,7 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
arg[t - parg->comm] = '\0';
t++;
}
- parg->type = find_fetch_type(t);
+ parg->type = find_fetch_type(t, ttbl);
if (!parg->type) {
pr_info("Unsupported type: %s\n", t);
return -EINVAL;
@@ -721,7 +489,8 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,

if (ret >= 0) {
parg->fetch_size.fn = get_fetch_size_function(parg->type,
- parg->fetch.fn);
+ parg->fetch.fn,
+ ttbl);
parg->fetch_size.data = parg->fetch.data;
}

diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 5c7e09d10d74..8c62746e5419 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -81,11 +81,46 @@
*/
#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))

+static inline void *get_rloc_data(u32 *dl)
+{
+ return (u8 *)dl + get_rloc_offs(*dl);
+}
+
+/* For data_loc conversion */
+static inline void *get_loc_data(u32 *dl, void *ent)
+{
+ return (u8 *)ent + get_rloc_offs(*dl);
+}
+
+
+/* For defining macros, define string/string_size types */
+typedef u32 string;
+typedef u32 string_size;
+
/* Data fetch function type */
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
/* Printing function type */
typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, void *);

+/* Printing function type */
+#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
+#define PRINT_TYPE_FMT_NAME(type) print_type_format_##type
+
+#define DECLARE_PRINT_TYPE_FUNC(type) \
+extern int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *, const char *, \
+ void *, void *); \
+extern const char PRINT_TYPE_FMT_NAME(type)[]
+
+DECLARE_PRINT_TYPE_FUNC(u8);
+DECLARE_PRINT_TYPE_FUNC(u16);
+DECLARE_PRINT_TYPE_FUNC(u32);
+DECLARE_PRINT_TYPE_FUNC(u64);
+DECLARE_PRINT_TYPE_FUNC(s8);
+DECLARE_PRINT_TYPE_FUNC(s16);
+DECLARE_PRINT_TYPE_FUNC(s32);
+DECLARE_PRINT_TYPE_FUNC(s64);
+DECLARE_PRINT_TYPE_FUNC(string);
+
/* Fetch types */
enum {
FETCH_MTD_reg = 0,
@@ -124,12 +159,109 @@ struct probe_arg {
const struct fetch_type *type; /* Type of this argument */
};

+#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
+
+#define DECLARE_FETCH_FUNC(method, type) \
+extern void FETCH_FUNC_NAME(method, type)(struct pt_regs *, void *, void *)
+
+#define DECLARE_BASIC_FETCH_FUNCS(method) \
+DECLARE_FETCH_FUNC(method, u8); \
+DECLARE_FETCH_FUNC(method, u16); \
+DECLARE_FETCH_FUNC(method, u32); \
+DECLARE_FETCH_FUNC(method, u64)
+
+/*
+ * Declare common fetch functions for both of kprobes and uprobes
+ */
+DECLARE_BASIC_FETCH_FUNCS(reg);
+#define fetch_reg_string NULL
+#define fetch_reg_string_size NULL
+
+DECLARE_BASIC_FETCH_FUNCS(stack);
+#define fetch_stack_string NULL
+#define fetch_stack_string_size NULL
+
+DECLARE_BASIC_FETCH_FUNCS(retval);
+#define fetch_retval_string NULL
+#define fetch_retval_string_size NULL
+
+DECLARE_BASIC_FETCH_FUNCS(memory);
+DECLARE_FETCH_FUNC(memory, string);
+DECLARE_FETCH_FUNC(memory, string_size);
+
+DECLARE_BASIC_FETCH_FUNCS(symbol);
+DECLARE_FETCH_FUNC(symbol, string);
+DECLARE_FETCH_FUNC(symbol, string_size);
+
+DECLARE_BASIC_FETCH_FUNCS(deref);
+DECLARE_FETCH_FUNC(deref, string);
+DECLARE_FETCH_FUNC(deref, string_size);
+
+DECLARE_BASIC_FETCH_FUNCS(bitfield);
+#define fetch_bitfield_string NULL
+#define fetch_bitfield_string_size NULL
+
+/*
+ * Define macro for basic types - we don't need to define s* types, because
+ * we have to care only about bitwidth at recording time.
+ */
+#define DEFINE_BASIC_FETCH_FUNCS(method) \
+DEFINE_FETCH_##method(u8) \
+DEFINE_FETCH_##method(u16) \
+DEFINE_FETCH_##method(u32) \
+DEFINE_FETCH_##method(u64)
+
+#define CHECK_FETCH_FUNCS(method, fn) \
+ (((FETCH_FUNC_NAME(method, u8) == fn) || \
+ (FETCH_FUNC_NAME(method, u16) == fn) || \
+ (FETCH_FUNC_NAME(method, u32) == fn) || \
+ (FETCH_FUNC_NAME(method, u64) == fn) || \
+ (FETCH_FUNC_NAME(method, string) == fn) || \
+ (FETCH_FUNC_NAME(method, string_size) == fn)) \
+ && (fn != NULL))
+
+#define ASSIGN_FETCH_FUNC(method, type) \
+ [FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type)
+
+#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \
+ {.name = _name, \
+ .size = _size, \
+ .is_signed = sign, \
+ .print = PRINT_TYPE_FUNC_NAME(ptype), \
+ .fmt = PRINT_TYPE_FMT_NAME(ptype), \
+ .fmttype = _fmttype, \
+ .fetch = { \
+ASSIGN_FETCH_FUNC(reg, ftype), \
+ASSIGN_FETCH_FUNC(stack, ftype), \
+ASSIGN_FETCH_FUNC(retval, ftype), \
+ASSIGN_FETCH_FUNC(memory, ftype), \
+ASSIGN_FETCH_FUNC(symbol, ftype), \
+ASSIGN_FETCH_FUNC(deref, ftype), \
+ASSIGN_FETCH_FUNC(bitfield, ftype), \
+ } \
+ }
+
+#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
+ __ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
+
+#define FETCH_TYPE_STRING 0
+#define FETCH_TYPE_STRSIZE 1
+
+#define NR_FETCH_TYPES 10
+
+extern const struct fetch_type kprobes_fetch_type_table[];
+
static inline __kprobes void call_fetch(struct fetch_param *fprm,
struct pt_regs *regs, void *dest)
{
return fprm->fn(regs, fprm->data, dest);
}

+struct symbol_cache;
+unsigned long update_symbol_cache(struct symbol_cache *sc);
+void free_symbol_cache(struct symbol_cache *sc);
+struct symbol_cache *alloc_symbol_cache(const char *sym, long offset);
+
/* Check the name is good for event/group/fields */
static inline int is_good_name(const char *name)
{
--
1.7.11.7

2013-10-29 06:57:00

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 05/13] tracing/kprobes: Staticize stack and memory fetch functions

From: Namhyung Kim <[email protected]>

Those fetch functions need to be implemented differently for kprobes
and uprobes. Since the deref fetch functions don't call those
directly anymore, we can make them static and implement them
separately.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 8 ++++----
kernel/trace/trace_probe.h | 8 --------
2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1eff166990c2..fdb6dec11592 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -758,7 +758,7 @@ static const struct file_operations kprobe_profile_ops = {
* kprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
-__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
+static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
@@ -770,7 +770,7 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
#define fetch_stack_string_size NULL

#define DEFINE_FETCH_memory(type) \
-__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
void *addr, void *dest) \
{ \
type retval; \
@@ -784,7 +784,7 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
-__kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
void *addr, void *dest)
{
long ret;
@@ -821,7 +821,7 @@ __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
}

/* Return the length of string -- including null terminal byte */
-__kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
void *addr, void *dest)
{
mm_segment_t old_fs;
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 8c62746e5419..9ac7bdf607cc 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -177,18 +177,10 @@ DECLARE_BASIC_FETCH_FUNCS(reg);
#define fetch_reg_string NULL
#define fetch_reg_string_size NULL

-DECLARE_BASIC_FETCH_FUNCS(stack);
-#define fetch_stack_string NULL
-#define fetch_stack_string_size NULL
-
DECLARE_BASIC_FETCH_FUNCS(retval);
#define fetch_retval_string NULL
#define fetch_retval_string_size NULL

-DECLARE_BASIC_FETCH_FUNCS(memory);
-DECLARE_FETCH_FUNC(memory, string);
-DECLARE_FETCH_FUNC(memory, string_size);
-
DECLARE_BASIC_FETCH_FUNCS(symbol);
DECLARE_FETCH_FUNC(symbol, string);
DECLARE_FETCH_FUNC(symbol, string_size);
--
1.7.11.7

2013-10-29 06:57:43

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 04/13] tracing/kprobes: Add fetch{,_size} member into deref fetch method

From: Hyeoncheol Lee <[email protected]>

The deref fetch methods access a memory region but it assumes that
it's a kernel memory since uprobes does not support them.

Add ->fetch and ->fetch_size member in order to provide a proper
access methods for supporting uprobes.

Acked-by: Masami Hiramatsu <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: zhangwei(Jovi) <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Hyeoncheol Lee <[email protected]>
[[email protected]: Split original patch into pieces as requested]
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/trace/trace_probe.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 41f654d24cd9..b7b8bda02d6e 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -97,6 +97,8 @@ DEFINE_BASIC_FETCH_FUNCS(retval)
struct deref_fetch_param {
struct fetch_param orig;
long offset;
+ fetch_func_t fetch;
+ fetch_func_t fetch_size;
};

#define DEFINE_FETCH_deref(type) \
@@ -108,13 +110,26 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
call_fetch(&dprm->orig, regs, &addr); \
if (addr) { \
addr += dprm->offset; \
- fetch_memory_##type(regs, (void *)addr, dest); \
+ dprm->fetch(regs, (void *)addr, dest); \
} else \
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)
-DEFINE_FETCH_deref(string_size)
+
+__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
+ void *data, void *dest)
+{
+ struct deref_fetch_param *dprm = data;
+ unsigned long addr;
+
+ call_fetch(&dprm->orig, regs, &addr);
+ if (addr && dprm->fetch_size) {
+ addr += dprm->offset;
+ dprm->fetch_size(regs, (void *)addr, dest);
+ } else
+ *(string_size *)dest = 0;
+}

static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
{
@@ -394,6 +409,9 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
return -ENOMEM;

dprm->offset = offset;
+ dprm->fetch = t->fetch[FETCH_MTD_memory];
+ dprm->fetch_size = get_fetch_size_function(t,
+ dprm->fetch, ttbl);
ret = parse_probe_arg(arg, t2, &dprm->orig, is_return,
is_kprobe);
if (ret)
--
1.7.11.7

Subject: Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

(2013/10/29 15:53), Namhyung Kim wrote:
> Hello,
>
> This patchset implements memory (address), stack[N], deference,
> bitfield and retval (it needs uretprobe tho) fetch methods for
> uprobes. It's based on the previous work [1] done by Hyeoncheol Lee.
>
> Now kprobes and uprobes have their own fetch_type_tables and, in turn,
> memory and stack access methods. Other fetch methods are shared.
>
> For the dereference method, I added a new argument to fetch functions.
> It's because for uprobes it needs to know whether the given address is
> a file offset or a virtual address in an user process. For instance,
> in case of fetching from a memory directly (like @offset) it should
> convert the address (offset) to a virtual address of the process, but
> if it's a dereferencing, the given address already has the virtual
> address.
>
> To determine this in a fetch function, I passed a pointer to
> trace_uprobe for direct fetch, and passed NULL for dereference.
>
> The patch 1-2 are bug fixes and can be applied independently.

You'd better add [BUGFIX] and send those separately. ;)
But anyway, I'm OK to pull those first two (and others too).


> Please look at patch 10 that uses per-cpu buffer for accessing user
> memory as suggested by Steven. While I tried hard not to mess things
> up there might be a chance I did something horrible. It'd be great if
> you guys take a look and give comments.
>
>
> * v6 changes:
> - add more Ack's from Masami
> - fix ref count of uprobe_cpu_buffer (thanks to Jovi)
>
> * v5 changes:
> - use user_stack_pointer() instead of GET_USP()
> - fix a bug in 'stack' fetch method of uprobes
>
> * v4 changes:
> - add Ack's from Masami
> - rearrange patches to make it easy for simple fixes to be applied
> - update documentation
> - use per-cpu buffer for storing args (thanks to Steve!)
>
>
> [1] https://lkml.org/lkml/2012/11/14/84
>
> A simple example:
>
> # cat foo.c
> int glob = -1;
> char str[] = "hello uprobe.";
>
> struct foo {
> unsigned int unused: 2;
> unsigned int foo: 20;
> unsigned int bar: 10;
> } foo = {
> .foo = 5,
> };
>
> int main(int argc, char *argv[])
> {
> long local = 0x1234;
>
> return 127;
> }
>
> # gcc -o foo -g foo.c
>
> # objdump -d foo | grep -A9 -F '<main>'
> 00000000004004b0 <main>:
> 4004b0: 55 push %rbp
> 4004b1: 48 89 e5 mov %rsp,%rbp
> 4004b4: 89 7d ec mov %edi,-0x14(%rbp)
> 4004b7: 48 89 75 e0 mov %rsi,-0x20(%rbp)
> 4004bb: 48 c7 45 f8 34 12 00 movq $0x1234,-0x8(%rbp)
> 4004c2: 00
> 4004c3: b8 7f 00 00 00 mov $0x7f,%eax
> 4004c8: 5d pop %rbp
> 4004c9: c3 retq
>
> # nm foo | grep -e glob$ -e str -e foo
> 00000000006008bc D foo
> 00000000006008a8 D glob
> 00000000006008ac D str
>
> # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
> > str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
> Added new event:
> probe_foo:foo (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string
> bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe_foo:foo -aR sleep 1
>
> # perf record -e probe_foo:foo ./foo
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]
>
> # perf script | grep -v ^#
> foo 2008 [002 2199.867154: probe_foo:foo (4004c3)
> glob=-1 str="hello uprobe." bit=5 argc=1 local=1234

Nice ! :)

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-10-31 18:17:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 10/13] tracing/uprobes: Fetch args before reserving a ring buffer

On 10/29, Namhyung Kim wrote:
>
> @@ -630,6 +653,19 @@ probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
> if (trace_probe_is_enabled(&tu->p))
> return -EINTR;
>
> + if (atomic_inc_return(&uprobe_buffer_ref) == 1) {
> + int cpu;
> +
> + uprobe_cpu_buffer = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
> + if (uprobe_cpu_buffer == NULL) {
> + atomic_dec(&uprobe_buffer_ref);
> + return -ENOMEM;
> + }
> +
> + for_each_possible_cpu(cpu)
> + mutex_init(&per_cpu(uprobe_cpu_mutex, cpu));
> + }
> +
> WARN_ON(!uprobe_filter_is_empty(&tu->filter));
>
> tu->p.flags |= flag;
> @@ -646,6 +682,11 @@ static void probe_event_disable(struct trace_uprobe *tu, int flag)
> if (!trace_probe_is_enabled(&tu->p))
> return;
>
> + if (atomic_dec_and_test(&uprobe_buffer_ref)) {
> + free_percpu(uprobe_cpu_buffer);
> + uprobe_cpu_buffer = NULL;
> + }
> +
> WARN_ON(!uprobe_filter_is_empty(&tu->filter));

Do we really need atomic_t? probe_event_enable/disable is called under
event_mutex and we rely on this fact anyway.

Otherwise this logic looks racy even with atomic_t, another thread could
use the uninitialized uprobe_cpu_buffer/mutex if it registers another probe
and the handler runs before we complete the initialization, no?

Oleg.

2013-10-31 18:24:10

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 12/13] tracing/uprobes: Add more fetch functions

On 10/29, Namhyung Kim wrote:
>
> +static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe *tu)
> +{
> + unsigned long pgoff = addr >> PAGE_SHIFT;
> + struct vm_area_struct *vma;
> + struct address_space *mapping;
> + unsigned long vaddr = 0;
> +
> + if (tu == NULL) {
> + /* A NULL tu means that we already got the vaddr */
> + return (void __force __user *) addr;
> + }
> +
> + mapping = tu->inode->i_mapping;
> +
> + mutex_lock(&mapping->i_mmap_mutex);
> + vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> + if (vma->vm_mm != current->mm)
> + continue;
> + if (!(vma->vm_flags & VM_READ))
> + continue;
> +
> + vaddr = offset_to_vaddr(vma, addr);
> + break;
> + }
> + mutex_unlock(&mapping->i_mmap_mutex);
> +
> + WARN_ON_ONCE(vaddr == 0);

Hmm. But unless I missed something this "addr" passed as an argument can
be wrong? And if nothing else this or another thread can unmap the vma?

Oleg.