2006-09-20 23:50:35

by Mathieu Desnoyers

Subject: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Hi,

I thought about it a little more, and here is yet another enhanced version of
the Linux Markers. It also makes it possible to insert an inlined function
inside the markers, surrounded by an unconditional jump.

I have just written what will become the marker loader module: it works with
the "JUMP over CALL" flavor of the marker to provide dynamic loading and
unloading of a probe (using insmod/rmmod). I have included the code at the
bottom of this email, along with the Makefile and a small example module. As
the marker loader uses special flags, it has to be built out of tree.
Hint: use "make TARGET=subsys_mark1" to build marker-loader.ko.

Comments/ideas are welcome.

Mathieu

---BEGIN---

--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -1082,6 +1082,8 @@ config KPROBES
for kernel debugging, non-intrusive instrumentation and testing.
If in doubt, say "N".

+source "kernel/Kconfig.marker"
+
source "ltt/Kconfig"

endmenu
--- /dev/null
+++ b/include/linux/marker.h
@@ -0,0 +1,98 @@
+/*****************************************************************************
+ * marker.h
+ *
+ * Code markup for dynamic and static tracing.
+ *
+ * Example :
+ *
+ * MARK(subsystem_event, "%d %s", someint, somestring);
+ * Where :
+ * - subsystem is the name of your subsystem.
+ * - event is the name of the event to mark.
+ * - "%d %s" is the format string for printk.
+ * - someint is an integer.
+ * - somestring is a char *.
+ * - subsystem_event must be unique throughout the kernel!
+ *
+ * Update : Dynamically overridable function call based on marker mechanism
+ * from Frank Ch. Eigler <[email protected]>.
+ * Notes :
+ * To remove a probe :
+ * * set the function pointer back to __mark_empty_function
+ * * call synchronize_sched() to wait for all the functions to return
+ * * unload the module containing the function
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[email protected]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#ifdef CONFIG_MARK_SYMBOL
+#define MARK_SYM(name) __asm__ ("__mark_kprobe_" #name ":")
+#else
+#define MARK_SYM(name)
+#endif
+
+
+#ifdef CONFIG_MARK_JUMP_CALL
+#define MARK_JUMP_CALL_PROTOTYPE(name) \
+ static void \
+ (*__mark_##name##_call)(const char *fmt, ...) \
+ asm ("__mark_"#name"_call") = \
+ __mark_empty_function
+#define MARK_JUMP_CALL(name, format, args...) \
+ do { \
+ preempt_disable(); \
+ (void) (__mark_##name##_call(format, ## args)); \
+ preempt_enable_no_resched(); \
+ } while(0)
+#else
+#define MARK_JUMP_CALL_PROTOTYPE(name)
+#define MARK_JUMP_CALL(name, format, args...)
+#endif
+
+#ifdef CONFIG_MARK_JUMP_INLINE
+#define MARK_JUMP_INLINE(name, format, args...) \
+ (void) (__mark_##name##_inline(format, ## args))
+#else
+#define MARK_JUMP_INLINE(name, format, args...)
+#endif
+
+#define MARK_JUMP(name, format, args...) \
+ do { \
+ __label__ over_label, call_label, inline_label; \
+ volatile static void *__mark_##name##_jump_over \
+ asm ("__mark_"#name"_jump_over") = \
+ &&over_label; \
+ volatile static void *__mark_##name##_jump_call \
+ asm ("__mark_"#name"_jump_call") \
+ __attribute__((unused)) = \
+ &&call_label; \
+ volatile static void *__mark_##name##_jump_inline \
+ asm ("__mark_"#name"_jump_inline") \
+ __attribute__((unused)) = \
+ &&inline_label; \
+ MARK_JUMP_CALL_PROTOTYPE(name); \
+ goto *__mark_##name##_jump_over; \
+call_label: \
+ MARK_JUMP_CALL(name, format, ## args); \
+ goto over_label; \
+inline_label: \
+ MARK_JUMP_INLINE(name, format, ## args); \
+over_label: \
+ do {} while(0); \
+ } while(0)
+
+#define MARK(name, format, args...) \
+ do { \
+ __mark_check_format(format, ## args); \
+ MARK_SYM(name); \
+ MARK_JUMP(name, format, ## args); \
+ } while(0)
+
+static inline __attribute__ ((format (printf, 1, 2)))
+void __mark_check_format(const char *fmt, ...)
+{ }
+
+extern void __mark_empty_function(const char *fmt, ...);
--- a/init/main.c
+++ b/init/main.c
@@ -737,3 +737,10 @@ static int init(void * unused)

panic("No init found. Try passing init= option to kernel.");
}
+
+#ifdef CONFIG_MARK_JUMP
+void __mark_empty_function(const char *fmt, ...)
+{
+}
+EXPORT_SYMBOL(__mark_empty_function);
+#endif
--- /dev/null
+++ b/kernel/Kconfig.marker
@@ -0,0 +1,24 @@
+# Code markers configuration
+
+menu "Marker configuration"
+
+
+config MARK_SYMBOL
+ bool "Replace markers with symbols"
+ default n
+ help
+ Put symbols in place of markers, useful for kprobe.
+
+config MARK_JUMP_CALL
+ bool "Replace markers with a jump over an inactive function call"
+ default n
+ help
+ Put a jump over a call in place of markers.
+
+config MARK_JUMP_INLINE
+ bool "Replace markers with a jump over an inline function"
+ default n
+ help
+ Put a jump over an inline function.
+
+endmenu

---END---



---MARKER LOADER---
---BEGIN---

/* marker-loader.c
*
* Marker Loader
*
* Loads a function at a marker call site.
*
* (C) Copyright 2006 Mathieu Desnoyers <[email protected]>
*
* This file is released under the GPLv2.
* See the file COPYING for more details.
*/

#include <linux/marker.h>
#include <linux/module.h>
#include <linux/kallsyms.h>

/* function to install */
void do_mark1(const char *format, int value)
{
printk("value is %d\n", value);
}

static void *saved_over;

static void **target_mark_call;
static void **target_mark_jump_over;
static void **target_mark_jump_call;
static void **target_mark_jump_inline;

void show_symbol_pointers(void)
{
printk("Marker loader : Loading symbols...\n");
printk(" %s %p %p\n", __stringify(CALL), target_mark_call,
target_mark_call?*target_mark_call:0x0);
printk(" %s %p %p\n", __stringify(JUMP_OVER), target_mark_jump_over,
target_mark_jump_over?*target_mark_jump_over:0x0);
printk(" %s %p %p\n", __stringify(JUMP_CALL), target_mark_jump_call,
target_mark_jump_call?*target_mark_jump_call:0x0);
printk(" %s %p %p\n", __stringify(JUMP_INLINE), target_mark_jump_inline,
target_mark_jump_inline?*target_mark_jump_inline:0x0);
}

int mark_install_hook(void)
{
target_mark_call = (void**)kallsyms_lookup_name(__stringify(CALL));
target_mark_jump_over = (void**)kallsyms_lookup_name(__stringify(JUMP_OVER));
target_mark_jump_call = (void**)kallsyms_lookup_name(__stringify(JUMP_CALL));
target_mark_jump_inline = (void**)kallsyms_lookup_name(__stringify(JUMP_INLINE));

show_symbol_pointers();

if(!(target_mark_call && target_mark_jump_over && target_mark_jump_call &&
target_mark_jump_inline)) {
printk("Target symbols missing in kallsyms.\n");
return -EPERM;
}

printk("Installing hook\n");
*target_mark_call = (void*)do_mark1;
saved_over = *target_mark_jump_over;
*target_mark_jump_over = *target_mark_jump_call;

return 0;
}

int init_module(void)
{
return mark_install_hook();
}

void cleanup_module(void)
{
printk("Removing hook\n");
*target_mark_jump_over = saved_over;
*target_mark_call = __mark_empty_function;

/* Wait for instrumentation functions to return before quitting */
synchronize_sched();
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers");
MODULE_DESCRIPTION("Marker Loader");

---END---


-- MARKER LOADER Makefile ---
---BEGIN---

export MARKER_CFLAGS

MARKER_CFLAGS := -DCALL="__mark_$(TARGET)_call" -DJUMP_OVER="__mark_$(TARGET)_jump_over"
MARKER_CFLAGS += -DJUMP_CALL="__mark_$(TARGET)_jump_call" -DJUMP_INLINE="__mark_$(TARGET)_jump_inline"

EXTRA_CFLAGS += $(MARKER_CFLAGS)

ifneq ($(KERNELRELEASE),)

obj-m += marker-loader.o

else
KERNELDIR ?= /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)
KERNELRELEASE = $(shell cat $(KERNELDIR)/$(KBUILD_OUTPUT)/include/linux/version.h | sed -n 's/.*UTS_RELEASE.*\"\(.*\)\".*/\1/p')
ifneq ($(INSTALL_MOD_PATH),)
DEPMOD_OPT := -b $(INSTALL_MOD_PATH)
endif

default:
$(MAKE) -C $(KERNELDIR) M=$(PWD) modules

modules_install:
$(MAKE) -C $(KERNELDIR) M=$(PWD) modules_install
if [ -f $(KERNELDIR)/$(KBUILD_OUTPUT)/System.map ] ; then /sbin/depmod -ae -F $(KERNELDIR)/$(KBUILD_OUTPUT)/System.map $(DEPMOD_OPT) $(KERNELRELEASE) ; fi


clean:
$(MAKE) -C $(KERNELDIR) M=$(PWD) clean
endif

---END---

--Marker test program--
---BEGIN---
/* test-mark.c
*
*/

#include <linux/marker.h>
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/sched.h>

int x=7;

struct proc_dir_entry *pentry = NULL;

static int my_open(struct inode *inode, struct file *file)
{
MARK(subsys_mark1, "%d", 1);
MARK(subsys_mark2, "%d %s", 2, "blah2");
MARK(subsys_mark3, "%d %s", x, "blah3");

return -EPERM;
}


static struct file_operations my_operations = {
.open = my_open,
};

int init_module(void)
{
pentry = create_proc_entry("testmark", 0444, NULL);
if(pentry)
pentry->proc_fops = &my_operations;
return 0;
}

void cleanup_module(void)
{
remove_proc_entry("testmark", NULL);
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers");
MODULE_DESCRIPTION("Marker Test");

---END---





OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2006-09-21 00:13:35

by Jeremy Fitzhardinge

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Mathieu Desnoyers wrote:
> +#define MARK_SYM(name) __asm__ ("__mark_kprobe_" #name ":")
>

This will need to be "asm volatile()" so that gcc doesn't get rid of it
altogether. Also, there's nothing to make gcc keep this in any
particular place in the instruction stream, since it doesn't have any
data dependencies on anything else. A "memory" clobber might help, but
ideally it would have some explicit data dependency on an important
value at that point.

> +#else
> +#define MARK_SYM(name)
> +#endif
> +
> +
> +#ifdef CONFIG_MARK_JUMP_CALL
> +#define MARK_JUMP_CALL_PROTOTYPE(name) \
> + static void \
> + (*__mark_##name##_call)(const char *fmt, ...) \
> + asm ("__mark_"#name"_call") = \
> + __mark_empty_function
> +#define MARK_JUMP_CALL(name, format, args...) \
> + do { \
> + preempt_disable(); \
> + (void) (__mark_##name##_call(format, ## args)); \
>

(*foo)(args) is the preferred style for calling function pointers, since
it makes the pointerness explicit. Though in this case there's enough
other complexity that the function call syntax is pretty much irrelevant
here.

> + preempt_enable_no_resched(); \
> + } while(0)
> +#else
> +#define MARK_JUMP_CALL_PROTOTYPE(name)
> +#define MARK_JUMP_CALL(name, format, args...)
> +#endif
> +
> +#ifdef CONFIG_MARK_JUMP_INLINE
> +#define MARK_JUMP_INLINE(name, format, args...) \
> + (void) (__mark_##name##_inline(format, ## args))
> +#else
> +#define MARK_JUMP_INLINE(name, format, args...)
> +#endif
> +
> +#define MARK_JUMP(name, format, args...) \
> + do { \
> + __label__ over_label, call_label, inline_label; \
> + volatile static void *__mark_##name##_jump_over \
> + asm ("__mark_"#name"_jump_over") = \
> + &&over_label; \
> + volatile static void *__mark_##name##_jump_call \
> + asm ("__mark_"#name"_jump_call") \
> + __attribute__((unused)) = \
> + &&call_label; \
> + volatile static void *__mark_##name##_jump_inline \
> + asm ("__mark_"#name"_jump_inline") \
> + __attribute__((unused)) = \
> + &&inline_label; \
> + MARK_JUMP_CALL_PROTOTYPE(name); \
> + goto *__mark_##name##_jump_over; \
> +call_label: \
> + MARK_JUMP_CALL(name, format, ## args); \
> + goto over_label; \
> +inline_label: \
> + MARK_JUMP_INLINE(name, format, ## args); \
> +over_label: \
> + do {} while(0); \
> + } while(0)
>

I have to admit I haven't been following all this tracing stuff, but
this is pretty opaque. What's it trying to achieve? Hm, OK, I think I
see what you're getting at here - see below.

> +
> +#define MARK(name, format, args...) \
> + do { \
> + __mark_check_format(format, ## args); \
> + MARK_SYM(name); \
> + MARK_JUMP(name, format, ## args); \
> + } while(0)
>

Does this assume that the symbol injected by MARK_SYM() will label the
MARK_JUMP() code? Because it won't.

>
> printk("Installing hook\n");
> *target_mark_call = (void*)do_mark1;
> saved_over = *target_mark_jump_over;
> *target_mark_jump_over = *target_mark_jump_call;
>

So the point of this is to set up the new function, then update the
jumpover to point to it, in a way that's SMP safe? This assumes two things:

1. that your pointer updates are atomic
2. that these writes don't get reordered

1 might be safe, but I don't think its guaranteed for all
architectures. 2 is not true without some explicit barriers in there.
You'll probably need some in MARK_JUMP too.


> return 0;
> }
>
> int init_module(void)
> {
> return mark_install_hook();
> }
>
> void cleanup_module(void)
> {
> printk("Removing hook\n");
> *target_mark_jump_over = saved_over;
> *target_mark_call = __mark_empty_function;
>
> /* Wait for instrumentation functions to return before quitting */
> synchronize_sched();
> }
>

What if multiple people hook onto the same mark? Are you assuming LIFO?

J

2006-09-21 02:03:57

by Mathieu Desnoyers

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Hi Jeremy,

Thanks for the insightful comments, see below :

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
> >+#define MARK_SYM(name) __asm__ ("__mark_kprobe_" #name ":")
> >
>
> This will need to be "asm volatile()" so that gcc doesn't get rid of it
> altogether. Also, there's nothing to make gcc keep this in any
> particular place in the instruction stream, since it doesn't have any
> data dependencies on anything else. A "memory" clobber might help, but
> ideally it would have some explicit data dependency on an important
> value at that point.
>

Yup, good catch. I have not seen gcc removing this asm in my objdump, but I
guess we cannot be sure. This MARK_SYM() is only useful for kprobe insertion: I
don't use it myself for the jump markup stuff. I don't know how relevant it is
for kprobes users for the symbol to be at a specific location, as they don't
know themselves what data they are interested in and they simply don't want to
modify the instruction stream. In fact, if the asm volatile modified the
instruction stream, it would be an unwanted side-effect :(

> >+#else
> >+#define MARK_SYM(name)
> >+#endif
> >+
> >+
> >+#ifdef CONFIG_MARK_JUMP_CALL
> >+#define MARK_JUMP_CALL_PROTOTYPE(name) \
> >+ static void \
> >+ (*__mark_##name##_call)(const char *fmt, ...) \
> >+ asm ("__mark_"#name"_call") = \
> >+ __mark_empty_function
> >+#define MARK_JUMP_CALL(name, format, args...) \
> >+ do { \
> >+ preempt_disable(); \
> >+ (void) (__mark_##name##_call(format, ## args)); \
> >
>
> (*foo)(args) is the preferred style for calling function pointers, since
> it makes the pointerness explicit. Though in this case there's enough
> other complexity that the function call syntax is pretty much irrelevant
> here.
>

Ok, will fix


> >+ preempt_enable_no_resched(); \
> >+ } while(0)
> >+#else
> >+#define MARK_JUMP_CALL_PROTOTYPE(name)
> >+#define MARK_JUMP_CALL(name, format, args...)
> >+#endif
> >+
> >+#ifdef CONFIG_MARK_JUMP_INLINE
> >+#define MARK_JUMP_INLINE(name, format, args...) \
> >+ (void) (__mark_##name##_inline(format, ## args))
> >+#else
> >+#define MARK_JUMP_INLINE(name, format, args...)
> >+#endif
> >+
> >+#define MARK_JUMP(name, format, args...) \
> >+ do { \
> >+ __label__ over_label, call_label, inline_label; \
> >+ volatile static void *__mark_##name##_jump_over \
> >+ asm ("__mark_"#name"_jump_over") = \
> >+ &&over_label; \
> >+ volatile static void *__mark_##name##_jump_call \
> >+ asm ("__mark_"#name"_jump_call") \
> >+ __attribute__((unused)) = \
> >+ &&call_label; \
> >+ volatile static void *__mark_##name##_jump_inline \
> >+ asm ("__mark_"#name"_jump_inline") \
> >+ __attribute__((unused)) = \
> >+ &&inline_label; \
> >+ MARK_JUMP_CALL_PROTOTYPE(name); \
> >+ goto *__mark_##name##_jump_over; \
> >+call_label: \
> >+ MARK_JUMP_CALL(name, format, ## args); \
> >+ goto over_label; \
> >+inline_label: \
> >+ MARK_JUMP_INLINE(name, format, ## args); \
> >+over_label: \
> >+ do {} while(0); \
> >+ } while(0)
> >
>
> I have to admit I haven't been following all this tracing stuff, but
> this is pretty opaque. What's it trying to achieve? Hm, OK, I think I
> see what you're getting at here - see below.
>

Default behavior: load "over_label" and jump to its address. I can dynamically
override the behavior by setting the jump target to either call_label or
inline_label.

The call_label path is responsible for calling an empty function by default.
The function pointer can be overridden.


> >+
> >+#define MARK(name, format, args...) \
> >+ do { \
> >+ __mark_check_format(format, ## args); \
> >+ MARK_SYM(name); \
> >+ MARK_JUMP(name, format, ## args); \
> >+ } while(0)
> >
>
> Does this assume that the symbol injected by MARK_SYM() will label the
> MARK_JUMP() code? Because it won't.
>

No, MARK_SYM() is only useful for kprobes, unrelated to MARK_JUMP.

> >
> > printk("Installing hook\n");
> > *target_mark_call = (void*)do_mark1;
> > saved_over = *target_mark_jump_over;
> > *target_mark_jump_over = *target_mark_jump_call;
> >
>
> So the point of this is to set up the new function, then update the
> jumpover to point to it, in a way that's SMP safe? This assumes two things:
>
> 1. that your pointer updates are atomic

Yes : the pointer is aligned and all of the architectures that I know write
pointer-sized information atomically when it is aligned.

> 2. that these writes don't get reordered
>

It doesn't matter :) You are absolutely right, they can get reordered, and the
fact is: we don't care. The function above sets *target_mark_call before
*target_mark_jump_over, so that the function pointer is set up before the jump
can reach it. But imagine the inverse: the jump would be able to reach the
function call before the function call handler is set up. It really doesn't
matter, because the function pointer always points to a valid function: either
the "empty" default function or the inserted one.

>
>
> > return 0;
> >}
> >
> >int init_module(void)
> >{
> > return mark_install_hook();
> >}
> >
> >void cleanup_module(void)
> >{
> > printk("Removing hook\n");
> > *target_mark_jump_over = saved_over;
> > *target_mark_call = __mark_empty_function;
> >
> > /* Wait for instrumentation functions to return before quitting */
> > synchronize_sched();
> >}
> >
>
> What if multiple people hook onto the same mark? Are you assuming LIFO?
>

That's the "proof of concept module" part. I plan to write a kernel/marker.c
that will deal with all marker activation from a centralized, coherent
mechanism. The functions that will be called will simply sit in modules.

Before activating a probe, the marker.ko module will 1) take the proper lock
and 2) verify that the function pointer and jmp target are at their default
values (otherwise the "install" operation fails), then increment the
probe-module refcount and set them up.

Setting them back to disabled is always the same operation: restoring the
default jmp and call values, calling synchronize_sched() and releasing the
probe-module refcount.

The marker.ko module will also deal with inline jump selection, which is the
same case as presented here, but without the function pointer, module refcount
and synchronization.

Thanks,

Mathieu



2006-09-21 04:54:43

by Ingo Molnar

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17


* Mathieu Desnoyers <[email protected]> wrote:

> +menu "Marker configuration"

> +config MARK_SYMBOL
> + bool "Replace markers with symbols"
> + Put symbols in place of markers, useful for kprobe.
> +
> +config MARK_JUMP_CALL
> + bool "Replace markers with a jump over an inactive function call"
> + Put a jump over a call in place of markers.
> +
> +config MARK_JUMP_INLINE
> + bool "Replace markers with a jump over an inline function"
> + Put a jump over an inline function.

This patch still has the fundamental failure of offering 3 annotation
methods instead of offering _one_ annotation method. I mention it again,
distros want to have _one_ method they enable, not: "oh, by the way, LTT
requires MARK_JUMP_CALL, so no matter how low-overhead MARK_SYMBOL is,
you have to enable MARK_JUMP_CALL anyway".

We have to face it, tracing is a very optional infrastructure, thus it
has to be _very low (preferably zero in most cases) overhead when
offered by a kernel binary but kept inactive by the user_ and thus you
/have to/ program on the edge to get it into the upstream kernel.

It won't be easy to achieve this, and you'll have to work with the other
tracing projects (and upstream kernel folks) to get one unified markup
mechanism agreed on, but nevertheless it's possible technologically.

and the only acceptable near-zero-overhead markup scheme proposed so far
(and suggested by me all along) is the symbol based markup method.
Symbol based markup also has the advantage that the coupling between the
kernel and the tracer moves to the symbol space (from the binary
instruction-stream space), and thus the in-kernel implementation of it
becomes a lot more flexible. Flexibility of the upstream kernel design is
another thing that we require for 'very optional' features.

Yes, LTT will probably have to embrace kprobes/SystemTap to insert the
tracepoints themselves, but that's the price we pay for uniformity, and
that's the price you pay for _having the markers maintained upstream_.

If after that point upstream cannot optimize kprobes performance to a
sufficient level, /then/ can we think about /perhaps/ allowing direct
calls generated into the kernel image. But that decision /must/ be
driven by distributions and customers. Until then, kprobes based marking
and tracing will be 'good enough'.

It affects all tracers: SystemTap/LKST has to adapt to such a scheme
too, because currently there's no markup scheme in the kernel. So this
is not something 'against' LTT, but something /for/ a unified landscape
of tracers. (and as i mentioned it before, it will be easy for you to
offer a simple "LTT speedup patch", which distros and the upstream
kernel can consider separately. But it must be /optional/.)

So far i have not seen any real arguments against this simple but
fundamental upstream requirement which i pointed out for v0.1 already.

Ingo

2006-09-21 06:54:40

by Richard J Moore

[permalink] [raw]
Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Mathieu, are you aware of the work done on generalized RAS hooks for
Linux? It was used in dprobes, LTT and LKST at one time. That in its own
way dealt with some of the issues you're trying to solve. I'm not saying
it's the correct solution, but it shares a lot of common ground. Check out
"kernel hooks", aka GKHI. We used it to dynamically remove dprobes from the
page-fault code path when no user probes were active. It was also used to
instrument tracepoints for different facilities that needed to share an
instrumentation point. There was also a /proc interface that showed which
hooks were installed and their status. We had generic mechanisms that
would work across most architectures and optimized ones for specific
architectures. One thing we tried to do was to make sure that the effect on
the data cache was minimized or eliminated. We achieved that on IA32 by making
the hook mechanism a test following a load of a literal into a register.
The hook was activated and deactivated by moving 1 or 0 into a register -
i.e. by editing the literal in the instruction (changing mov eax,0 to mov
eax,1).

You might as well see whether there are any bones on that carcass worth
picking.

Richard

2006-09-21 07:17:23

by Jeremy Fitzhardinge

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Mathieu Desnoyers wrote:
> Yup, good catch. I have not seen gcc removing this asm in my objdump however, by
> I guess we cannot be sure. This MARK_SYM() is only useful for kprobe
> insertion : I don't use it myself for the jump markup stuff. I don't know how
> relevant it is for kprobes users for the symbol to be at a specific location,
> as they don't know themself what data they are interested in and they simply
> don't want to modify the instruction stream. I fact, if the asm volatile
> modifies the instruction stream, it would be an unwanted side-effect :(
>

"asm volatile" isn't documented to do anything other than prevent the
asm from being removed altogether. It doesn't prevent it from being
moved elsewhere, and it doesn't imply any ordering dependency with the
code around it. So I don't think it will change the generated code, but
I also don't think it will be all that useful unless there's something
to actually make sure it's in a particular place - and that may change
codegen because it may force the compiler to not eliminate/reorder/move
the point at which you want the label.

Something like this might do it:

#define MARK_SYM(label) \
do { \
__label__ here; \
here: asm volatile(#label " = %0" : : "m" (*&&here)); \
} while(0)


This at least gives the compiler a C-level label to hang the asm from.

> It doesn't matter :) You are absolutely right, they can get reordered, and the
> fact is : we don't care. The function above sets the *target_mark_call before
> the *target_mark_jump_over, so that the function pointer is set up before the
> jump can call it. But imagine the inverse : the will be able to the function
> call before the function call handler is set up. It really doesn't matter
> because the function pointer is always pointing to a valid function : either the
> "empty" default function or the inserted one.
>

Does the local indirect jump really help? Wouldn't you do just as well
with the call? It's a jump out of line, but if it points to the null
function, it's likely to be in cache, and reducing the number of
indirect targets within a few instructions will help the CPU keep its
branch target prediction in order (modern Intel chips don't like having
too many indirect jumps within a cache line, for example).

It's a pity you can't make these all direct jumps; I guess patching the
instruction stream on an SMP system on the fly is too tricky...

(Though on x86 you could do something like make the default case 5 bytes
of nops. Then to patch it, you could patch in an int3 on the first
byte, put the relative address in the other 4 bytes, then patch the int3
back to the call/jump. The int3 handler would look to see if the fault
address is a kernel hook point, and if so, spin waiting for the *eip to
go to a call/jump, then resume the instruction.)

J

2006-09-21 12:28:18

by Roman Zippel

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

Hi,

On Thu, 21 Sep 2006, Ingo Molnar wrote:

> It affects all tracers: SystemTap/LKST has to adapt to such a scheme
> too, because currently there's no markup scheme in the kernel. So this
> is not something 'against' LTT, but something /for/ a unified landscape
> of tracers. (and as i mentioned it before, it will be easy for you to
> offer a simple "LTT speedup patch", which distros and the upstream
> kernel can consider separately. But it must be /optional/.)

Out of curiosity: how exactly would it hurt this unification if you left
some of the implementation details simply to the archs?

> So far i have not seen any real arguments against this simple but
> fundamental upstream requirement which i pointed out for v0.1 already.

It's funny, after reality sets in, I'll get exactly what I asked for in
the first place, now I only have to figure out a way to do this without
getting insulted by almost everyone...

bye, Roman

2006-09-21 14:46:36

by Mathieu Desnoyers

Subject: Re: [PATCH] Linux Markers 0.4 (+dynamic probe loader) for 2.6.17

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
> >Yup, good catch. I have not seen gcc removing this asm in my objdump
> >however, by
> >I guess we cannot be sure. This MARK_SYM() is only useful for kprobe
> >insertion : I don't use it myself for the jump markup stuff. I don't know
> >how
> >relevant it is for kprobes users for the symbol to be at a specific
> >location,
> >as they don't know themself what data they are interested in and they
> >simply
> >don't want to modify the instruction stream. I fact, if the asm volatile
> >modifies the instruction stream, it would be an unwanted side-effect :(
> >
>
> "asm volatile" isn't documented to do anything other than prevent the
> asm from being removed altogether. It doesn't prevent it from being
> moved elsewhere, and it doesn't imply any ordering dependency with the
> code around it. So I don't think it will change the generated code, but
> I also don't think it will be all that useful unless there's something
> to actually make sure it's in a particular place - and that may change
> codegen because it may force the compiler to not eliminate/reorder/move
> the point at which you want the label.
>
> Something like this might do it:
>
> #define MARK_SYM(label) \
> do { \
> __label__ here; \
> here: asm volatile(#label " = %0" : : "m" (*&&here)); \
> } while(0)
>
>
> This at least gives the compiler a C-level label to hang the asm from.
>
Ok, let's do that then. Thanks for the hint.


> >It doesn't matter :) You are absolutely right, they can get reordered, and
> >the
> >fact is : we don't care. The function above sets the *target_mark_call
> >before
> >the *target_mark_jump_over, so that the function pointer is set up before
> >the
> >jump can call it. But imagine the inverse : the will be able to the
> >function
> >call before the function call handler is set up. It really doesn't matter
> >because the function pointer is always pointing to a valid function :
> >either the
> >"empty" default function or the inserted one.
> >
>
> Does the local indirect jump really help? Wouldn't you do just as well
> with the call?

Making a function call, even to an empty function, also implies the cost of
setting up the stack. I think it will be more costly than a load+jump.

> It's a jump out of line, but if it points to the null
> function, it's likely to be in cache, and reducing the number of
> indirect targets within a few instructions will help the CPU keep its
> branch target prediction in order (modern Intel chips don't like having
> too many indirect jumps within a cache line, for example).
>

Good point. However, as my tests pointed out, it seems less costly to loop
doing out-of-line jumps than to loop doing predicted branches. Weird, but it
seems to be the case. We should however compare the speed of the jump versus
the stack setup and call to an empty function.


> It's a pity you can't make these all direct jumps; I guess patching the
> instruction stream on an SMP system on the fly is too tricky...
>

This is my basic concern: teams have been working on this full-time for a few
years without success, so why would I succeed at writing fast, portable,
code-modifying branching code in less time than that? I think that the first
thing to achieve is to provide a fast+portable way of dealing with markers;
the architecture-specific improvements will then follow. As my marking
mechanism is generic enough to do any symbol marking of assembly, it will be
easily customizable per architecture.


> (Though on x86 you could do something like make the default case 5 bytes
> of nops. Then to patch it, you could patch in an int3 on the first
> byte, put the relative address in the other 4 bytes, then patch the int3
> back to the call/jump. The int3 handler would look to see if the fault
> address is a kernel hook point, and if so, spin waiting for the *eip to
> go to a call/jump, then resume the instruction.)
>

Yes, many optimisations can be thought of, for many architectures. What I miss
in your idea is where the function call will be. Probably jumped over by a goto
after the nops (so that the compiler puts the function call in a rarely-used
part of the function)?

The problem with your approach is that, as we are in preemptible code, there
can be an arbitrary thread running in the NOPs, scheduled out and stopped. It
must not come back and iret into the middle of your patched addresses. The
same problem exists for interrupt handlers.

Regards,

Mathieu

