2010-06-09 21:39:37

by Jason Baron

Subject: [PATCH 00/13] jump label v9

Hi,

updates for -v9:

*build-time detection of the gcc 'asm goto' feature via scripts/gcc-goto.sh,
which tries to compile a simple 'asm goto' C program; if it succeeds we
define CC_HAVE_ASM_GOTO. This way we also detect the gcc 4.5 feature when
it's been backported, as has been done to gcc 4.4 in Fedora.

*misc cleanups and simplifications.

thanks,

-Jason


David S. Miller (1):
sparc64: Add jump_label support

Jason Baron (11):
jump label: base patch
jump label: x86 support
jump label: tracepoint support
jump label: add module support
jump label: move ftrace_dyn_arch_init to common code
jump label: sort jump table at build-time
jump label: initialize workqueue tracepoints *before* they are
registered
jump label: jump_label_text_reserved() to reserve our jump points
jump label: convert jump label to use a key
jump label: convert dynamic debug to use jump labels.
jump label: add docs

Mathieu Desnoyers (1):
jump label: notifier atomic call chain notrace

Documentation/jump-label.txt | 151 +++++++++++++
Makefile | 11 +-
arch/Kconfig | 3 +
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/jump_label.h | 32 +++
arch/sparc/kernel/Makefile | 2 +
arch/sparc/kernel/jump_label.c | 38 +++
arch/sparc/kernel/module.c | 6 +
arch/x86/Kconfig | 1 +
arch/x86/include/asm/alternative.h | 14 ++
arch/x86/include/asm/jump_label.h | 47 ++++
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/alternative.c | 72 ++++++-
arch/x86/kernel/ftrace.c | 70 +------
arch/x86/kernel/jump_label.c | 47 ++++
arch/x86/kernel/kprobes.c | 3 +-
arch/x86/kernel/module.c | 3 +
arch/x86/kernel/ptrace.c | 1 +
arch/x86/kernel/setup.c | 3 +
include/asm-generic/vmlinux.lds.h | 22 ++-
include/linux/dynamic_debug.h | 42 ++--
include/linux/jump_label.h | 59 +++++
include/linux/module.h | 5 +-
include/linux/tracepoint.h | 7 +-
kernel/Makefile | 2 +-
kernel/jump_label.c | 426 +++++++++++++++++++++++++++++++++++
kernel/kprobes.c | 3 +-
kernel/module.c | 7 +
kernel/notifier.c | 6 +-
kernel/trace/ftrace.c | 13 +-
kernel/trace/trace_workqueue.c | 10 +-
kernel/tracepoint.c | 8 +
lib/dynamic_debug.c | 42 +----
scripts/Makefile.lib | 11 +-
scripts/basic/Makefile | 2 +-
scripts/basic/hash.c | 64 ------
scripts/gcc-goto.sh | 5 +
scripts/mod/Makefile | 1 +
scripts/mod/modpost.c | 73 ++++++-
scripts/mod/modpost.h | 1 +
40 files changed, 1077 insertions(+), 237 deletions(-)
create mode 100644 Documentation/jump-label.txt
create mode 100644 arch/sparc/include/asm/jump_label.h
create mode 100644 arch/sparc/kernel/jump_label.c
create mode 100644 arch/x86/include/asm/jump_label.h
create mode 100644 arch/x86/kernel/jump_label.c
create mode 100644 include/linux/jump_label.h
create mode 100644 kernel/jump_label.c
delete mode 100644 scripts/basic/hash.c
create mode 100644 scripts/gcc-goto.sh


2010-06-09 21:39:42

by Jason Baron

Subject: [PATCH 01/13] jump label v9: notifier atomic call chain notrace

From: Mathieu Desnoyers <[email protected]>

In LTTng, using the atomic notifier from CPU idle entry to ensure the
tracer flushes the last events in the current subbuffer requires the RCU
read-side to be marked "notrace"; otherwise it can end up calling back
into lockdep and the tracer.

Also apply this to the die notifier.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Jason Baron <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
---
kernel/notifier.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index 2488ba7..88453a7 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -148,7 +148,7 @@ int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,
spin_lock_irqsave(&nh->lock, flags);
ret = notifier_chain_unregister(&nh->head, n);
spin_unlock_irqrestore(&nh->lock, flags);
- synchronize_rcu();
+ synchronize_sched();
return ret;
}
EXPORT_SYMBOL_GPL(atomic_notifier_chain_unregister);
@@ -178,9 +178,9 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
{
int ret;

- rcu_read_lock();
+ rcu_read_lock_sched_notrace();
ret = notifier_call_chain(&nh->head, val, v, nr_to_call, nr_calls);
- rcu_read_unlock();
+ rcu_read_unlock_sched_notrace();
return ret;
}
EXPORT_SYMBOL_GPL(__atomic_notifier_call_chain);
--
1.7.1

2010-06-09 21:39:31

by Jason Baron

Subject: [PATCH 04/13] jump label v9: tracepoint support

Make use of the jump label infrastructure for tracepoints.

Signed-off-by: Jason Baron <[email protected]>
---
include/linux/tracepoint.h | 8 ++++++--
kernel/tracepoint.c | 8 ++++++++
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 9a59d1f..883df16 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -16,6 +16,7 @@

#include <linux/types.h>
#include <linux/rcupdate.h>
+#include <linux/jump_label.h>

struct module;
struct tracepoint;
@@ -144,7 +145,9 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
extern struct tracepoint __tracepoint_##name; \
static inline void trace_##name(proto) \
{ \
- if (unlikely(__tracepoint_##name.state)) \
+ JUMP_LABEL(&__tracepoint_##name, do_trace, __tracepoint_##name.state);\
+ return; \
+do_trace: \
__DO_TRACE(&__tracepoint_##name, \
TP_PROTO(data_proto), \
TP_ARGS(data_args)); \
@@ -171,7 +174,8 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
__attribute__((section("__tracepoints_strings"))) = #name; \
struct tracepoint __tracepoint_##name \
__attribute__((section("__tracepoints"), aligned(32))) = \
- { __tpstrtab_##name, 0, reg, unreg, NULL }
+ { __tpstrtab_##name, 0, reg, unreg, NULL }; \
+ DEFINE_JUMP_LABEL(name);

#define DEFINE_TRACE(name) \
DEFINE_TRACE_FN(name, NULL, NULL);
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index c77f3ec..b16d873 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -25,6 +25,7 @@
#include <linux/err.h>
#include <linux/slab.h>
#include <linux/sched.h>
+#include <linux/jump_label.h>

extern struct tracepoint __start___tracepoints[];
extern struct tracepoint __stop___tracepoints[];
@@ -263,6 +264,10 @@ static void set_tracepoint(struct tracepoint_entry **entry,
* is used.
*/
rcu_assign_pointer(elem->funcs, (*entry)->funcs);
+ if (!elem->state && active)
+ enable_jump_label(elem->name);
+ if (elem->state && !active)
+ disable_jump_label(elem->name);
elem->state = active;
}

@@ -277,6 +282,9 @@ static void disable_tracepoint(struct tracepoint *elem)
if (elem->unregfunc && elem->state)
elem->unregfunc();

+ if (elem->state)
+ disable_jump_label(elem->name);
+
elem->state = 0;
rcu_assign_pointer(elem->funcs, NULL);
}
--
1.7.1

2010-06-09 21:39:49

by Jason Baron

Subject: [PATCH 08/13] jump label v9: initialize workqueue tracepoints *before* they are registered

Initialize the workqueue data structures *before* they are registered
so that they are ready for callbacks.

Signed-off-by: Jason Baron <[email protected]>
---
kernel/trace/trace_workqueue.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index a7cc379..209b379 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -263,6 +263,11 @@ int __init trace_workqueue_early_init(void)
{
int ret, cpu;

+ for_each_possible_cpu(cpu) {
+ spin_lock_init(&workqueue_cpu_stat(cpu)->lock);
+ INIT_LIST_HEAD(&workqueue_cpu_stat(cpu)->list);
+ }
+
ret = register_trace_workqueue_insertion(probe_workqueue_insertion, NULL);
if (ret)
goto out;
@@ -279,11 +284,6 @@ int __init trace_workqueue_early_init(void)
if (ret)
goto no_creation;

- for_each_possible_cpu(cpu) {
- spin_lock_init(&workqueue_cpu_stat(cpu)->lock);
- INIT_LIST_HEAD(&workqueue_cpu_stat(cpu)->list);
- }
-
return 0;

no_creation:
--
1.7.1

2010-06-09 21:39:51

by Jason Baron

Subject: [PATCH 07/13] jump label v9: sort jump table at build-time

The jump label table is accessed more efficiently if the entries are
contiguous; sorting the table accomplishes this, so do the sort at
build time. This adds a '-j' option to 'modpost' which replaces the
vmlinux with one whose jump label section is sorted. I've tested this
on x86 with a relocatable kernel and it works fine there as well. Note
that I have not sorted the jump label tables in modules: module tables
tend to be smaller, so there is less value in sorting them. The kernel
continues to do the sort, just in case, but for the vmlinux at least
it is now just a verification that the jump label table has already
been sorted.

Signed-off-by: Jason Baron <[email protected]>
---
Makefile | 6 ++-
include/asm-generic/vmlinux.lds.h | 18 +++++++-
scripts/mod/Makefile | 1 +
scripts/mod/modpost.c | 83 +++++++++++++++++++++++++++++++++++-
scripts/mod/modpost.h | 2 +
5 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index bf9955d..6c6e608 100644
--- a/Makefile
+++ b/Makefile
@@ -152,7 +152,7 @@ obj := $(objtree)

VPATH := $(srctree)$(if $(KBUILD_EXTMOD),:$(KBUILD_EXTMOD))

-export srctree objtree VPATH
+export srctree objtree VPATH hdr-arch


# SUBARCH tells the usermode build what the underlying arch is. That is set
@@ -850,6 +850,9 @@ define rule_vmlinux-modpost
$(Q)echo 'cmd_$@ := $(cmd_vmlinux-modpost)' > $(dot-target).cmd
endef

+quiet_cmd_sort-jump-label = SORT $@
+ cmd_sort-jump-label = $(srctree)/scripts/mod/modpost -j -s $@
+
# vmlinux image - including updated kernel symbols
vmlinux: $(vmlinux-lds) $(vmlinux-init) $(vmlinux-main) vmlinux.o $(kallsyms.o) FORCE
ifdef CONFIG_HEADERS_CHECK
@@ -863,6 +866,7 @@ ifdef CONFIG_BUILD_DOCSRC
endif
$(call vmlinux-modpost)
$(call if_changed_rule,vmlinux__)
+ $(call cmd,sort-jump-label)
$(Q)rm -f .old_version

# build vmlinux.o first to catch section mismatch errors early
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index f6137ba..457c11d 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -168,7 +168,6 @@
TRACE_PRINTKS() \
FTRACE_EVENTS() \
TRACE_SYSCALLS() \
- JUMP_TABLE() \

/*
* Data section helpers
@@ -207,7 +206,6 @@
*(__vermagic) /* Kernel version magic */ \
*(__markers_strings) /* Markers: strings */ \
*(__tracepoints_strings)/* Tracepoints: strings */ \
- *(__jump_strings)/* Jump: strings */ \
} \
\
.rodata1 : AT(ADDR(.rodata1) - LOAD_OFFSET) { \
@@ -216,6 +214,10 @@
\
BUG_TABLE \
\
+ JUMP_TABLE \
+ \
+ JUMP_STRINGS \
+ \
/* PCI quirks */ \
.pci_fixup : AT(ADDR(.pci_fixup) - LOAD_OFFSET) { \
VMLINUX_SYMBOL(__start_pci_fixups_early) = .; \
@@ -559,11 +561,21 @@
#define BUG_TABLE
#endif

-#define JUMP_TABLE() \
+#define JUMP_TABLE \
. = ALIGN(64); \
+ __jump_table : AT(ADDR(__jump_table) - LOAD_OFFSET) { \
VMLINUX_SYMBOL(__start___jump_table) = .; \
*(__jump_table) \
VMLINUX_SYMBOL(__stop___jump_table) = .; \
+ }
+
+#define JUMP_STRINGS \
+ . = ALIGN(64); \
+ __jump_strings : AT(ADDR(__jump_strings) - LOAD_OFFSET) { \
+ VMLINUX_SYMBOL(__start___jump_strings) = .; \
+ *(__jump_strings) \
+ VMLINUX_SYMBOL(__stop___jump_strings) = .; \
+ }

#ifdef CONFIG_PM_TRACE
#define TRACEDATA \
diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index ff954f8..a6dbbe6 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -1,4 +1,5 @@
hostprogs-y := modpost mk_elfconfig
+HOST_EXTRACFLAGS += -iquote "$(srctree)/arch/$(hdr-arch)/include/asm"
always := $(hostprogs-y) empty.o

modpost-objs := modpost.o file2alias.o sumversion.o
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 3318692..1ed8380 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -17,6 +17,7 @@
#include "modpost.h"
#include "../../include/generated/autoconf.h"
#include "../../include/linux/license.h"
+#include <linux/types.h>

/* Some toolchains use a `_' prefix for all user symbols. */
#ifdef CONFIG_SYMBOL_PREFIX
@@ -41,6 +42,8 @@ static int warn_unresolved = 0;
/* How a symbol is exported */
static int sec_mismatch_count = 0;
static int sec_mismatch_verbose = 1;
+/* jump label */
+static int enable_jump_label = 0;

enum export {
export_plain, export_unused, export_gpl,
@@ -315,12 +318,18 @@ void *grab_file(const char *filename, unsigned long *size)
void *map;
int fd;

- fd = open(filename, O_RDONLY);
+ if (!enable_jump_label)
+ fd = open(filename, O_RDONLY);
+ else
+ fd = open(filename, O_RDWR);
if (fd < 0 || fstat(fd, &st) != 0)
return NULL;

*size = st.st_size;
- map = mmap(NULL, *size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
+ if (!enable_jump_label)
+ map = mmap(NULL, *size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
+ else
+ map = mmap(NULL, *size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);

if (map == MAP_FAILED)
@@ -367,6 +376,63 @@ void release_file(void *file, unsigned long size)
munmap(file, size);
}

+#ifdef CONFIG_HAVE_ARCH_JUMP_LABEL
+
+#include "jump_label.h"
+
+static void swap_jump_label_entries(struct jump_entry *previous, struct jump_entry *next)
+{
+ struct jump_entry tmp;
+
+ tmp = *next;
+ *next = *previous;
+ *previous = tmp;
+}
+
+static void sort_jump_label_table(struct elf_info *info, Elf_Ehdr *hdr)
+{
+ int swapped = 0;
+ struct jump_entry *iter, *iter_next;
+ char *name, *next_name;
+ Elf_Shdr *sechdrs = info->sechdrs;
+ unsigned long jump_table, jump_table_end;
+ unsigned long jump_strings, jump_strings_addr;
+
+ if ((info->jump_sec == 0) && (info->jump_strings_sec == 0))
+ return;
+
+ jump_table = (unsigned long)hdr + sechdrs[info->jump_sec].sh_offset;
+ jump_table_end = jump_table + sechdrs[info->jump_sec].sh_size;
+ jump_strings = (unsigned long)hdr +
+ sechdrs[info->jump_strings_sec].sh_offset;
+ jump_strings_addr = sechdrs[info->jump_strings_sec].sh_addr;
+
+ do {
+ swapped = 0;
+ iter = iter_next = (struct jump_entry *)jump_table;
+ iter_next++;
+ for (; iter_next < (struct jump_entry *)jump_table_end;
+ iter++, iter_next++) {
+ name = (char *)(jump_strings + (iter->name - jump_strings_addr));
+ next_name = (char *)(jump_strings +
+ (iter_next->name - jump_strings_addr));
+ if (strcmp(name, next_name) > 0) {
+ swap_jump_label_entries(iter, iter_next);
+ swapped = 1;
+ }
+ }
+ } while (swapped == 1);
+}
+
+#else
+
+static inline void sort_jump_label_table(struct elf_info *info, Elf_Ehdr *hdr)
+{
+ return;
+}
+
+#endif
+
static int parse_elf(struct elf_info *info, const char *filename)
{
unsigned int i;
@@ -460,6 +526,10 @@ static int parse_elf(struct elf_info *info, const char *filename)
info->export_unused_gpl_sec = i;
else if (strcmp(secname, "__ksymtab_gpl_future") == 0)
info->export_gpl_future_sec = i;
+ else if (strcmp(secname, "__jump_table") == 0)
+ info->jump_sec = i;
+ else if (strcmp(secname, "__jump_strings") == 0)
+ info->jump_strings_sec = i;

if (sechdrs[i].sh_type != SHT_SYMTAB)
continue;
@@ -480,6 +550,10 @@ static int parse_elf(struct elf_info *info, const char *filename)
sym->st_value = TO_NATIVE(sym->st_value);
sym->st_size = TO_NATIVE(sym->st_size);
}
+
+ if (enable_jump_label)
+ sort_jump_label_table(info, hdr);
+
return 1;
}

@@ -1965,7 +2039,7 @@ int main(int argc, char **argv)
struct ext_sym_list *extsym_iter;
struct ext_sym_list *extsym_start = NULL;

- while ((opt = getopt(argc, argv, "i:I:e:cmsSo:awM:K:")) != -1) {
+ while ((opt = getopt(argc, argv, "i:I:e:cmsSo:awM:K:j")) != -1) {
switch (opt) {
case 'i':
kernel_read = optarg;
@@ -2003,6 +2077,9 @@ int main(int argc, char **argv)
case 'w':
warn_unresolved = 1;
break;
+ case 'j':
+ enable_jump_label = 1;
+ break;
default:
exit(1);
}
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index be987a4..0875d4b 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -126,6 +126,8 @@ struct elf_info {
Elf_Section export_gpl_sec;
Elf_Section export_unused_gpl_sec;
Elf_Section export_gpl_future_sec;
+ Elf_Section jump_sec;
+ Elf_Section jump_strings_sec;
const char *strtab;
char *modinfo;
unsigned int modinfo_len;
--
1.7.1

2010-06-09 21:40:05

by Jason Baron

Subject: [PATCH 09/13] jump label v9: jump_label_text_reserved() to reserve our jump points

Add jump_label_text_reserved(void *start, void *end) so that other
pieces of code that want to modify kernel text can first verify that
jump label has not reserved the instruction.

Signed-off-by: Jason Baron <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
---
arch/x86/kernel/kprobes.c | 3 +-
include/linux/jump_label.h | 6 +++
kernel/jump_label.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
kernel/kprobes.c | 3 +-
4 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 345a4b1..87ca919 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -1201,7 +1201,8 @@ static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
}
/* Check whether the address range is reserved */
if (ftrace_text_reserved(src, src + len - 1) ||
- alternatives_text_reserved(src, src + len - 1))
+ alternatives_text_reserved(src, src + len - 1) ||
+ jump_label_text_reserved(src, src + len - 1))
return -EBUSY;

return len;
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index dc81a45..3250793 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -28,6 +28,7 @@ extern void arch_jump_label_transform(struct jump_entry *entry,
extern const u8 *arch_get_jump_label_nop(void);

extern void jump_label_update(const char *name, enum jump_label_type type);
+extern int jump_label_text_reserved(void *start, void *end);

extern void apply_jump_label_nops(struct module *mod);

@@ -62,6 +63,11 @@ static inline int apply_jump_label_nops(struct module *mod)
return 0;
}

+static inline int jump_label_text_reserved(void *start, void *end)
+{
+ return 0;
+}
+
#endif

#endif
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 3ebe2bf..ac39662 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -181,6 +181,89 @@ void jump_label_update(const char *name, enum jump_label_type type)
mutex_unlock(&jump_label_mutex);
}

+static int addr_conflict(struct jump_entry *entry, void *start, void *end)
+{
+ if (entry->code <= (unsigned long)end &&
+ entry->code + JUMP_LABEL_NOP_SIZE > (unsigned long)start)
+ return 1;
+
+ return 0;
+}
+
+#ifdef CONFIG_MODULES
+
+static int module_conflict(void *start, void *end)
+{
+ struct hlist_head *head;
+ struct hlist_node *node, *node_next, *module_node, *module_node_next;
+ struct jump_label_entry *e;
+ struct jump_label_module_entry *e_module;
+ struct jump_entry *iter;
+ int i, count;
+ int conflict = 0;
+
+ for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
+ head = &jump_label_table[i];
+ hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
+ hlist_for_each_entry_safe(e_module, module_node,
+ module_node_next,
+ &(e->modules), hlist) {
+ count = e_module->nr_entries;
+ iter = e_module->table;
+ while (count--) {
+ if (addr_conflict(iter, start, end)) {
+ conflict = 1;
+ goto out;
+ }
+ iter++;
+ }
+ }
+ }
+ }
+out:
+ return conflict;
+}
+
+#endif
+
+/***
+ * jump_label_text_reserved - check if addr range is reserved
+ * @start: start text addr
+ * @end: end text addr
+ *
+ * checks if the text addr located between @start and @end
+ * overlaps with any of the jump label patch addresses. Code
+ * that wants to modify kernel text should first verify that
+ * it does not overlap with any of the jump label addresses.
+ *
+ * returns 1 if there is an overlap, 0 otherwise
+ */
+int jump_label_text_reserved(void *start, void *end)
+{
+ struct jump_entry *iter;
+ struct jump_entry *iter_start = __start___jump_table;
+ struct jump_entry *iter_stop = __stop___jump_table;
+ int conflict = 0;
+
+ mutex_lock(&jump_label_mutex);
+ iter = iter_start;
+ while (iter < iter_stop) {
+ if (addr_conflict(iter, start, end)) {
+ conflict = 1;
+ goto out;
+ }
+ iter++;
+ }
+
+ /* now check modules */
+#ifdef CONFIG_MODULES
+ conflict = module_conflict(start, end);
+#endif
+out:
+ mutex_unlock(&jump_label_mutex);
+ return conflict;
+}
+
static int init_jump_label(void)
{
int ret;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 282035f..2f9bea6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1141,7 +1141,8 @@ int __kprobes register_kprobe(struct kprobe *p)
preempt_disable();
if (!kernel_text_address((unsigned long) p->addr) ||
in_kprobes_functions((unsigned long) p->addr) ||
- ftrace_text_reserved(p->addr, p->addr)) {
+ ftrace_text_reserved(p->addr, p->addr) ||
+ jump_label_text_reserved(p->addr, p->addr)) {
preempt_enable();
return -EINVAL;
}
--
1.7.1

2010-06-09 21:40:13

by Jason Baron

Subject: [PATCH 11/13] jump label v9: convert dynamic debug to use jump labels.

Convert the 'dynamic debug' infrastructure to use jump labels.

Signed-off-by: Jason Baron <[email protected]>
---
include/linux/dynamic_debug.h | 42 ++++++++++++++-------------
lib/dynamic_debug.c | 42 ++-------------------------
scripts/Makefile.lib | 11 +------
scripts/basic/Makefile | 2 +-
scripts/basic/hash.c | 64 -----------------------------------------
5 files changed, 27 insertions(+), 134 deletions(-)
delete mode 100644 scripts/basic/hash.c

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index b3cd4de..f11ce2c 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -1,6 +1,8 @@
#ifndef _DYNAMIC_DEBUG_H
#define _DYNAMIC_DEBUG_H

+#include <linux/jump_label.h>
+
/* dynamic_printk_enabled, and dynamic_printk_enabled2 are bitmasks in which
* bit n is set to 1 if any modname hashes into the bucket n, 0 otherwise. They
* use independent hash functions, to reduce the chance of false positives.
@@ -22,9 +24,7 @@ struct _ddebug {
const char *function;
const char *filename;
const char *format;
- char primary_hash;
- char secondary_hash;
- unsigned int lineno:24;
+ unsigned int lineno;
/*
* The flags field controls the behaviour at the callsite.
* The bits here are changed dynamically when the user
@@ -32,7 +32,7 @@ struct _ddebug {
*/
#define _DPRINTK_FLAGS_PRINT (1<<0) /* printk() a message using the format */
#define _DPRINTK_FLAGS_DEFAULT 0
- unsigned int flags:8;
+ char flags;
} __attribute__((aligned(8)));


@@ -42,33 +42,35 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n,
#if defined(CONFIG_DYNAMIC_DEBUG)
extern int ddebug_remove_module(char *mod_name);

-#define __dynamic_dbg_enabled(dd) ({ \
- int __ret = 0; \
- if (unlikely((dynamic_debug_enabled & (1LL << DEBUG_HASH)) && \
- (dynamic_debug_enabled2 & (1LL << DEBUG_HASH2)))) \
- if (unlikely(dd.flags)) \
- __ret = 1; \
- __ret; })
-
#define dynamic_pr_debug(fmt, ...) do { \
+ __label__ do_printk; \
+ __label__ out; \
static struct _ddebug descriptor \
__used \
__attribute__((section("__verbose"), aligned(8))) = \
- { KBUILD_MODNAME, __func__, __FILE__, fmt, DEBUG_HASH, \
- DEBUG_HASH2, __LINE__, _DPRINTK_FLAGS_DEFAULT }; \
- if (__dynamic_dbg_enabled(descriptor)) \
- printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
+ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
+ _DPRINTK_FLAGS_DEFAULT }; \
+ JUMP_LABEL(&descriptor.flags, do_printk); \
+ goto out; \
+do_printk: \
+ printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
+out: ; \
} while (0)


#define dynamic_dev_dbg(dev, fmt, ...) do { \
+ __label__ do_printk; \
+ __label__ out; \
static struct _ddebug descriptor \
__used \
__attribute__((section("__verbose"), aligned(8))) = \
- { KBUILD_MODNAME, __func__, __FILE__, fmt, DEBUG_HASH, \
- DEBUG_HASH2, __LINE__, _DPRINTK_FLAGS_DEFAULT }; \
- if (__dynamic_dbg_enabled(descriptor)) \
- dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \
+ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
+ _DPRINTK_FLAGS_DEFAULT }; \
+ JUMP_LABEL(&descriptor.flags, do_printk); \
+ goto out; \
+do_printk: \
+ dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \
+out: ; \
} while (0)

#else
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 3df8eb1..2ef4fbc 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -26,19 +26,11 @@
#include <linux/dynamic_debug.h>
#include <linux/debugfs.h>
#include <linux/slab.h>
+#include <linux/jump_label.h>

extern struct _ddebug __start___verbose[];
extern struct _ddebug __stop___verbose[];

-/* dynamic_debug_enabled, and dynamic_debug_enabled2 are bitmasks in which
- * bit n is set to 1 if any modname hashes into the bucket n, 0 otherwise. They
- * use independent hash functions, to reduce the chance of false positives.
- */
-long long dynamic_debug_enabled;
-EXPORT_SYMBOL_GPL(dynamic_debug_enabled);
-long long dynamic_debug_enabled2;
-EXPORT_SYMBOL_GPL(dynamic_debug_enabled2);
-
struct ddebug_table {
struct list_head link;
char *mod_name;
@@ -88,26 +80,6 @@ static char *ddebug_describe_flags(struct _ddebug *dp, char *buf,
}

/*
- * must be called with ddebug_lock held
- */
-
-static int disabled_hash(char hash, bool first_table)
-{
- struct ddebug_table *dt;
- char table_hash_value;
-
- list_for_each_entry(dt, &ddebug_tables, link) {
- if (first_table)
- table_hash_value = dt->ddebugs->primary_hash;
- else
- table_hash_value = dt->ddebugs->secondary_hash;
- if (dt->num_enabled && (hash == table_hash_value))
- return 0;
- }
- return 1;
-}
-
-/*
* Search the tables for _ddebug's which match the given
* `query' and apply the `flags' and `mask' to them. Tells
* the user which ddebug's were changed, or whether none
@@ -170,17 +142,9 @@ static void ddebug_change(const struct ddebug_query *query,
dt->num_enabled++;
dp->flags = newflags;
if (newflags) {
- dynamic_debug_enabled |=
- (1LL << dp->primary_hash);
- dynamic_debug_enabled2 |=
- (1LL << dp->secondary_hash);
+ enable_jump_label((jump_label_t)&dp->flags);
} else {
- if (disabled_hash(dp->primary_hash, true))
- dynamic_debug_enabled &=
- ~(1LL << dp->primary_hash);
- if (disabled_hash(dp->secondary_hash, false))
- dynamic_debug_enabled2 &=
- ~(1LL << dp->secondary_hash);
+ disable_jump_label((jump_label_t)&dp->flags);
}
if (verbose)
printk(KERN_INFO
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 54fd1b7..7bfcf1a 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -101,14 +101,6 @@ basename_flags = -D"KBUILD_BASENAME=KBUILD_STR($(call name-fix,$(basetarget)))"
modname_flags = $(if $(filter 1,$(words $(modname))),\
-D"KBUILD_MODNAME=KBUILD_STR($(call name-fix,$(modname)))")

-#hash values
-ifdef CONFIG_DYNAMIC_DEBUG
-debug_flags = -D"DEBUG_HASH=$(shell ./scripts/basic/hash djb2 $(@D)$(modname))"\
- -D"DEBUG_HASH2=$(shell ./scripts/basic/hash r5 $(@D)$(modname))"
-else
-debug_flags =
-endif
-
orig_c_flags = $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) $(KBUILD_SUBDIR_CCFLAGS) \
$(ccflags-y) $(CFLAGS_$(basetarget).o)
_c_flags = $(filter-out $(CFLAGS_REMOVE_$(basetarget).o), $(orig_c_flags))
@@ -152,8 +144,7 @@ endif

c_flags = -Wp,-MD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
$(__c_flags) $(modkern_cflags) \
- -D"KBUILD_STR(s)=\#s" $(basename_flags) $(modname_flags) \
- $(debug_flags)
+ -D"KBUILD_STR(s)=\#s" $(basename_flags) $(modname_flags)

a_flags = -Wp,-MD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
$(__a_flags) $(modkern_aflags)
diff --git a/scripts/basic/Makefile b/scripts/basic/Makefile
index 0955995..4c324a1 100644
--- a/scripts/basic/Makefile
+++ b/scripts/basic/Makefile
@@ -9,7 +9,7 @@
# fixdep: Used to generate dependency information during build process
# docproc: Used in Documentation/DocBook

-hostprogs-y := fixdep docproc hash
+hostprogs-y := fixdep docproc
always := $(hostprogs-y)

# fixdep is needed to compile other host programs
diff --git a/scripts/basic/hash.c b/scripts/basic/hash.c
deleted file mode 100644
index 2ef5d3f..0000000
--- a/scripts/basic/hash.c
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * Copyright (C) 2008 Red Hat, Inc., Jason Baron <[email protected]>
- *
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-#define DYNAMIC_DEBUG_HASH_BITS 6
-
-static const char *program;
-
-static void usage(void)
-{
- printf("Usage: %s <djb2|r5> <modname>\n", program);
- exit(1);
-}
-
-/* djb2 hashing algorithm by Dan Bernstein. From:
- * http://www.cse.yorku.ca/~oz/hash.html
- */
-
-static unsigned int djb2_hash(char *str)
-{
- unsigned long hash = 5381;
- int c;
-
- c = *str;
- while (c) {
- hash = ((hash << 5) + hash) + c;
- c = *++str;
- }
- return (unsigned int)(hash & ((1 << DYNAMIC_DEBUG_HASH_BITS) - 1));
-}
-
-static unsigned int r5_hash(char *str)
-{
- unsigned long hash = 0;
- int c;
-
- c = *str;
- while (c) {
- hash = (hash + (c << 4) + (c >> 4)) * 11;
- c = *++str;
- }
- return (unsigned int)(hash & ((1 << DYNAMIC_DEBUG_HASH_BITS) - 1));
-}
-
-int main(int argc, char *argv[])
-{
- program = argv[0];
-
- if (argc != 3)
- usage();
- if (!strcmp(argv[1], "djb2"))
- printf("%d\n", djb2_hash(argv[2]));
- else if (!strcmp(argv[1], "r5"))
- printf("%d\n", r5_hash(argv[2]));
- else
- usage();
- exit(0);
-}
-
--
1.7.1

2010-06-09 21:40:28

by Jason Baron

Subject: [PATCH 10/13] jump label v9: convert jump label to use a key

Convert the jump label code to use a key to identify each related set of
jump labels instead of a string. This saves space, since we no longer need
to store the strings and their pointers, just the keys; it also lets us
simplify the JUMP_LABEL macro from 3 arguments to 2. We use the address
of the conditional variable as the key value. If jump labels are not
enabled, the conditional variable is tested as usual.

Signed-off-by: Jason Baron <[email protected]>
---
arch/x86/include/asm/jump_label.h | 21 +++++++++++--------
include/linux/jump_label.h | 32 ++++++++----------------------
include/linux/tracepoint.h | 5 +--
kernel/jump_label.c | 38 ++++++++++++++++--------------------
kernel/tracepoint.c | 6 ++--
scripts/mod/modpost.c | 14 +-----------
scripts/mod/modpost.h | 1 -
7 files changed, 45 insertions(+), 72 deletions(-)

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index d911d36..3bf5581 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -10,33 +10,36 @@

# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"

-# define JUMP_LABEL(tag, label, cond) \
+# define JUMP_LABEL(key, label) \
do { \
- extern const char __jlstrtab_##tag[]; \
asm goto("1:" \
JUMP_LABEL_INITIAL_NOP \
".pushsection __jump_table, \"a\" \n\t"\
_ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
".popsection \n\t" \
- : : "i" (__jlstrtab_##tag) : : label);\
+ : : "i" (key) : : label); \
} while (0)

#endif /* __KERNEL__ */

#ifdef CONFIG_X86_64

+typedef __u64 jump_label_t;
+
struct jump_entry {
- __u64 code;
- __u64 target;
- __u64 name;
+ jump_label_t code;
+ jump_label_t target;
+ jump_label_t key;
};

#else

+typedef __u32 jump_label_t;
+
struct jump_entry {
- __u32 code;
- __u32 target;
- __u32 name;
+ jump_label_t code;
+ jump_label_t target;
+ jump_label_t key;
};

#endif
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 3250793..ebe19ae 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -18,45 +18,31 @@ struct module;
extern struct jump_entry __start___jump_table[];
extern struct jump_entry __stop___jump_table[];

-#define DEFINE_JUMP_LABEL(name) \
- const char __jlstrtab_##name[] \
- __used __attribute__((section("__jump_strings"))) = #name; \
- EXPORT_SYMBOL_GPL(__jlstrtab_##name);
-
extern void arch_jump_label_transform(struct jump_entry *entry,
enum jump_label_type type);
extern const u8 *arch_get_jump_label_nop(void);

-extern void jump_label_update(const char *name, enum jump_label_type type);
+extern void jump_label_update(jump_label_t key, enum jump_label_type type);
extern int jump_label_text_reserved(void *start, void *end);

extern void apply_jump_label_nops(struct module *mod);

-#define enable_jump_label(name) \
- jump_label_update(name, JUMP_LABEL_ENABLE);
+#define enable_jump_label(key) \
+ jump_label_update(key, JUMP_LABEL_ENABLE);

-#define disable_jump_label(name) \
- jump_label_update(name, JUMP_LABEL_DISABLE);
+#define disable_jump_label(key) \
+ jump_label_update(key, JUMP_LABEL_DISABLE);

#else

-#define DEFINE_JUMP_LABEL(name)
-
-#define JUMP_LABEL(tag, label, cond) \
+#define JUMP_LABEL(key, label) \
do { \
- if (unlikely(cond)) \
+ if (unlikely(*key)) \
goto label; \
} while (0)

-static inline int enable_jump_label(const char *name)
-{
- return 0;
-}
-
-static inline int disable_jump_label(const char *name)
-{
- return 0;
-}
+#define enable_jump_label(key)
+#define disable_jump_label(key)

static inline int apply_jump_label_nops(struct module *mod)
{
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 883df16..9df40b8 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -145,7 +145,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
extern struct tracepoint __tracepoint_##name; \
static inline void trace_##name(proto) \
{ \
- JUMP_LABEL(&__tracepoint_##name, do_trace, __tracepoint_##name.state);\
+ JUMP_LABEL(&__tracepoint_##name.state, do_trace); \
return; \
do_trace: \
__DO_TRACE(&__tracepoint_##name, \
@@ -174,8 +174,7 @@ do_trace: \
__attribute__((section("__tracepoints_strings"))) = #name; \
struct tracepoint __tracepoint_##name \
__attribute__((section("__tracepoints"), aligned(32))) = \
- { __tpstrtab_##name, 0, reg, unreg, NULL }; \
- DEFINE_JUMP_LABEL(name);
+ { __tpstrtab_##name, 0, reg, unreg, NULL };

#define DEFINE_TRACE(name) \
DEFINE_TRACE_FN(name, NULL, NULL);
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index ac39662..859a87c 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -27,7 +27,7 @@ struct jump_label_entry {
int nr_entries;
/* hang modules off here */
struct hlist_head modules;
- const char *name;
+ unsigned long key;
};

struct jump_label_module_entry {
@@ -58,8 +58,7 @@ static void sort_jump_label_entries(struct jump_entry *start, struct jump_entry
iter_next = start;
iter_next++;
for (; iter_next < stop; iter++, iter_next++) {
- if (strcmp((char *)iter->name,
- (char *)iter_next->name) > 0) {
+ if (iter->key > iter_next->key) {
swap_jump_label_entries(iter, iter_next);
swapped = 1;
}
@@ -67,29 +66,28 @@ static void sort_jump_label_entries(struct jump_entry *start, struct jump_entry
} while (swapped == 1);
}

-static struct jump_label_entry *get_jump_label_entry(const char *name)
+static struct jump_label_entry *get_jump_label_entry(jump_label_t key)
{
struct hlist_head *head;
struct hlist_node *node;
struct jump_label_entry *e;
- u32 hash = jhash(name, strlen(name), 0);
+ u32 hash = jhash((void *)&key, sizeof(jump_label_t), 0);

head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
hlist_for_each_entry(e, node, head, hlist) {
- if (!strcmp((char *)name, (char *)e->name))
+ if (key == e->key)
return e;
}
return NULL;
}

-static struct jump_label_entry *add_jump_label_entry(const char *name, int nr_entries, struct jump_entry *table)
+static struct jump_label_entry *add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table)
{
struct hlist_head *head;
struct jump_label_entry *e;
- size_t name_len;
u32 hash;

- e = get_jump_label_entry(name);
+ e = get_jump_label_entry(key);
if (e)
return ERR_PTR(-EEXIST);

@@ -97,10 +95,9 @@ static struct jump_label_entry *add_jump_label_entry(const char *name, int nr_en
if (!e)
return ERR_PTR(-ENOMEM);

- name_len = strlen(name) + 1;
- hash = jhash(name, name_len-1, 0);
+ hash = jhash((void *)&key, sizeof(jump_label_t), 0);
head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
- e->name = name;
+ e->key = key;
e->table = table;
e->nr_entries = nr_entries;
INIT_HLIST_HEAD(&(e->modules));
@@ -117,17 +114,16 @@ static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entr
sort_jump_label_entries(start, stop);
iter = start;
while (iter < stop) {
- entry = get_jump_label_entry((char *)iter->name);
+ entry = get_jump_label_entry(iter->key);
if (!entry) {
iter_begin = iter;
count = 0;
while ((iter < stop) &&
- (strcmp((char *)iter->name,
- (char *)iter_begin->name) == 0)) {
+ (iter->key == iter_begin->key)) {
iter++;
count++;
}
- entry = add_jump_label_entry((char *)iter_begin->name,
+ entry = add_jump_label_entry(iter_begin->key,
count, iter_begin);
if (IS_ERR(entry))
return PTR_ERR(entry);
@@ -148,7 +144,7 @@ static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entr
*
*/

-void jump_label_update(const char *name, enum jump_label_type type)
+void jump_label_update(jump_label_t key, enum jump_label_type type)
{
struct jump_entry *iter;
struct jump_label_entry *entry;
@@ -157,7 +153,7 @@ void jump_label_update(const char *name, enum jump_label_type type)
int count;

mutex_lock(&jump_label_mutex);
- entry = get_jump_label_entry(name);
+ entry = get_jump_label_entry(key);
if (entry) {
count = entry->nr_entries;
iter = entry->table;
@@ -316,16 +312,16 @@ static int add_jump_label_module(struct module *mod)
mod->jump_entries + mod->num_jump_entries);
iter = mod->jump_entries;
while (iter < mod->jump_entries + mod->num_jump_entries) {
- entry = get_jump_label_entry((char *)iter->name);
+ entry = get_jump_label_entry(iter->key);
iter_begin = iter;
count = 0;
while ((iter < mod->jump_entries + mod->num_jump_entries) &&
- (strcmp((char *)iter->name, (char *)iter_begin->name) == 0)) {
+ (iter->key == iter_begin->key)) {
iter++;
count++;
}
if (!entry) {
- entry = add_jump_label_entry((char *)iter_begin->name, 0, NULL);
+ entry = add_jump_label_entry(iter_begin->key, 0, NULL);
if (IS_ERR(entry))
return PTR_ERR(entry);
}
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index b16d873..efd6896 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -265,9 +265,9 @@ static void set_tracepoint(struct tracepoint_entry **entry,
*/
rcu_assign_pointer(elem->funcs, (*entry)->funcs);
if (!elem->state && active)
- enable_jump_label(elem->name);
+ enable_jump_label((jump_label_t)&elem->state);
if (elem->state && !active)
- disable_jump_label(elem->name);
+ disable_jump_label((jump_label_t)&elem->state);
elem->state = active;
}

@@ -283,7 +283,7 @@ static void disable_tracepoint(struct tracepoint *elem)
elem->unregfunc();

if (elem->state)
- disable_jump_label(elem->name);
+ disable_jump_label((jump_label_t)&elem->state);

elem->state = 0;
rcu_assign_pointer(elem->funcs, NULL);
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 1ed8380..a03b045 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -393,19 +393,14 @@ static void sort_jump_label_table(struct elf_info *info, Elf_Ehdr *hdr)
{
int swapped = 0;
struct jump_entry *iter, *iter_next;
- char *name, *next_name;
Elf_Shdr *sechdrs = info->sechdrs;
unsigned long jump_table, jump_table_end;
- unsigned long jump_strings, jump_strings_addr;

- if ((info->jump_sec == 0) && (info->jump_strings_sec == 0))
+ if (info->jump_sec == 0)
return;

jump_table = (unsigned long)hdr + sechdrs[info->jump_sec].sh_offset;
jump_table_end = jump_table + sechdrs[info->jump_sec].sh_size;
- jump_strings = (unsigned long)hdr +
- sechdrs[info->jump_strings_sec].sh_offset;
- jump_strings_addr = sechdrs[info->jump_strings_sec].sh_addr;

do {
swapped = 0;
@@ -413,10 +408,7 @@ static void sort_jump_label_table(struct elf_info *info, Elf_Ehdr *hdr)
iter_next++;
for (; iter_next < (struct jump_entry *)jump_table_end;
iter++, iter_next++) {
- name = (char *)(jump_strings + (iter->name - jump_strings_addr));
- next_name = (char *)(jump_strings +
- (iter_next->name - jump_strings_addr));
- if (strcmp(name, next_name) > 0) {
+ if (iter->key > iter_next->key) {
swap_jump_label_entries(iter, iter_next);
swapped = 1;
}
@@ -528,8 +520,6 @@ static int parse_elf(struct elf_info *info, const char *filename)
info->export_gpl_future_sec = i;
else if (strcmp(secname, "__jump_table") == 0)
info->jump_sec = i;
- else if (strcmp(secname, "__jump_strings") == 0)
- info->jump_strings_sec = i;

if (sechdrs[i].sh_type != SHT_SYMTAB)
continue;
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index 0875d4b..362b966 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -127,7 +127,6 @@ struct elf_info {
Elf_Section export_unused_gpl_sec;
Elf_Section export_gpl_future_sec;
Elf_Section jump_sec;
- Elf_Section jump_strings_sec;
const char *strtab;
char *modinfo;
unsigned int modinfo_len;
--
1.7.1

2010-06-09 21:40:37

by Jason Baron

Subject: [PATCH 12/13] jump label v9: sparc64 add jump_label support

Add jump label support for sparc64.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Jason Baron <[email protected]>
---
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/jump_label.h | 32 +++++++++++++++++++++++++++++
arch/sparc/kernel/Makefile | 2 +
arch/sparc/kernel/jump_label.c | 38 +++++++++++++++++++++++++++++++++++
arch/sparc/kernel/module.c | 6 +++++
5 files changed, 79 insertions(+), 0 deletions(-)
create mode 100644 arch/sparc/include/asm/jump_label.h
create mode 100644 arch/sparc/kernel/jump_label.c

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 6f1470b..5fb0652 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -29,6 +29,7 @@ config SPARC
select PERF_USE_VMALLOC
select HAVE_DMA_ATTRS
select HAVE_DMA_API_DEBUG
+ select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE

config SPARC32
def_bool !64BIT
diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h
new file mode 100644
index 0000000..51781f7
--- /dev/null
+++ b/arch/sparc/include/asm/jump_label.h
@@ -0,0 +1,32 @@
+#ifndef _ASM_SPARC_JUMP_LABEL_H
+#define _ASM_SPARC_JUMP_LABEL_H
+
+#ifdef __KERNEL__
+
+#include <linux/types.h>
+#include <asm/jump_entry.h>
+
+#define JUMP_LABEL_NOP_SIZE 4
+
+#define JUMP_LABEL(key, label) \
+ do { \
+ asm goto("1:\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ ".pushsection __jump_table, \"a\"\n\t"\
+ ".word 1b, %l[" #label "], %c0\n\t" \
+ ".popsection \n\t" \
+ : : "i" (key) : : label);\
+ } while (0)
+
+#endif /* __KERNEL__ */
+
+typedef __u32 jump_label_t;
+
+struct jump_entry {
+ jump_label_t code;
+ jump_label_t target;
+ jump_label_t key;
+};
+
+#endif
diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index 0c2dc1f..599398f 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -119,3 +119,5 @@ obj-$(CONFIG_COMPAT) += $(audit--y)

pc--$(CONFIG_PERF_EVENTS) := perf_event.o
obj-$(CONFIG_SPARC64) += $(pc--y)
+
+obj-$(CONFIG_SPARC64) += jump_label.o
diff --git a/arch/sparc/kernel/jump_label.c b/arch/sparc/kernel/jump_label.c
new file mode 100644
index 0000000..c1aa2d8
--- /dev/null
+++ b/arch/sparc/kernel/jump_label.c
@@ -0,0 +1,38 @@
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/cpu.h>
+
+#include <linux/jump_label.h>
+#include <linux/memory.h>
+
+#ifdef HAVE_JUMP_LABEL
+
+void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type)
+{
+ u32 val, *insn = (u32 *) entry->code;
+
+ val = *insn;
+ if (type == JUMP_LABEL_ENABLE) {
+ s32 off = (s32)entry->target - (s32)entry->code;
+ val = 0x40000000 | ((u32) off >> 2);
+ } else {
+ val = 0x01000000;
+ }
+
+ get_online_cpus();
+ mutex_lock(&text_mutex);
+ *insn = val;
+ flushi(insn);
+ mutex_unlock(&text_mutex);
+ put_online_cpus();
+}
+
+static const u8 jump_label_nop[JUMP_LABEL_NOP_SIZE] = { 0x01, 0x00, 0x00, 0x00 };
+
+const u8 *arch_get_jump_label_nop()
+{
+ return jump_label_nop;
+}
+
+#endif
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index f848aad..37cf439 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -18,6 +18,9 @@
#include <asm/spitfire.h>

#ifdef CONFIG_SPARC64
+
+#include <linux/jump_label.h>
+
static void *module_map(unsigned long size)
{
struct vm_struct *area;
@@ -227,6 +230,9 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
+ /* make jump label nops */
+ apply_jump_label_nops(me);
+
/* Cheetah's I-cache is fully coherent. */
if (tlb_type == spitfire) {
unsigned long va;
--
1.7.1

2010-06-09 21:40:56

by Jason Baron

Subject: [PATCH 13/13] jump label v9: add docs

Add jump label docs as: Documentation/jump-label.txt

Signed-off-by: Jason Baron <[email protected]>
---
Documentation/jump-label.txt | 151 ++++++++++++++++++++++++++++++++++++++++++
1 files changed, 151 insertions(+), 0 deletions(-)
create mode 100644 Documentation/jump-label.txt

diff --git a/Documentation/jump-label.txt b/Documentation/jump-label.txt
new file mode 100644
index 0000000..843da59
--- /dev/null
+++ b/Documentation/jump-label.txt
@@ -0,0 +1,151 @@
+ Jump Label
+ ----------
+
+By: Jason Baron <[email protected]>
+
+
+1) motivation
+
+
+Currently, tracepoints are implemented using a conditional. The conditional
+check requires checking a global variable for each tracepoint. Although,
+the overhead of this check is small, it increases under memory pressure. As we
+increase the number of tracepoints in the kernel this may become more of an
+issue. In addition, tracepoints are often dormant (disabled), and provide no
+direct kernel functionality. Thus, it is highly desirable to reduce their
+impact as much as possible. Although tracepoints are the original motivation
+for this work, other kernel code paths should be able to make use of the jump
+label optimization.
+
+
+2) jump label description/usage
+
+
+gcc (v4.5) adds a new 'asm goto' statement that allows branching to a label.
+http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01556.html
+
+Thus, this patch set introduces an architecture-specific 'JUMP_LABEL()' macro as
+follows (x86):
+
+# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+
+# define JUMP_LABEL(key, label) \
+ do { \
+ asm goto("1:" \
+ JUMP_LABEL_INITIAL_NOP \
+ ".pushsection __jump_table, \"a\" \n\t"\
+ _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
+ ".popsection \n\t" \
+ : : "i" (key) : : label); \
+ } while (0)
+
+
+For architectures that have not yet introduced jump label support, it's simply:
+
+#define JUMP_LABEL(key, label) \
+ if (unlikely(*key)) \
+ goto label;
+
+which then can be used as:
+
+ ....
+ JUMP_LABEL(&jump_enabled, trace_label);
+ printk("not doing tracing\n");
+ return;
+trace_label:
+ printk("doing tracing: %d\n", file);
+ ....
+
+The 'key' argument is a pointer to the conditional variable that is tested
+when the optimization is not available. When it is available, this address
+serves as a unique key identifying the particular instance of the jump label.
+
+Thus, when tracing is disabled, we simply have a no-op followed by a jump around
+the dormant (disabled) tracing code. The 'JUMP_LABEL()' macro produces a
+'jump_table' which has the following format:
+
+[instruction address] [jump target] [tracepoint key]
+
+Thus, to enable a tracepoint, we simply patch the 'instruction address' with
+a jump to the 'jump target'.
+
+A jump label is enabled with 'enable_jump_label(key)' and disabled with
+'disable_jump_label(key)'.
+
+
+3) architecture interface
+
+
+There are a few functions and macros which arches must implement in order to
+take advantage of this optimization. As previously mentioned, if there is no
+architecture support we simply fall back to a traditional load, test, and
+jump sequence.
+
+* add "HAVE_ARCH_JUMP_LABEL" to arch/<arch>/Kconfig to indicate support
+
+* #define JUMP_LABEL_NOP_SIZE, arch/x86/include/asm/jump_label.h
+
+* #define "JUMP_LABEL(key, label)", arch/x86/include/asm/jump_label.h
+
+* add: void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type)
+ and
+ const u8 *arch_get_jump_label_nop(void)
+
+ see: arch/x86/kernel/jump_label.c
+
+* finally add a definition for "struct jump_entry".
+ see: arch/x86/include/asm/jump_label.h
+
+
+4) Jump label analysis (x86)
+
+
+I've tested the performance of using 'get_cycles()' calls around the
+tracepoint call sites. For an Intel Core 2 Quad cpu (in cycles, averages):
+
+ idle after tbench run
+ ---- ----------------
+old code 32 88
+new code 2 4
+
+
+The performance improvement can be reproduced reliably on both Intel and AMD
+hardware.
+
+In terms of code analysis, the current code for the disabled case is a 'cmpl'
+followed by a 'je' around the tracepoint code. So:
+
+cmpl - 83 3d 0e 77 87 00 00 - 7 bytes
+je - 74 3e - 2 bytes
+
+total of 9 instruction bytes.
+
+The new code is a 'nopl' followed by a 'jmp'. Thus:
+
+nopl - 0f 1f 44 00 00 - 5 bytes
+jmp - eb 3e - 2 bytes
+
+total of 7 instruction bytes.
+
+So, the new code also takes up 2 fewer bytes of instruction cache per tracepoint.
+
+The optimization depends on !CC_OPTIMIZE_FOR_SIZE. When CC_OPTIMIZE_FOR_SIZE is
+set, gcc does not always move the not-taken label path out of line in the
+same way that "if (unlikely())" paths are moved out of line. Thus, with
+CC_OPTIMIZE_FOR_SIZE set, this optimization is not always optimal. This may be
+solved in subsequent gcc versions that allow us to move labels out of line
+while still optimizing for size. In the !CC_OPTIMIZE_FOR_SIZE case, this
+optimization is seen on high-level benchmarks such as tbench, where I can get
+~1-2% higher throughput. In addition, we are within 0.5% of the performance of
+having no tracepoints compiled in at all.
+
+
+5) Acknowledgments
+
+
+Thanks to Roland McGrath and Richard Henderson for helping come up with the
+initial 'asm goto' and jump label design.
+
+Thanks to Mathieu Desnoyers and H. Peter Anvin for calling attention to this
+issue, and outlining the requirements of a solution. Mathieu also implemented
+a solution in the form of the "Immediate Values" work.
--
1.7.1

2010-06-09 21:41:28

by Jason Baron

Subject: [PATCH 06/13] jump label v9: move ftrace_dyn_arch_init to common code

Move Steve's code for finding the best 5-byte no-op from ftrace.c to alternative.c.
The idea is that other consumers (in this case jump label) want to make use of
that code. I've created a global, 'unsigned char ideal_nop5[5]', that is set
up during setup_arch() and can then be used by these consumers.

Signed-off-by: Jason Baron <[email protected]>
---
arch/x86/include/asm/alternative.h | 14 +++++++
arch/x86/include/asm/jump_label.h | 10 +---
arch/x86/kernel/alternative.c | 72 +++++++++++++++++++++++++++++++++++-
arch/x86/kernel/ftrace.c | 70 +----------------------------------
arch/x86/kernel/jump_label.c | 19 +--------
arch/x86/kernel/module.c | 3 +
arch/x86/kernel/setup.c | 3 +
include/linux/jump_label.h | 9 ++++
kernel/jump_label.c | 33 ++++++++++++++++
kernel/trace/ftrace.c | 13 +------
10 files changed, 140 insertions(+), 106 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 03b6bb5..54c4952 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -4,6 +4,7 @@
#include <linux/types.h>
#include <linux/stddef.h>
#include <linux/stringify.h>
+#include <linux/jump_label.h>
#include <asm/asm.h>

/*
@@ -159,6 +160,8 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
#define __parainstructions_end NULL
#endif

+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
/*
* Clear and restore the kernel write-protection flag on the local CPU.
* Allows the kernel to edit read-only pages.
@@ -179,4 +182,15 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
extern void *text_poke(void *addr, const void *opcode, size_t len);
extern void *text_poke_smp(void *addr, const void *opcode, size_t len);

+#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
+#define IDEAL_NOP_SIZE_5 5
+extern unsigned char ideal_nop5[IDEAL_NOP_SIZE_5];
+extern int arch_init_ideal_nop5(void);
+#else
+static inline int arch_init_ideal_nop5(void)
+{
+ return 0;
+}
+#endif
+
#endif /* _ASM_X86_ALTERNATIVE_H */
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index f8e3f82..d911d36 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -8,17 +8,13 @@

#define JUMP_LABEL_NOP_SIZE 5

-#ifdef CONFIG_X86_64
-# define JUMP_LABEL_NOP P6_NOP5
-#else
-# define JUMP_LABEL_NOP ".byte 0xe9 \n\t .long 0\n\t"
-#endif
+# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"

-#define JUMP_LABEL(tag, label, cond) \
+# define JUMP_LABEL(tag, label, cond) \
do { \
extern const char __jlstrtab_##tag[]; \
asm goto("1:" \
- JUMP_LABEL_NOP \
+ JUMP_LABEL_INITIAL_NOP \
".pushsection __jump_table, \"a\" \n\t"\
_ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
".popsection \n\t" \
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7023773..2e44df7 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -195,7 +195,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len)

extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
extern s32 __smp_locks[], __smp_locks_end[];
-static void *text_poke_early(void *addr, const void *opcode, size_t len);
+void *text_poke_early(void *addr, const void *opcode, size_t len);

/* Replace instructions with better alternatives for this CPU type.
This runs before SMP is initialized to avoid SMP problems with
@@ -521,7 +521,7 @@ void __init alternative_instructions(void)
* instructions. And on the local CPU you need to be protected again NMI or MCE
* handlers seeing an inconsistent instruction while you patch.
*/
-static void *__init_or_module text_poke_early(void *addr, const void *opcode,
+void *__init_or_module text_poke_early(void *addr, const void *opcode,
size_t len)
{
unsigned long flags;
@@ -640,3 +640,71 @@ void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
return addr;
}

+#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
+
+unsigned char ideal_nop5[IDEAL_NOP_SIZE_5];
+
+int __init arch_init_ideal_nop5(void)
+{
+ extern const unsigned char ftrace_test_p6nop[];
+ extern const unsigned char ftrace_test_nop5[];
+ extern const unsigned char ftrace_test_jmp[];
+ int faulted = 0;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ /*
+ * There is no good nop for all x86 archs.
+ * We will default to using the P6_NOP5, but first we
+ * will test to make sure that the nop will actually
+ * work on this CPU. If it faults, we will then
+ * go to a lesser efficient 5 byte nop. If that fails
+ * we then just use a jmp as our nop. This isn't the most
+ * efficient nop, but we can not use a multi part nop
+ * since we would then risk being preempted in the middle
+ * of that nop, and if we enabled tracing then, it might
+ * cause a system crash.
+ *
+ * TODO: check the cpuid to determine the best nop.
+ */
+ asm volatile (
+ "ftrace_test_jmp:"
+ "jmp ftrace_test_p6nop\n"
+ "nop\n"
+ "nop\n"
+ "nop\n" /* 2 byte jmp + 3 bytes */
+ "ftrace_test_p6nop:"
+ P6_NOP5
+ "jmp 1f\n"
+ "ftrace_test_nop5:"
+ ".byte 0x66,0x66,0x66,0x66,0x90\n"
+ "1:"
+ ".section .fixup, \"ax\"\n"
+ "2: movl $1, %0\n"
+ " jmp ftrace_test_nop5\n"
+ "3: movl $2, %0\n"
+ " jmp 1b\n"
+ ".previous\n"
+ _ASM_EXTABLE(ftrace_test_p6nop, 2b)
+ _ASM_EXTABLE(ftrace_test_nop5, 3b)
+ : "=r"(faulted) : "0" (faulted));
+
+ switch (faulted) {
+ case 0:
+ pr_info("converting mcount calls to 0f 1f 44 00 00\n");
+ memcpy(ideal_nop5, ftrace_test_p6nop, IDEAL_NOP_SIZE_5);
+ break;
+ case 1:
+ pr_info("converting mcount calls to 66 66 66 66 90\n");
+ memcpy(ideal_nop5, ftrace_test_nop5, IDEAL_NOP_SIZE_5);
+ break;
+ case 2:
+ pr_info("converting mcount calls to jmp . + 5\n");
+ memcpy(ideal_nop5, ftrace_test_jmp, IDEAL_NOP_SIZE_5);
+ break;
+ }
+
+ local_irq_restore(flags);
+ return 0;
+}
+#endif
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index cd37469..ba2e0d9 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -257,14 +257,9 @@ do_ftrace_mod_code(unsigned long ip, void *new_code)
return mod_code_status;
}

-
-
-
-static unsigned char ftrace_nop[MCOUNT_INSN_SIZE];
-
static unsigned char *ftrace_nop_replace(void)
{
- return ftrace_nop;
+ return ideal_nop5;
}

static int
@@ -336,69 +331,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
return ret;
}

-int __init ftrace_dyn_arch_init(void *data)
-{
- extern const unsigned char ftrace_test_p6nop[];
- extern const unsigned char ftrace_test_nop5[];
- extern const unsigned char ftrace_test_jmp[];
- int faulted = 0;
-
- /*
- * There is no good nop for all x86 archs.
- * We will default to using the P6_NOP5, but first we
- * will test to make sure that the nop will actually
- * work on this CPU. If it faults, we will then
- * go to a lesser efficient 5 byte nop. If that fails
- * we then just use a jmp as our nop. This isn't the most
- * efficient nop, but we can not use a multi part nop
- * since we would then risk being preempted in the middle
- * of that nop, and if we enabled tracing then, it might
- * cause a system crash.
- *
- * TODO: check the cpuid to determine the best nop.
- */
- asm volatile (
- "ftrace_test_jmp:"
- "jmp ftrace_test_p6nop\n"
- "nop\n"
- "nop\n"
- "nop\n" /* 2 byte jmp + 3 bytes */
- "ftrace_test_p6nop:"
- P6_NOP5
- "jmp 1f\n"
- "ftrace_test_nop5:"
- ".byte 0x66,0x66,0x66,0x66,0x90\n"
- "1:"
- ".section .fixup, \"ax\"\n"
- "2: movl $1, %0\n"
- " jmp ftrace_test_nop5\n"
- "3: movl $2, %0\n"
- " jmp 1b\n"
- ".previous\n"
- _ASM_EXTABLE(ftrace_test_p6nop, 2b)
- _ASM_EXTABLE(ftrace_test_nop5, 3b)
- : "=r"(faulted) : "0" (faulted));
-
- switch (faulted) {
- case 0:
- pr_info("converting mcount calls to 0f 1f 44 00 00\n");
- memcpy(ftrace_nop, ftrace_test_p6nop, MCOUNT_INSN_SIZE);
- break;
- case 1:
- pr_info("converting mcount calls to 66 66 66 66 90\n");
- memcpy(ftrace_nop, ftrace_test_nop5, MCOUNT_INSN_SIZE);
- break;
- case 2:
- pr_info("converting mcount calls to jmp . + 5\n");
- memcpy(ftrace_nop, ftrace_test_jmp, MCOUNT_INSN_SIZE);
- break;
- }
-
- /* The return code is retured via data */
- *(unsigned long *)data = 0;
-
- return 0;
-}
#endif

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index c347431..2db569c 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -30,19 +30,8 @@ void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type ty
if (type == JUMP_LABEL_ENABLE) {
code.jump = 0xe9;
code.offset = entry->target - (entry->code + JUMP_LABEL_NOP_SIZE);
- } else {
-#ifdef CONFIG_X86_64
- /* opcode for P6_NOP5 */
- code.code[0] = 0x0f;
- code.code[1] = 0x1f;
- code.code[2] = 0x44;
- code.code[3] = 0x00;
- code.code[4] = 0x00;
-#else
- code.jump = 0xe9;
- code.offset = 0;
-#endif
- }
+ } else
+ memcpy(&code, ideal_nop5, JUMP_LABEL_NOP_SIZE);
get_online_cpus();
mutex_lock(&text_mutex);
text_poke_smp((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
@@ -50,11 +39,9 @@ void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type ty
put_online_cpus();
}

-static const u8 jump_label_nop[JUMP_LABEL_NOP_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
-
const u8 *arch_get_jump_label_nop()
{
- return jump_label_nop;
+ return ideal_nop5;
}

#endif
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index e0bc186..3d964dc 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -239,6 +239,9 @@ int module_finalize(const Elf_Ehdr *hdr,
apply_paravirt(pseg, pseg + para->sh_size);
}

+ /* make jump label nops */
+ apply_jump_label_nops(me);
+
return module_bug_finalize(hdr, sechdrs, me);
}

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index b4ae4ac..ecd0de5 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -49,6 +49,7 @@
#include <asm/pci-direct.h>
#include <linux/init_ohci1394_dma.h>
#include <linux/kvm_para.h>
+#include <linux/jump_label.h>

#include <linux/errno.h>
#include <linux/kernel.h>
@@ -1063,6 +1064,8 @@ void __init setup_arch(char **cmdline_p)
x86_init.oem.banner();

mcheck_init();
+
+ arch_init_ideal_nop5();
}

#ifdef CONFIG_X86_32
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index a84474c..dc81a45 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -11,6 +11,8 @@ enum jump_label_type {
JUMP_LABEL_DISABLE
};

+struct module;
+
#ifdef HAVE_JUMP_LABEL

extern struct jump_entry __start___jump_table[];
@@ -27,6 +29,8 @@ extern const u8 *arch_get_jump_label_nop(void);

extern void jump_label_update(const char *name, enum jump_label_type type);

+extern void apply_jump_label_nops(struct module *mod);
+
#define enable_jump_label(name) \
jump_label_update(name, JUMP_LABEL_ENABLE);

@@ -53,6 +57,11 @@ static inline int disable_jump_label(const char *name)
return 0;
}

+static inline int apply_jump_label_nops(struct module *mod)
+{
+ return 0;
+}
+
#endif

#endif
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 59316f0..3ebe2bf 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -184,10 +184,19 @@ void jump_label_update(const char *name, enum jump_label_type type)
static int init_jump_label(void)
{
int ret;
+ struct jump_entry *iter_start = __start___jump_table;
+ struct jump_entry *iter_stop = __stop___jump_table;
+ struct jump_entry *iter;

mutex_lock(&jump_label_mutex);
ret = build_jump_label_hashtable(__start___jump_table,
__stop___jump_table);
+ iter = iter_start;
+ while (iter < iter_stop) {
+ text_poke_early((void *)iter->code, arch_get_jump_label_nop(),
+ JUMP_LABEL_NOP_SIZE);
+ iter++;
+ }
mutex_unlock(&jump_label_mutex);
return ret;
}
@@ -298,6 +307,30 @@ static int jump_label_module_notify(struct notifier_block *self, unsigned long v
return ret;
}

+/***
+ * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop()
+ * @mod: module to patch
+ *
+ * Allow for run-time selection of the optimal nops. Before the module
+ * loads patch these with arch_get_jump_label_nop(), which is specified by
+ * the arch specific jump label code.
+ */
+void apply_jump_label_nops(struct module *mod)
+{
+ struct jump_entry *iter;
+
+ /* if the module doesn't have jump label entries, just return */
+ if (!mod->num_jump_entries)
+ return;
+
+ iter = mod->jump_entries;
+ while (iter < mod->jump_entries + mod->num_jump_entries) {
+ text_poke_early((void *)iter->code, arch_get_jump_label_nop(),
+ JUMP_LABEL_NOP_SIZE);
+ iter++;
+ }
+}
+
struct notifier_block jump_label_module_nb = {
.notifier_call = jump_label_module_notify,
.priority = 0,
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 0d88ce9..1a1cc47 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2749,20 +2749,9 @@ extern unsigned long __stop_mcount_loc[];

void __init ftrace_init(void)
{
- unsigned long count, addr, flags;
+ unsigned long count;
int ret;

- /* Keep the ftrace pointer to the stub */
- addr = (unsigned long)ftrace_stub;
-
- local_irq_save(flags);
- ftrace_dyn_arch_init(&addr);
- local_irq_restore(flags);
-
- /* ftrace_dyn_arch_init places the return code in addr */
- if (addr)
- goto failed;
-
count = __stop_mcount_loc - __start_mcount_loc;

ret = ftrace_dyn_table_alloc(count);
--
1.7.1

2010-06-09 21:41:39

by Jason Baron

Subject: [PATCH 03/13] jump label v9: x86 support

Add x86 support for jump label. I'm keeping this patch separate so it's clear
to arch maintainers what is required for x86 to support this new feature.
Hopefully, it won't be too painful for other arches.

Signed-off-by: Jason Baron <[email protected]>
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/jump_label.h | 48 +++++++++++++++++++++++++++++
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/jump_label.c | 60 +++++++++++++++++++++++++++++++++++++
4 files changed, 110 insertions(+), 1 deletions(-)
create mode 100644 arch/x86/include/asm/jump_label.h
create mode 100644 arch/x86/kernel/jump_label.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6f77afa..4ac6115 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,6 +59,7 @@ config X86
select ANON_INODES
select HAVE_ARCH_KMEMCHECK
select HAVE_USER_RETURN_NOTIFIER
+ select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE

config INSTRUCTION_DECODER
def_bool (KPROBES || PERF_EVENTS)
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
new file mode 100644
index 0000000..f8e3f82
--- /dev/null
+++ b/arch/x86/include/asm/jump_label.h
@@ -0,0 +1,48 @@
+#ifndef _ASM_X86_JUMP_LABEL_H
+#define _ASM_X86_JUMP_LABEL_H
+
+#ifdef __KERNEL__
+
+#include <linux/types.h>
+#include <asm/nops.h>
+
+#define JUMP_LABEL_NOP_SIZE 5
+
+#ifdef CONFIG_X86_64
+# define JUMP_LABEL_NOP P6_NOP5
+#else
+# define JUMP_LABEL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+#endif
+
+#define JUMP_LABEL(tag, label, cond) \
+ do { \
+ extern const char __jlstrtab_##tag[]; \
+ asm goto("1:" \
+ JUMP_LABEL_NOP \
+ ".pushsection __jump_table, \"a\" \n\t"\
+ _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
+ ".popsection \n\t" \
+ : : "i" (__jlstrtab_##tag) : : label);\
+ } while (0)
+
+#endif /* __KERNEL__ */
+
+#ifdef CONFIG_X86_64
+
+struct jump_entry {
+ __u64 code;
+ __u64 target;
+ __u64 name;
+};
+
+#else
+
+struct jump_entry {
+ __u32 code;
+ __u32 target;
+ __u32 name;
+};
+
+#endif
+
+#endif
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index e77b220..44f6bea 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -32,7 +32,7 @@ GCOV_PROFILE_paravirt.o := n
obj-y := process_$(BITS).o signal.o entry_$(BITS).o
obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
obj-y += time.o ioport.o ldt.o dumpstack.o
-obj-y += setup.o x86_init.o i8259.o irqinit.o
+obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o
obj-$(CONFIG_X86_VISWS) += visws_quirks.o
obj-$(CONFIG_X86_32) += probe_roms_32.o
obj-$(CONFIG_X86_32) += sys_i386_32.o i386_ksyms_32.o
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
new file mode 100644
index 0000000..c347431
--- /dev/null
+++ b/arch/x86/kernel/jump_label.c
@@ -0,0 +1,60 @@
+/*
+ * jump label x86 support
+ *
+ * Copyright (C) 2009 Jason Baron <[email protected]>
+ *
+ */
+#include <linux/jump_label.h>
+#include <linux/memory.h>
+#include <linux/uaccess.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/jhash.h>
+#include <linux/cpu.h>
+#include <asm/kprobes.h>
+
+#ifdef HAVE_JUMP_LABEL
+
+union jump_code_union {
+ char code[JUMP_LABEL_NOP_SIZE];
+ struct {
+ char jump;
+ int offset;
+ } __attribute__((packed));
+};
+
+void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type)
+{
+ union jump_code_union code;
+
+ if (type == JUMP_LABEL_ENABLE) {
+ code.jump = 0xe9;
+ code.offset = entry->target - (entry->code + JUMP_LABEL_NOP_SIZE);
+ } else {
+#ifdef CONFIG_X86_64
+ /* opcode for P6_NOP5 */
+ code.code[0] = 0x0f;
+ code.code[1] = 0x1f;
+ code.code[2] = 0x44;
+ code.code[3] = 0x00;
+ code.code[4] = 0x00;
+#else
+ code.jump = 0xe9;
+ code.offset = 0;
+#endif
+ }
+ get_online_cpus();
+ mutex_lock(&text_mutex);
+ text_poke_smp((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
+ mutex_unlock(&text_mutex);
+ put_online_cpus();
+}
+
+static const u8 jump_label_nop[JUMP_LABEL_NOP_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
+
+const u8 *arch_get_jump_label_nop(void)
+{
+ return jump_label_nop;
+}
+
+#endif
--
1.7.1

2010-06-09 21:41:52

by Jason Baron

[permalink] [raw]
Subject: [PATCH 05/13] jump label v9: add module support

Add support for 'jump label' for modules.

Signed-off-by: Jason Baron <[email protected]>
---
arch/x86/kernel/ptrace.c | 1 +
include/linux/jump_label.h | 3 +-
include/linux/module.h | 5 +-
kernel/jump_label.c | 136 ++++++++++++++++++++++++++++++++++++++++++++
kernel/module.c | 7 ++
5 files changed, 150 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 70c4872..8b83cf7 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,6 +21,7 @@
#include <linux/signal.h>
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
+#include <linux/module.h>

#include <asm/uaccess.h>
#include <asm/pgtable.h>
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index e631401..a84474c 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -18,7 +18,8 @@ extern struct jump_entry __stop___jump_table[];

#define DEFINE_JUMP_LABEL(name) \
const char __jlstrtab_##name[] \
- __used __attribute__((section("__jump_strings"))) = #name;
+ __used __attribute__((section("__jump_strings"))) = #name; \
+ EXPORT_SYMBOL_GPL(__jlstrtab_##name);

extern void arch_jump_label_transform(struct jump_entry *entry,
enum jump_label_type type);
diff --git a/include/linux/module.h b/include/linux/module.h
index 8a6b9fd..403ac26 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -350,7 +350,10 @@ struct module
struct tracepoint *tracepoints;
unsigned int num_tracepoints;
#endif
-
+#ifdef HAVE_JUMP_LABEL
+ struct jump_entry *jump_entries;
+ unsigned int num_jump_entries;
+#endif
#ifdef CONFIG_TRACING
const char **trace_bprintk_fmt_start;
unsigned int num_trace_bprintk_fmt;
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 8e76e45..59316f0 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -30,6 +30,13 @@ struct jump_label_entry {
const char *name;
};

+struct jump_label_module_entry {
+ struct hlist_node hlist;
+ struct jump_entry *table;
+ int nr_entries;
+ struct module *mod;
+};
+
static void swap_jump_label_entries(struct jump_entry *previous, struct jump_entry *next)
{
struct jump_entry tmp;
@@ -159,6 +166,17 @@ void jump_label_update(const char *name, enum jump_label_type type)
arch_jump_label_transform(iter, type);
iter++;
}
+ /* enable/disable jump labels in modules */
+ hlist_for_each_entry(e_module, module_node, &(entry->modules),
+ hlist) {
+ count = e_module->nr_entries;
+ iter = e_module->table;
+ while (count--) {
+ if (kernel_text_address(iter->code))
+ arch_jump_label_transform(iter, type);
+ iter++;
+ }
+ }
}
mutex_unlock(&jump_label_mutex);
}
@@ -175,4 +193,122 @@ static int init_jump_label(void)
}
early_initcall(init_jump_label);

+#ifdef CONFIG_MODULES
+
+static struct jump_label_module_entry *add_jump_label_module_entry(struct jump_label_entry *entry, struct jump_entry *iter_begin, int count, struct module *mod)
+{
+ struct jump_label_module_entry *e;
+
+ e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL);
+ if (!e)
+ return ERR_PTR(-ENOMEM);
+ e->mod = mod;
+ e->nr_entries = count;
+ e->table = iter_begin;
+ hlist_add_head(&e->hlist, &entry->modules);
+ return e;
+}
+
+static int add_jump_label_module(struct module *mod)
+{
+ struct jump_entry *iter, *iter_begin;
+ struct jump_label_entry *entry;
+ struct jump_label_module_entry *module_entry;
+ int count;
+
+ /* if the module doesn't have jump label entries, just return */
+ if (!mod->num_jump_entries)
+ return 0;
+
+ sort_jump_label_entries(mod->jump_entries,
+ mod->jump_entries + mod->num_jump_entries);
+ iter = mod->jump_entries;
+ while (iter < mod->jump_entries + mod->num_jump_entries) {
+ entry = get_jump_label_entry((char *)iter->name);
+ iter_begin = iter;
+ count = 0;
+ while ((iter < mod->jump_entries + mod->num_jump_entries) &&
+ (strcmp((char *)iter->name, (char *)iter_begin->name) == 0)) {
+ iter++;
+ count++;
+ }
+ if (!entry) {
+ entry = add_jump_label_entry((char *)iter_begin->name, 0, NULL);
+ if (IS_ERR(entry))
+ return PTR_ERR(entry);
+ }
+ module_entry = add_jump_label_module_entry(entry, iter_begin,
+ count, mod);
+ if (IS_ERR(module_entry))
+ return PTR_ERR(module_entry);
+ }
+ return 0;
+}
+
+static void remove_jump_label_module(struct module *mod)
+{
+ struct hlist_head *head;
+ struct hlist_node *node, *node_next, *module_node, *module_node_next;
+ struct jump_label_entry *e;
+ struct jump_label_module_entry *e_module;
+ int i;
+
+ /* if the module doesn't have jump label entries, just return */
+ if (!mod->num_jump_entries)
+ return;
+
+ for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
+ head = &jump_label_table[i];
+ hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
+ hlist_for_each_entry_safe(e_module, module_node,
+ module_node_next,
+ &(e->modules), hlist) {
+ if (e_module->mod == mod) {
+ hlist_del(&e_module->hlist);
+ kfree(e_module);
+ }
+ }
+ if (hlist_empty(&e->modules) && (e->nr_entries == 0)) {
+ hlist_del(&e->hlist);
+ kfree(e);
+ }
+ }
+ }
+}
+
+static int jump_label_module_notify(struct notifier_block *self, unsigned long val, void *data)
+{
+ struct module *mod = data;
+ int ret = 0;
+
+ switch (val) {
+ case MODULE_STATE_COMING:
+ mutex_lock(&jump_label_mutex);
+ ret = add_jump_label_module(mod);
+ if (ret)
+ remove_jump_label_module(mod);
+ mutex_unlock(&jump_label_mutex);
+ break;
+ case MODULE_STATE_GOING:
+ mutex_lock(&jump_label_mutex);
+ remove_jump_label_module(mod);
+ mutex_unlock(&jump_label_mutex);
+ break;
+ }
+ return ret;
+}
+
+struct notifier_block jump_label_module_nb = {
+ .notifier_call = jump_label_module_notify,
+ .priority = 0,
+};
+
+static int init_jump_label_module(void)
+{
+ return register_module_notifier(&jump_label_module_nb);
+}
+early_initcall(init_jump_label_module);
+
+#endif /* CONFIG_MODULES */
+
#endif
diff --git a/kernel/module.c b/kernel/module.c
index 8c6b428..c31cfa9 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -55,6 +55,7 @@
#include <linux/async.h>
#include <linux/percpu.h>
#include <linux/kmemleak.h>
+#include <linux/jump_label.h>

#define CREATE_TRACE_POINTS
#include <trace/events/module.h>
@@ -2405,6 +2406,12 @@ static noinline struct module *load_module(void __user *umod,
sizeof(*mod->tracepoints),
&mod->num_tracepoints);
#endif
+#ifdef HAVE_JUMP_LABEL
+ mod->jump_entries = section_objs(hdr, sechdrs, secstrings,
+ "__jump_table",
+ sizeof(*mod->jump_entries),
+ &mod->num_jump_entries);
+#endif
#ifdef CONFIG_EVENT_TRACING
mod->trace_events = section_objs(hdr, sechdrs, secstrings,
"_ftrace_events",
--
1.7.1

2010-06-09 21:39:30

by Jason Baron

[permalink] [raw]
Subject: [PATCH 02/13] jump label v9: base patch

base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statement. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current use case these are being implemented for.

Signed-off-by: Jason Baron <[email protected]>
---
Makefile | 5 +
arch/Kconfig | 3 +
include/asm-generic/vmlinux.lds.h | 10 ++-
include/linux/jump_label.h | 57 ++++++++++++
kernel/Makefile | 2 +-
kernel/jump_label.c | 178 +++++++++++++++++++++++++++++++++++++
scripts/gcc-goto.sh | 5 +
7 files changed, 258 insertions(+), 2 deletions(-)
create mode 100644 include/linux/jump_label.h
create mode 100644 kernel/jump_label.c
create mode 100644 scripts/gcc-goto.sh

diff --git a/Makefile b/Makefile
index 182556b..bf9955d 100644
--- a/Makefile
+++ b/Makefile
@@ -582,6 +582,11 @@ KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
# conserve stack if available
KBUILD_CFLAGS += $(call cc-option,-fconserve-stack)

+# check for 'asm goto'
+ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC)), y)
+ KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
+endif
+
# Add user supplied CPPFLAGS, AFLAGS and CFLAGS as the last assignments
# But warn user when we do so
warn-assign = \
diff --git a/arch/Kconfig b/arch/Kconfig
index 4877a8c..1462d84 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -158,4 +158,7 @@ config HAVE_PERF_EVENTS_NMI
subsystem. Also has support for calculating CPU cycle events
to determine how many clock cycles in a given period.

+config HAVE_ARCH_JUMP_LABEL
+ bool
+
source "kernel/gcov/Kconfig"
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 48c5299..f6137ba 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -167,7 +167,8 @@
BRANCH_PROFILE() \
TRACE_PRINTKS() \
FTRACE_EVENTS() \
- TRACE_SYSCALLS()
+ TRACE_SYSCALLS() \
+ JUMP_TABLE() \

/*
* Data section helpers
@@ -206,6 +207,7 @@
*(__vermagic) /* Kernel version magic */ \
*(__markers_strings) /* Markers: strings */ \
*(__tracepoints_strings)/* Tracepoints: strings */ \
+ *(__jump_strings)/* Jump: strings */ \
} \
\
.rodata1 : AT(ADDR(.rodata1) - LOAD_OFFSET) { \
@@ -557,6 +559,12 @@
#define BUG_TABLE
#endif

+#define JUMP_TABLE() \
+ . = ALIGN(64); \
+ VMLINUX_SYMBOL(__start___jump_table) = .; \
+ *(__jump_table) \
+ VMLINUX_SYMBOL(__stop___jump_table) = .; \
+
#ifdef CONFIG_PM_TRACE
#define TRACEDATA \
. = ALIGN(4); \
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
new file mode 100644
index 0000000..e631401
--- /dev/null
+++ b/include/linux/jump_label.h
@@ -0,0 +1,57 @@
+#ifndef _LINUX_JUMP_LABEL_H
+#define _LINUX_JUMP_LABEL_H
+
+#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_HAVE_ARCH_JUMP_LABEL)
+# include <asm/jump_label.h>
+# define HAVE_JUMP_LABEL
+#endif
+
+enum jump_label_type {
+ JUMP_LABEL_ENABLE,
+ JUMP_LABEL_DISABLE
+};
+
+#ifdef HAVE_JUMP_LABEL
+
+extern struct jump_entry __start___jump_table[];
+extern struct jump_entry __stop___jump_table[];
+
+#define DEFINE_JUMP_LABEL(name) \
+ const char __jlstrtab_##name[] \
+ __used __attribute__((section("__jump_strings"))) = #name;
+
+extern void arch_jump_label_transform(struct jump_entry *entry,
+ enum jump_label_type type);
+extern const u8 *arch_get_jump_label_nop(void);
+
+extern void jump_label_update(const char *name, enum jump_label_type type);
+
+#define enable_jump_label(name) \
+ jump_label_update(name, JUMP_LABEL_ENABLE);
+
+#define disable_jump_label(name) \
+ jump_label_update(name, JUMP_LABEL_DISABLE);
+
+#else
+
+#define DEFINE_JUMP_LABEL(name)
+
+#define JUMP_LABEL(tag, label, cond) \
+do { \
+ if (unlikely(cond)) \
+ goto label; \
+} while (0)
+
+static inline int enable_jump_label(const char *name)
+{
+ return 0;
+}
+
+static inline int disable_jump_label(const char *name)
+{
+ return 0;
+}
+
+#endif
+
+#endif
diff --git a/kernel/Makefile b/kernel/Makefile
index ce53fb2..662c96d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y = sched.o fork.o exec_domain.o panic.o printk.o \
kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
notifier.o ksysfs.o pm_qos_params.o sched_clock.o cred.o \
- async.o range.o
+ async.o range.o jump_label.o
obj-$(CONFIG_HAVE_EARLY_RES) += early_res.o
obj-y += groups.o

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
new file mode 100644
index 0000000..8e76e45
--- /dev/null
+++ b/kernel/jump_label.c
@@ -0,0 +1,178 @@
+/*
+ * jump label support
+ *
+ * Copyright (C) 2009 Jason Baron <[email protected]>
+ *
+ */
+#include <linux/jump_label.h>
+#include <linux/memory.h>
+#include <linux/uaccess.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+
+#ifdef HAVE_JUMP_LABEL
+
+#define JUMP_LABEL_HASH_BITS 6
+#define JUMP_LABEL_TABLE_SIZE (1 << JUMP_LABEL_HASH_BITS)
+static struct hlist_head jump_label_table[JUMP_LABEL_TABLE_SIZE];
+
+/* mutex to protect coming/going of the jump_label table */
+static DEFINE_MUTEX(jump_label_mutex);
+
+struct jump_label_entry {
+ struct hlist_node hlist;
+ struct jump_entry *table;
+ int nr_entries;
+ /* hang modules off here */
+ struct hlist_head modules;
+ const char *name;
+};
+
+static void swap_jump_label_entries(struct jump_entry *previous, struct jump_entry *next)
+{
+ struct jump_entry tmp;
+
+ tmp = *next;
+ *next = *previous;
+ *previous = tmp;
+}
+
+static void sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
+{
+ int swapped = 0;
+ struct jump_entry *iter;
+ struct jump_entry *iter_next;
+
+ do {
+ swapped = 0;
+ iter = start;
+ iter_next = start;
+ iter_next++;
+ for (; iter_next < stop; iter++, iter_next++) {
+ if (strcmp((char *)iter->name,
+ (char *)iter_next->name) > 0) {
+ swap_jump_label_entries(iter, iter_next);
+ swapped = 1;
+ }
+ }
+ } while (swapped == 1);
+}
+
+static struct jump_label_entry *get_jump_label_entry(const char *name)
+{
+ struct hlist_head *head;
+ struct hlist_node *node;
+ struct jump_label_entry *e;
+ u32 hash = jhash(name, strlen(name), 0);
+
+ head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
+ hlist_for_each_entry(e, node, head, hlist) {
+ if (!strcmp((char *)name, (char *)e->name))
+ return e;
+ }
+ return NULL;
+}
+
+static struct jump_label_entry *add_jump_label_entry(const char *name, int nr_entries, struct jump_entry *table)
+{
+ struct hlist_head *head;
+ struct jump_label_entry *e;
+ size_t name_len;
+ u32 hash;
+
+ e = get_jump_label_entry(name);
+ if (e)
+ return ERR_PTR(-EEXIST);
+
+ e = kmalloc(sizeof(struct jump_label_entry), GFP_KERNEL);
+ if (!e)
+ return ERR_PTR(-ENOMEM);
+
+ name_len = strlen(name) + 1;
+ hash = jhash(name, name_len-1, 0);
+ head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
+ e->name = name;
+ e->table = table;
+ e->nr_entries = nr_entries;
+ INIT_HLIST_HEAD(&(e->modules));
+ hlist_add_head(&e->hlist, head);
+ return e;
+}
+
+static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
+{
+ struct jump_entry *iter, *iter_begin;
+ struct jump_label_entry *entry;
+ int count;
+
+ sort_jump_label_entries(start, stop);
+ iter = start;
+ while (iter < stop) {
+ entry = get_jump_label_entry((char *)iter->name);
+ if (!entry) {
+ iter_begin = iter;
+ count = 0;
+ while ((iter < stop) &&
+ (strcmp((char *)iter->name,
+ (char *)iter_begin->name) == 0)) {
+ iter++;
+ count++;
+ }
+ entry = add_jump_label_entry((char *)iter_begin->name,
+ count, iter_begin);
+ if (IS_ERR(entry))
+ return PTR_ERR(entry);
+ continue;
+ }
+ WARN(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");
+ }
+ return 0;
+}
+
+/***
+ * jump_label_update - update jump label text
+ * @name - name of the jump label
+ * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE
+ *
+ * Will enable/disable the jump for jump label @name, depending on the
+ * value of @type.
+ *
+ */
+
+void jump_label_update(const char *name, enum jump_label_type type)
+{
+ struct jump_entry *iter;
+ struct jump_label_entry *entry;
+ struct hlist_node *module_node;
+ struct jump_label_module_entry *e_module;
+ int count;
+
+ mutex_lock(&jump_label_mutex);
+ entry = get_jump_label_entry(name);
+ if (entry) {
+ count = entry->nr_entries;
+ iter = entry->table;
+ while (count--) {
+ if (kernel_text_address(iter->code))
+ arch_jump_label_transform(iter, type);
+ iter++;
+ }
+ }
+ mutex_unlock(&jump_label_mutex);
+}
+
+static int init_jump_label(void)
+{
+ int ret;
+
+ mutex_lock(&jump_label_mutex);
+ ret = build_jump_label_hashtable(__start___jump_table,
+ __stop___jump_table);
+ mutex_unlock(&jump_label_mutex);
+ return ret;
+}
+early_initcall(init_jump_label);
+
+#endif
diff --git a/scripts/gcc-goto.sh b/scripts/gcc-goto.sh
new file mode 100644
index 0000000..0bd0b04
--- /dev/null
+++ b/scripts/gcc-goto.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+# Test for gcc 'asm goto' support
+# Copyright (C) 2010, Jason Baron <[email protected]>
+
+echo "int main(void) { entry: asm goto (\"\"::::entry); return 0; }" | $1 -x c - -o /dev/null >/dev/null 2>&1 && echo "y"
--
1.7.1
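The one-line probe above can also be exercised outside kbuild. This is a hedged sketch of the same check; the `${CC:-cc}` fallback is an assumption (kbuild passes `$(CC)` explicitly), and it prints `y` or `n` instead of `y` or nothing:

```shell
# Re-run the gcc-goto probe by hand against a chosen compiler.
CC="${CC:-cc}"
result=$(echo 'int main(void) { entry: asm goto (""::::entry); return 0; }' \
	| "$CC" -x c - -o /dev/null >/dev/null 2>&1 && echo y || echo n)
echo "$result"
```

A `y` here means the compiler accepts 'asm goto' and CC_HAVE_ASM_GOTO would be defined.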

2010-06-09 21:58:17

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 01/13] jump label v9: notifier atomic call chain notrace

On Wed, Jun 09, 2010 at 05:38:52PM -0400, Jason Baron wrote:
> From: Mathieu Desnoyers <[email protected]>
>
> In LTTng, being able to use the atomic notifier from cpu idle entry to
> ensure the tracer flush the last events in the current subbuffer
> requires the rcu read-side to be marked "notrace", otherwise it can end
> up calling back into lockdep and the tracer.
>
> Also apply to the die notifier.
>
> Signed-off-by: Mathieu Desnoyers <[email protected]>
> Signed-off-by: Jason Baron <[email protected]>
> Reviewed-by: Masami Hiramatsu <[email protected]>
> ---



I don't understand the purpose of this patch. The only notifier
I see that is used by jmp label is the module notifier, and I
don't see how it can do bad things if we don't use the notrace
version.

2010-06-09 22:35:52

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Wed, Jun 09, 2010 at 05:38:57PM -0400, Jason Baron wrote:
> +static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
> +{
> + struct jump_entry *iter, *iter_begin;
> + struct jump_label_entry *entry;
> + int count;
> +
> + sort_jump_label_entries(start, stop);
> + iter = start;
> + while (iter < stop) {
> + entry = get_jump_label_entry((char *)iter->name);
> + if (!entry) {
> + iter_begin = iter;
> + count = 0;
> + while ((iter < stop) &&
> + (strcmp((char *)iter->name,
> + (char *)iter_begin->name) == 0)) {
> + iter++;
> + count++;
> + }




So, you can have multiple entries with the same name? How can that happen
in fact?




> + entry = add_jump_label_entry((char *)iter_begin->name,
> + count, iter_begin);
> + if (IS_ERR(entry))
> + return PTR_ERR(entry);
> + continue;
> + }
> + WARN(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");



It seems you are going to endless loop in this fail case.



> + }
> + return 0;
> +}
> +
> +/***
> + * jump_label_update - update jump label text
> + * @name - name of the jump label
> + * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE
> + *
> + * Will enable/disable the jump for jump label @name, depending on the
> + * value of @type.
> + *
> + */
> +
> +void jump_label_update(const char *name, enum jump_label_type type)
> +{
> + struct jump_entry *iter;
> + struct jump_label_entry *entry;
> + struct hlist_node *module_node;
> + struct jump_label_module_entry *e_module;
> + int count;
> +
> + mutex_lock(&jump_label_mutex);
> + entry = get_jump_label_entry(name);
> + if (entry) {
> + count = entry->nr_entries;
> + iter = entry->table;
> + while (count--) {
> + if (kernel_text_address(iter->code))
> + arch_jump_label_transform(iter, type);
> + iter++;
> + }



So, this is going to patch multiple times the same value on the
same address in case you have multiple entries for the same name?

That look weird.

BTW, if you can't find the entry, you should perhaps propagate an error.



> + }
> + mutex_unlock(&jump_label_mutex);
> +}
> +
> +static int init_jump_label(void)



This can be __init.

2010-06-09 22:36:56

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Wed, Jun 09, 2010 at 05:38:57PM -0400, Jason Baron wrote:
> +#define JUMP_TABLE() \
> + . = ALIGN(64); \
> + VMLINUX_SYMBOL(__start___jump_table) = .; \
> + *(__jump_table) \
> + VMLINUX_SYMBOL(__stop___jump_table) = .; \
> +



Why does it need to be aligned to 64?

Thanks.

2010-06-10 12:07:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Wed, 2010-06-09 at 17:38 -0400, Jason Baron wrote:
> +static void sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
> +{
> + int swapped = 0;
> + struct jump_entry *iter;
> + struct jump_entry *iter_next;
> +
> + do {
> + swapped = 0;
> + iter = start;
> + iter_next = start;
> + iter_next++;
> + for (; iter_next < stop; iter++, iter_next++) {
> + if (strcmp((char *)iter->name,
> + (char *)iter_next->name) > 0) {
> + swap_jump_label_entries(iter, iter_next);
> + swapped = 1;
> + }
> + }
> + } while (swapped == 1);
> +}

Would lib/sort.c be of any help?

2010-06-10 12:12:35

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE

That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
useless...

2010-06-10 12:14:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support


* Peter Zijlstra <[email protected]> wrote:

> On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
>
> That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> useless...

Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
all the time, for the massive kernel image (and hotpath cache footprint)
savings. Is this fixable?

Thanks,

Ingo

2010-06-10 12:15:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> +#define JUMP_LABEL(tag, label, cond) \
> + do { \
> + extern const char __jlstrtab_##tag[]; \
> + asm goto("1:" \
> + JUMP_LABEL_NOP \
> + ".pushsection __jump_table, \"a\" \n\t"\
> + _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
> + ".popsection \n\t" \
> + : : "i" (__jlstrtab_##tag) : : label);\
> + } while (0)

Would it make sense to have that macro at least evaluate cond in some
way?

2010-06-10 12:19:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Wed, 2010-06-09 at 17:38 -0400, Jason Baron wrote:

> +#define DEFINE_JUMP_LABEL(name) \
> + const char __jlstrtab_##name[] \
> + __used __attribute__((section("__jump_strings"))) = #name;

> +#define JUMP_LABEL(tag, label, cond) \
> +do { \
> + if (unlikely(cond)) \
> + goto label; \
> +} while (0)

> +#define JUMP_LABEL(tag, label, cond) \
> + do { \
> + extern const char __jlstrtab_##tag[]; \
> + asm goto("1:" \
> + JUMP_LABEL_NOP \
> + ".pushsection __jump_table, \"a\" \n\t"\
> + _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
> + ".popsection \n\t" \
> + : : "i" (__jlstrtab_##tag) : : label);\
> + } while (0)


s/tag/name/ ?

2010-06-10 12:22:47

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 04/13] jump label v9: tracepoint support

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> + JUMP_LABEL(&__tracepoint_##name, do_trace, __tracepoint_##name.state);\


> +#define JUMP_LABEL(tag, label, cond) \
> + do { \
> + extern const char __jlstrtab_##tag[]; \


So that results in:

extern const char __jlstrtab_&__tracepoint_FOO[]; ?

2010-06-10 12:33:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> add x86 support for jump label. I'm keeping this patch separate so its clear to
> arch maintainers what was required for x86 support this new feature. hopefully,
> it wouldn't be too painful for other arches.

Well patch 10 really rather spoils that, doesn't it?

2010-06-10 12:38:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 10/13] jump label v9: convert jump label to use a key

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> -#define JUMP_LABEL(tag, label, cond) \
> +#define JUMP_LABEL(key, label) \
> do { \
> - if (unlikely(cond)) \
> + if (unlikely(*key)) \
> goto label; \
> } while (0)

s/key/cond_var/ or something like that?

2010-06-10 12:44:05

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 10/13] jump label v9: convert jump label to use a key

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> -static inline int enable_jump_label(const char *name)
> -{
> - return 0;
> -}
> -
> -static inline int disable_jump_label(const char *name)
> -{
> - return 0;
> -}
> +#define enable_jump_label(key)
> +#define disable_jump_label(key)

I would expect enable_jump_label() to look something like:

#define enable_jump_label(cond_var) \
do { \
*(cond_var) = 1; \
} while (0)

That way the HAVE_JUMP_LABEL and !HAVE_JUMP_LABEL code has similar
effects.

2010-06-10 12:49:23

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 13/13] jump label v9: add docs

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> +The optimization depends on !CC_OPTIMIZE_FOR_SIZE. When CC_OPTIMIZE_FOR_SIZE is
> +set, gcc does not always out of line the not taken label path in the same way
> +that the "if unlikely()" paths are made out of line. Thus, with
> +CC_OPTIMIZE_FOR_SIZE set, this optimization is not always optimal. This may be
> +solved in subsequent gcc versions, that allow us to move labels out of line,
> +while still optimizing for size. In the case of !CC_OPTIMIZE_FOR_SIZE this
> +optimization is seen on high level benchmarks such as tbench where I can get
> +~1-2% higher throughput. In addition we are within .5% of the performance of no
> +tracepoints compiled in at all.

But does it generate invalid code for whatever -f flag triggers this
(-fno-reorder-blocks ?)

2010-06-10 13:26:23

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, Jun 10, 2010 at 02:14:40PM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra <[email protected]> wrote:
>
> > On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> >
> > That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> > useless...
>
> Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> all the time, for the massive kernel image (and hotpath cache footprint)
> savings. Is this fixable?

Actually the big distros (RHEL, SLES) disable it all the time now. It costs you
in some benchmarks. The code generated by -Os is often terrible.

Nearly every time I investigate some bad asm code being generated by gcc
it goes away when that flag is disabled.

A much better way to get smaller kernel images is to do more __cold
annotations for slow paths. Newer gcc will then simply apply -Os only to
those functions.

It's already done for __init.

-Andi
--
[email protected] -- Speaking for myself only.

2010-06-10 13:58:14

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 10/13] jump label v9: convert jump label to use a key

On Thu, Jun 10, 2010 at 02:43:46PM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > -static inline int enable_jump_label(const char *name)
> > -{
> > - return 0;
> > -}
> > -
> > -static inline int disable_jump_label(const char *name)
> > -{
> > - return 0;
> > -}
> > +#define enable_jump_label(key)
> > +#define disable_jump_label(key)
>
> I would expect enable_jump_label() to look something like:
>
> #define enable_jump_label(cond_var) \
> do { \
> *(cond_var) = 1; \
> } while (0)
>
> That way the HAVE_JUMP_LABEL and !HAVE_JUMP_LABEL code has similar
> effects.

right. I was going to clean that up in a followup. But you are right - I
think it makes the code much clearer. will fix.

thanks,

-Jason

2010-06-10 14:12:35

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, 2010-06-10 at 15:26 +0200, Andi Kleen wrote:
> The code generated by -Os is often terrible.

Is anybody on the gcc side of things looking into curing that?

I mean, what's the point of having an -Os if it's useless in practice.

2010-06-10 14:28:28

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, Jun 10, 2010 at 04:12:11PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-10 at 15:26 +0200, Andi Kleen wrote:
> > The code generated by -Os is often terrible.
>
> Is anybody on the gcc side of things looking into curing that?

The problem is that the smallest code is often terrible.
You could often do much better by spending a few more bytes.
But -Os means "smallest".

On the other hand -Os could be likely made smaller
(it often still is not very good), but I fear that would
make things even worse.

We probably would need a -Osmall-but-not-terrible or so,
but that's not there.
>
> I mean, what's the point of having an -Os if it's useless in practice.

I think __hot / __cold but keeping the default at -O2 is a better
approach anyways. Hot paths should be -O2. It just needs some more work.

It already works for __init/__exit at least.

-Andi
--
[email protected] -- Speaking for myself only.

2010-06-10 14:46:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 10/13] jump label v9: convert jump label to use a key

On Thu, 2010-06-10 at 09:57 -0400, Jason Baron wrote:

> right. I was going to clean that up in a followup. But you are right - I
> think it makes the code much clearer. will fix.

Please fold this patch into the previous patches. That makes the series
much more readable.

2010-06-10 15:05:04

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, Jun 10, 2010 at 02:14:40PM +0200, Ingo Molnar wrote:
> * Peter Zijlstra <[email protected]> wrote:
>
> > On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> >
> > That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> > useless...
>
> Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> all the time, for the massive kernel image (and hotpath cache footprint)
> savings. Is this fixable?
>
> Thanks,
>
> Ingo
>

When I tested 'jump label' with CC_OPTIMIZE_FOR_SIZE, I saw a small
performance drop, because there is less block re-ordering happening. There
was a small gcc patch that Richard Henderson wrote to allow block
re-ordering while still taking size into account. However, it did
increase the text size somewhat over what the current optimize-for-size
does. This small patch combined with jump labels resulted in the
expected performance gains. However, to keep the size at what it is now,
a much more involved gcc patch would be required.

Thus, until this additional gcc optimization is done, I've posted this
as depending on !CC_OPTIMIZE_FOR_SIZE. If the gcc work is done, we can
lift this restriction. As mentioned, a number of large distros are
compiled with !CC_OPTIMIZE_FOR_SIZE (RHEL, SLES), so we should still get a lot of
benefit from this. Also, if people find these patches useful, we might
create some more impetus for the gcc work. So I see this as an iterative
thing.

thanks,

-Jason

2010-06-10 15:35:16

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 01/13] jump label v9: notifier atomic call chain notrace

On Wed, Jun 09, 2010 at 11:58:13PM +0200, Frederic Weisbecker wrote:
> On Wed, Jun 09, 2010 at 05:38:52PM -0400, Jason Baron wrote:
> > From: Mathieu Desnoyers <[email protected]>
> >
> > In LTTng, being able to use the atomic notifier from cpu idle entry to
> > ensure the tracer flush the last events in the current subbuffer
> > requires the rcu read-side to be marked "notrace", otherwise it can end
> > up calling back into lockdep and the tracer.
> >
> > Also apply to the die notifier.
> >
> > Signed-off-by: Mathieu Desnoyers <[email protected]>
> > Signed-off-by: Jason Baron <[email protected]>
> > Reviewed-by: Masami Hiramatsu <[email protected]>
> > ---
>
>
>
> I don't understand the purpose of this patch. The only notifier
> I see that is used by jmp label is the module notifier, and I
> don't see how it can do bad things if we don't use the notrace
> version.
>

hmmm...yeah, this patch was introduced because of the use of the 'int3'
notifier when updating the kernel text. We got into a recursive loop
without it. However, since the current implementation uses
stop_machine() to do text updates, I don't think this patch is necessary
until we have the more optimal text-updating mechanism in place. I'll
re-test without this patch.

thanks,

-Jason

2010-06-10 15:38:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support


* Andi Kleen <[email protected]> wrote:

> > Hm, we need more than a comment for that - distros enable
> > CC_OPTIMIZE_FOR_SIZE all the time, for the massive kernel image (and
> > hotpath cache footprint) savings. Is this fixable?
>
> Actually the big distros (RHEL, SLES) disable it all the time now. [...]

Not all the big distros, Fedora certainly doesn't.

> [...] It costs you in some benchmarks. [...]

Microbenchmarks mostly, see below.

> The code generated by -Os is often terrible.

But it results in a kernel .text that is ~30% smaller, so considering the fact
that most real-life kernel code execution is instruction-cache-cold it's
generally a win.

But micro-benchmarks are instruction-cache-hot so that's where the pressure to
remove the flag comes from.

> Nearly everytime I investigate some bad asm code being generated by gcc it
> goes away when that flag is disabled.

That is not fixing anything, it is working around bugs.

> A much better to get smaller kernel images is to do more __cold annotations
> for slow paths. Newer gcc will then simply only do -Os for these functions.

That's an opt-in method and we cannot reach the kinds of 30% code size
reductions that -Os can achieve. Most code in the kernel is not cache-hot,
even on microbenchmarks.

A much better model would be to actively mark hot codepaths with a __hot
attribute instead. Then the code size difference can be considered on a case
by case basis.

And where GCC produces indefensibly crap code there GCC needs to be fixed.
Crap code often increases size so the fix would increase the efficiency of
-Os.

Thanks,

Ingo

2010-06-10 15:45:33

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Thu, Jun 10, 2010 at 12:35:49AM +0200, Frederic Weisbecker wrote:
> On Wed, Jun 09, 2010 at 05:38:57PM -0400, Jason Baron wrote:
> > +static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
> > +{
> > + struct jump_entry *iter, *iter_begin;
> > + struct jump_label_entry *entry;
> > + int count;
> > +
> > + sort_jump_label_entries(start, stop);
> > + iter = start;
> > + while (iter < stop) {
> > + entry = get_jump_label_entry((char *)iter->name);
> > + if (!entry) {
> > + iter_begin = iter;
> > + count = 0;
> > + while ((iter < stop) &&
> > + (strcmp((char *)iter->name,
> > + (char *)iter_begin->name) == 0)) {
> > + iter++;
> > + count++;
> > + }
>
>
>
>
> So, you can have multiple entries with the same name? How can that happen
> in fact?
>
>

This is the case where a single tracepoint, such as kmalloc(), is used
all over the kernel. So there is one name, or key value, associated with
the kmalloc tracepoint; however, we have to patch the jump or nop into a
bunch of places in the kernel text.
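A miniature illustration of that relationship (hypothetical, simplified structures; not the patch's own code): several jump_entry records share one name, and a walk over the table finds every patch site that belongs to a given key.

```c
#include <string.h>

/*
 * Hypothetical miniature of the jump table: one key (e.g. the
 * kmalloc tracepoint) owns many patch sites in the kernel text,
 * so several entries share one name.
 */
struct jump_entry {
	unsigned long code;	/* address of the jump/nop to patch */
	const char *name;	/* key shared by all sites of one tracepoint */
};

/* count how many patch sites in [start, stop) belong to one key */
static int count_sites(const struct jump_entry *start,
		       const struct jump_entry *stop, const char *name)
{
	int count = 0;

	for (; start < stop; start++)
		if (strcmp(start->name, name) == 0)
			count++;
	return count;
}
```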

>
>
> > + entry = add_jump_label_entry((char *)iter_begin->name,
> > + count, iter_begin);
> > + if (IS_ERR(entry))
> > + return PTR_ERR(entry);
> > + continue;
> > + }
> > + WARN(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");
>
>
>
> It seems you are going to endless loop in this fail case.
>

Agreed. I need to stick that 'WARN' into the else clause of "if
(!entry)" and return an error.

>
>
> > + }
> > + return 0;
> > +}
> > +
> > +/***
> > + * jump_label_update - update jump label text
> > + * @name - name of the jump label
> > + * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE
> > + *
> > + * Will enable/disable the jump for jump label @name, depending on the
> > + * value of @type.
> > + *
> > + */
> > +
> > +void jump_label_update(const char *name, enum jump_label_type type)
> > +{
> > + struct jump_entry *iter;
> > + struct jump_label_entry *entry;
> > + struct hlist_node *module_node;
> > + struct jump_label_module_entry *e_module;
> > + int count;
> > +
> > + mutex_lock(&jump_label_mutex);
> > + entry = get_jump_label_entry(name);
> > + if (entry) {
> > + count = entry->nr_entries;
> > + iter = entry->table;
> > + while (count--) {
> > + if (kernel_text_address(iter->code))
> > + arch_jump_label_transform(iter, type);
> > + iter++;
> > + }
>
>
>
> So, this is going to patch multiple times the same value on the
> same address in case you have multiple entries for the same name?
>
> That look weird.

No. As mentioned before, there are potentially multiple text addresses
associated with a single jump label conditional variable, so we need to
patch all of them.
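Sketched outside the kernel (hypothetical types; a plain int stands in for the patched text and for arch_jump_label_transform()), the update loop simply rewrites every site belonging to the key to the same state:

```c
/*
 * Hypothetical sketch of the jump_label_update() inner loop: the
 * key's entry records nr_entries patch sites, each rewritten to the
 * same state (jump or nop).
 */
enum jump_label_type { JUMP_LABEL_DISABLE = 0, JUMP_LABEL_ENABLE = 1 };

struct site {
	int state;	/* 1 = jump patched in, 0 = nop patched in */
};

static void update_all_sites(struct site *table, int nr_entries,
			     enum jump_label_type type)
{
	while (nr_entries--)
		table[nr_entries].state = type;
}
```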

>
> BTW, if you can't find the entry, you should perhaps propagate an error.
>
>
>
> > + }
> > + mutex_unlock(&jump_label_mutex);
> > +}
> > +
> > +static int init_jump_label(void)
>
>
>
> This can be __init.
>

ok.

thanks for the review.

-Jason

2010-06-10 16:18:44

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

* Jason Baron ([email protected]) wrote:
> On Thu, Jun 10, 2010 at 02:14:40PM +0200, Ingo Molnar wrote:
> > * Peter Zijlstra <[email protected]> wrote:
> >
> > > On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > > > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> > >
> > > That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> > > useless...
> >
> > Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> > all the time, for the massive kernel image (and hotpath cache footprint)
> > savings. Is this fixable?
> >
> > Thanks,
> >
> > Ingo
> >
>
> When I tested 'jump label' with CC_OPTIMIZE_FOR_SIZE, I saw a small
> performance drop , b/c there is less block re-ordering happening.

Is this a performance drop compared to a jump-label-less kernel or
compared to -O2 kernel compiled with jump labels ? Or both ?

Mathieu


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2010-06-10 16:23:10

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch


* Jason Baron <[email protected]> wrote:

> > > + entry = add_jump_label_entry((char *)iter_begin->name,
> > > + count, iter_begin);
> > > + if (IS_ERR(entry))
> > > + return PTR_ERR(entry);
> > > + continue;
> > > + }
> > > + WARN(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");
> >
> >
> >
> > It seems you are going to endless loop in this fail case.
> >
>
> agreed. I need to stick that 'WARN' into the else clause of "if
> (!entry)" and return an error.

and make that a WARN_ONCE() as well. If it ever triggers it should show up once
and not compound the bug by spamming the console and the logs.

Thanks,

Ingo

2010-06-10 16:24:58

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, Jun 10, 2010 at 05:37:42PM +0200, Ingo Molnar wrote:
>
> > [...] It costs you in some benchmarks. [...]
>
> Microbenchmarks mostly, see below.

I didn't make these decisions, but I assume whoever made them had good reasons
and enough data on larger benchmarks too.

> > A much better to get smaller kernel images is to do more __cold annotations
> > for slow paths. Newer gcc will then simply only do -Os for these functions.
>
> That's an opt-in method and we cannot reach the kinds of 30% code size
> reductions that -Os can achieve. Most code in the kernel is not cache-hot,
> even on microbenchmarks.

Maybe, maybe not. But yes, it can be approached both ways.

Personally I would prefer to simply write less bloated code to get
code reductions. Simpler code is often faster too.

>
> A much better model would be to actively mark hot codepaths with a __hot
> attribute instead. Then the code size difference can be considered on a case
> by case basis.

Yes that works too for those who still use -Os.

e.g. marking the scheduler and a few mm hot paths this way would certainly make sense.

>
> And where GCC produces indefensibly crap code there GCC needs to be fixed.
> Crap code often increases size so the fix would increase the efficiency of
> -Os.

In some cases, agreed, but in common cases it's really: you asked for the smallest,
you got it, even if it's slow. It's not -Odwim.

One standard example here is division by a constant. The shortest way is
using DIV/IDIV if the divisor is not 2^n and small enough, but that is really
quite slow in hardware. If you spend a few more bytes you can do much better
for a wide range of constants.
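As a concrete sketch of the trade-off (hand-written here for illustration): for x / 10 on unsigned 32-bit operands, -Os tends to keep the short but slow divide instruction, while -O2 spends a few more bytes on the classic multiply-by-reciprocal sequence, which computes the same quotient:

```c
#include <stdint.h>

/*
 * Illustrative larger-but-faster form of x / 10 for unsigned 32-bit x:
 * 0xCCCCCCCD is ceil(2^35 / 10), so the high bits of the 64-bit
 * product, shifted right by 35, give x / 10 exactly for all uint32_t x.
 */
static uint32_t div10(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```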

Most likely we would need a new -O flag to avoid such cases.

BTW I experimented with marking a few common cases like this (e.g. time unit
conversion) hot, but gcc currently has trouble with __hot on inlines. So you
would always need to mark the caller.

-Andi

--
[email protected] -- Speaking for myself only.

2010-06-10 16:29:18

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, 2010-06-10 at 15:26 +0200, Andi Kleen wrote:
> On Thu, Jun 10, 2010 at 02:14:40PM +0200, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <[email protected]> wrote:
> >
> > > On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > > > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> > >
> > > That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> > > useless...
> >
> > Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> > all the time, for the massive kernel image (and hotpath cache footprint)
> > savings. Is this fixable?
>
> Actually the big distros (RHEL, SLES) disable it all the time now. It costs you
> in some benchmarks. The code generated by -Os is often terrible.
>
> Nearly everytime I investigate some bad asm code being generated by gcc
> it goes away when that flag is disabled.

I agree with the above statement. It was this option that first caused the
crazy mcount prefix code that totally broke the function tracer on i386.
With this option, some functions would send a copy of the return address
to mcount, and not an actual pointer to the return address on the stack.
This would confuse the function tracer, and we ended up with a kernel panic.

-Os really needs to be fixed.

-- Steve

2010-06-10 17:11:29

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 02/13] jump label v9: base patch

On Thu, Jun 10, 2010 at 11:44:52AM -0400, Jason Baron wrote:
> On Thu, Jun 10, 2010 at 12:35:49AM +0200, Frederic Weisbecker wrote:
> > On Wed, Jun 09, 2010 at 05:38:57PM -0400, Jason Baron wrote:
> > > +static int build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
> > > +{
> > > + struct jump_entry *iter, *iter_begin;
> > > + struct jump_label_entry *entry;
> > > + int count;
> > > +
> > > + sort_jump_label_entries(start, stop);
> > > + iter = start;
> > > + while (iter < stop) {
> > > + entry = get_jump_label_entry((char *)iter->name);
> > > + if (!entry) {
> > > + iter_begin = iter;
> > > + count = 0;
> > > + while ((iter < stop) &&
> > > + (strcmp((char *)iter->name,
> > > + (char *)iter_begin->name) == 0)) {
> > > + iter++;
> > > + count++;
> > > + }
> >
> >
> >
> >
> > So, you can have multiple entries with the same name? How can that happen
> > in fact?
> >
> >
>
> this is the case where a single tracepoint such as kmalloc(), is used in
> all over the kernel. So, there is one name or key value associated with
> a kmalloc tracepoint. however, we have to patch the jump or nop into a
> bunch of places in the kernel text.



Ok, sounds fair then.

Thanks.

2010-06-11 00:52:55

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On Thu, Jun 10, 2010 at 12:13:39PM -0400, Mathieu Desnoyers wrote:
> * Jason Baron ([email protected]) wrote:
> > On Thu, Jun 10, 2010 at 02:14:40PM +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <[email protected]> wrote:
> > >
> > > > On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> > > > > + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> > > >
> > > > That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> > > > useless...
> > >
> > > Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> > > all the time, for the massive kernel image (and hotpath cache footprint)
> > > savings. Is this fixable?
> > >
> > > Thanks,
> > >
> > > Ingo
> > >
> >
> > When I tested 'jump label' with CC_OPTIMIZE_FOR_SIZE, I saw a small
> > performance drop , b/c there is less block re-ordering happening.
>
> Is this a performance drop compared to a jump-label-less kernel or
> compared to -O2 kernel compiled with jump labels ? Or both ?
>
> Mathieu
>

Hi Mathieu,

So I'm quoting tbench benchmark here. The performance drop was jump
label vs. all jump label patches backed out on -Os. If we move to -02,
both the no jump label patches and the jump label patches applied are
faster than all jump label patches backed out on -Os.

so:

jump labels -02 > no jump labels -02 > no jump labels -0s > jump lables
-Os

thanks,

-Jason

2010-06-11 06:19:35

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

On 06/10/2010 05:14 AM, Ingo Molnar wrote:
>
> * Peter Zijlstra <[email protected]> wrote:
>
>> On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
>>> + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
>>
>> That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
>> useless...
>
> Hm, we need more than a comment for that - distros enable CC_OPTIMIZE_FOR_SIZE
> all the time, for the massive kernel image (and hotpath cache footprint)
> savings. Is this fixable?
>

Actually, the current reports from the gcc community are that gcc 4.5.0 +
-Os produces a broken kernel even without asm goto:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44129

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2010-06-11 07:58:43

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support


* H. Peter Anvin <[email protected]> wrote:

> On 06/10/2010 05:14 AM, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <[email protected]> wrote:
> >
> >> On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> >>> + select HAVE_ARCH_JUMP_LABEL if !CC_OPTIMIZE_FOR_SIZE
> >>
> >> That deserves a comment somewhere, it basically makes OPTIMIZE_FOR_SIZE
> >> useless...
> >
> > Hm, we need more than a comment for that - distros enable
> > CC_OPTIMIZE_FOR_SIZE all the time, for the massive kernel image (and
> > hotpath cache footprint) savings. Is this fixable?
>
> Actually the current reports from the gcc community is that gcc 4.5.0 + -Os
> produces a broken kernel even without asm goto:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44129

Well, most distros haven't switched to gcc 4.5 yet (even rawhide is still on
4.4 + backports) and I guess the lack of testing shows. New GCC versions have
a long history of causing bugs in the kernel.

Thanks,

Ingo

2010-06-11 08:13:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support


* Andi Kleen <[email protected]> wrote:

> > > A much better to get smaller kernel images is to do more __cold
> > > annotations for slow paths. Newer gcc will then simply only do -Os for
> > > these functions.
> >
> > That's an opt-in method and we cannot reach the kinds of 30% code size
> > reductions that -Os can achieve. Most code in the kernel is not cache-hot,
> > even on microbenchmarks.
>
> Maybe, maybe not. But yes it can be approached from both ways.

You don't seem to have understood my point: there's a big difference between an
opt-in and an opt-out model.

What you are arguing for is a 'bloaty code generator by default' model and
that model sucks.

Trying to achieve reductions by opt-in marking functions with a 'please reduce
it' __cold marker is a losing battle: most new kernel code is 'cold' and
should be reduced, yet most new code does not (and will not) come with __cold
markers.

The proper model is to assume that everything should be conservatively
size-reduced (because, almost by definition, 90% of new kernel code should
stay small and should stay out of the way), and where benchmarks+importance
proves it we can allow bloatier code generator via __hot.

Important codepaths can get __hot annotations just as much as they are
receiving 'inline' optimizations and other kinds of hand-tuning attention.

> Personally I would prefer to simply write less bloated code to get code
> reductions. Simpler code is often faster too.

You are posing this as an if-else choice, while in reality both should be
done: the best result is to write simpler/faster code _and_ to have a
compact-by-default code generator too ...

> > A much better model would be to actively mark hot codepaths with a __hot
> > attribute instead. Then the code size difference can be considered on a
> > case by case basis.
>
> Yes that works too for those who still use -Os.
>
> e.g. marking the scheduler and a few mm hot paths this way would certain
> make sense.

Possibly, but not without substantiating the rather vague statements you have
made so far.

If you are sending such per function annotation patches then you need to come
up with actual hard numbers as well. One convenient way to measure such things
is a before/after "perf stat --repeat" run - as the noise estimations can be
compared and we can see whether there's a provable effect. (And, of course,
disassembly of GCC suckage is helpful as well.)

Thanks,

Ingo

2010-06-11 08:30:34

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 03/13] jump label v9: x86 support

> What you are arguing for is a 'bloaty code generator by default' model and
> that model sucks.

I am arguing for a "non-sucky code by default" model.

It is widely known that "sucky code by default" sucks already;
that is why the big distros made their choice.

Anyway, luckily this is all config options, so we don't need
to agree on this (and the best choice likely varies by workload
too).

> Possibly, but not without substantiating the rather vague statements you have
> made so far.

Yes, more data with recent builds is needed for concrete changes.

BTW, afaik the "icache over everything" model was never really
substantiated by all that much data either; somehow it just
became dogma.

I must say I was a bit burned by doing annotations -- I added
unlikely() originally, and as far as I can see most unlikely()s
are quite useless today because they do nothing the compiler
doesn't do already, so I would prefer not to repeat that.
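For reference, the unlikely() annotation mentioned here is just a wrapper around gcc's __builtin_expect(), sketched below outside the kernel (checked_div is an invented example function):

```c
/*
 * The kernel's likely()/unlikely() are thin wrappers around
 * __builtin_expect(); they hint which branch is the hot path so the
 * compiler can lay out the cold one out of line.
 */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

static int checked_div(int a, int b)
{
	if (unlikely(b == 0))
		return 0;	/* cold error path, moved out of line */
	return a / b;		/* hot path stays on the fall-through */
}
```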

So my personal preference is actually less annotations over more.

-Andi

--
[email protected] -- Speaking for myself only.

2010-06-15 03:47:49

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9


Jason, I'm really at wits' end about this patch set. To say
that trying to test out your patches is frustrating for me
so far would be an understatement.

Nothing you ever post builds for me; not one patch set has
built properly.

I can also tell that you're just blindly making changes to the
sparc bits and not trying to build test them at all:

1) Even though you created the jump_label_t, and made it properly
a u32 on sparc, you left the assembler using ".xword" to
record the entries.

2) The sparc "struct jump_label" still calls its third member "name";
it needs to be "key" or else the build breaks.

3) Even though the sparc JUMP_LABEL macro was fixed to have two args,
the first arg was left as "tag" instead of being renamed to "key",
and that name change propagated into the asm in the macro expansion.

I took care of that locally to try and test this, but then I hit the
current major problem which is that you're using things like
text_poke_early() unconditionally, but that is an X86-only facility
implemented by x86's alternative mechanism.

Also, kernel/jump_label.c only gets the ERR_PTR() definitions
indirectly on the x86 platform, it needs to include linux/err.h
directly to make sure those things are available on every platform.

You gave me the impression a few iterations ago that you were doing
build testing on sparc64 using cross-compilers, or that you would
start to do so. You're obviously not, could you please start doing so
and let me know when you've at least build tested your jump-label
patch series on sparc64 and at least one architecture that lacks
jump-label support?

Thanks.

2010-06-15 14:29:04

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

On Mon, Jun 14, 2010 at 08:47:59PM -0700, David Miller wrote:
> Jason, I'm really at wits end about this patch set. To say
> that trying to test our your patches is frustrating for me
> so far would be an understatement.
>
> Nothing you ever post builds for me, not one patch set has
> built properly.
>
> I can also tell that you're just blindly making changes to the
> sparc bits and not trying to build test them at all:
>
> 1) Even though you created the jump_label_t, and made it properly
> a u32 on sparc, you left the assembler using ".xword" to
> record the entries.
>
> 2) The sparc "struct jump_label" still calls it's third member "name",
> it needs to be "key" or else the build breaks.
>
> 3) Eventhough the sparc JUMP_LABEL macro was fixed to have two args,
> the first arg was left as "tag" instead of being renamed to "key"
> and that name change propaged into the asm in the macro expansion.
>
> I took care of that locally to try and test this, but then I hit the
> current major problem which is that you're using things like
> text_poke_early() unconditionally, but that is an X86-only facility
> implemented by x86's alternative mechanism.
>
> Also, kernel/jump_label.c only gets the ERR_PTR() definitions
> indirectly on the x86 platform, it needs to include linux/err.h
> directly to make sure those things are available on every platform.
>
> You gave me the impression a few iterations ago that you were doing
> build testing on sparc64 using cross-compilers, or that you would
> start to do so. You're obviously not, could you please start doing so
> and let me know when you've at least build tested your jump-label
> patch series on sparc64 and at least one architecture that lacks
> jump-label support?
>
> Thanks.

Hi David,

Yes, I've tried to help re-write the sparc bits to the current api.
However, I did not test the sparc-enabled jump-label bits, because I need an
updated cross compiler (one with jump label support) to do so. However, I
certainly did build test the patches on powerpc, which lacks jump-label support,
so I know it builds on at least one architecture without jump-label support,
as you've mentioned. And I did this specifically because you requested this
testing.

I guess I was hoping we could work more collaboratively on the sparc
bits. A couple of lines of code have fixed the issues that you've brought up.
Sorry if I misled you. I really just want to do what is best for the linux
kernel; if that means going off and figuring out how to build a new
jump-label-enabled cross compiler for sparc, I will do it. And I do agree
that leaving text_poke_early() in is my mistake. However, maybe we can
discuss this issue? For example, the reason I have it in the code is because
x86 determines the best no-op at run-time. Are other architectures going
to require this kind of functionality? Or, like sparc, are we
going to be able to generally hard-code the nops on non-x86 at
compile-time?

thanks. And again I apologize for any wasted cycles that I've caused.

-Jason

2010-06-15 15:44:46

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

* Jason Baron ([email protected]) wrote:
> On Mon, Jun 14, 2010 at 08:47:59PM -0700, David Miller wrote:
> > Jason, I'm really at wits end about this patch set. To say
> > that trying to test our your patches is frustrating for me
> > so far would be an understatement.
> >
> > Nothing you ever post builds for me, not one patch set has
> > built properly.
> >
> > I can also tell that you're just blindly making changes to the
> > sparc bits and not trying to build test them at all:
> >
> > 1) Even though you created the jump_label_t, and made it properly
> > a u32 on sparc, you left the assembler using ".xword" to
> > record the entries.
> >
> > 2) The sparc "struct jump_label" still calls it's third member "name",
> > it needs to be "key" or else the build breaks.
> >
> > 3) Eventhough the sparc JUMP_LABEL macro was fixed to have two args,
> > the first arg was left as "tag" instead of being renamed to "key"
> > and that name change propaged into the asm in the macro expansion.
> >
> > I took care of that locally to try and test this, but then I hit the
> > current major problem which is that you're using things like
> > text_poke_early() unconditionally, but that is an X86-only facility
> > implemented by x86's alternative mechanism.
> >
> > Also, kernel/jump_label.c only gets the ERR_PTR() definitions
> > indirectly on the x86 platform, it needs to include linux/err.h
> > directly to make sure those things are available on every platform.
> >
> > You gave me the impression a few iterations ago that you were doing
> > build testing on sparc64 using cross-compilers, or that you would
> > start to do so. You're obviously not, could you please start doing so
> > and let me know when you've at least build tested your jump-label
> > patch series on sparc64 and at least one architecture that lacks
> > jump-label support?
> >
> > Thanks.
>
> Hi David,
>
> Yes, I've tried to help re-write the sparc bits to the current api.
> However, I did not test the sparc enabled jump-label bits, b/c I need an
> updated cross compiler to do so (that has jump label support). However, I
> certainly did build test the patches on powerpc, which lacks jump-label support,
> so I know it builds on at least one architecture that lacks jump-label support
> as you've mentioned. And I did this specifically, since you requested this
> testing.
>
> I guess I was hoping we could work more collaboratively on the sparc
> bits. A couple lines of code have fixed the issues that you've brought up.
> Sorry, if i mislead you. I really just want to do what is best for the linux
> kernel, if that's going off and figuring out how to compile a new sparc
> enabled jump label compiler for sparc, I will do it.

Hi Jason,

It makes me wonder if anyone has had success building a gcc 4.5
Intel-to-sparc64 cross-compiler? Usually, the crosstool-like suites are
a few versions behind. I'm aware that this is not trivial, as
cross-compilers have a tendency to refuse to be built on certain
occasions (such as being a recent, less-tested version). I'd recommend
you focus on this as a first step before resubmitting.

> And I do agree,
> that leaving text_poke_early() is my mistake. However, maybe we can
> discuss this issue? For example, the reason I have it in the code is b/c
> x86 determines the best no-op at run-time. Are other architectures going
> to have to require this kind of functionality. Or like sparc, are we
> going to be able to generally hard-code the nops on non-x86 at
> compile-time?

You might want to create "dumb" text_poke() and text_poke_early()
implementations on other architectures that wrap kernel text updates
pretty simply. The implementation is trivial if the architecture does
not write-protect the text pages, but becomes more involved when it does.
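In the trivial case, such a "dumb" implementation is just a copy plus an icache flush; a hypothetical portable sketch (the flush is a stub here so it can be exercised on a plain buffer, and the function name is invented):

```c
#include <string.h>
#include <stddef.h>

/* arch code would flush the icache for [addr, addr + len); stubbed
   out so the sketch stays portable */
static void flush_icache_stub(void *addr, size_t len)
{
	(void)addr;
	(void)len;
}

/* hypothetical "dumb" text_poke_early(): valid only where text pages
   are writable and no other CPU is executing the patched range */
static void *text_poke_early_sketch(void *addr, const void *opcode, size_t len)
{
	memcpy(addr, opcode, len);
	flush_icache_stub(addr, len);
	return addr;
}
```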

Thanks,

Mathieu

>
> thanks. And again I apologize for any wasted cycles that I've caused.
>
> -Jason
>
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2010-06-15 17:12:56

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

From: Jason Baron <[email protected]>
Date: Tue, 15 Jun 2010 10:28:11 -0400

> For example, the reason I have it in the code is b/c x86 determines
> the best no-op at run-time. Are other architectures going to have to
> require this kind of functionality. Or like sparc, are we going to
> be able to generally hard-code the nops on non-x86 at compile-time?

I think most architectures will use a constant nop sequence, in fact
x86 is the only one I can think of that needs variable nop sequences.

Why not abstract this behind some asm/jump_label.h macro just like
everything else? "jump_label_text_poke_early()" or similar.
On sparc64 I would define this to:

#include <asm/system.h>

static inline void jump_label_text_poke_early(void *addr, const void *opcode, size_t len)
{
	u32 new_insn = *(u32 *)opcode;
	u32 *insn_p = (u32 *)addr;

	*insn_p = new_insn;
	flushi(insn_p);
}

2010-06-15 17:29:01

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

On 06/15/2010 10:13 AM, David Miller wrote:
> From: Jason Baron <[email protected]>
> Date: Tue, 15 Jun 2010 10:28:11 -0400
>
>> For example, the reason I have it in the code is b/c x86 determines
>> the best no-op at run-time. Are other architectures going to have to
>> require this kind of functionality. Or like sparc, are we going to
>> be able to generally hard-code the nops on non-x86 at compile-time?
>
> I think most architectures will use a constant nop sequence, in fact
> x86 is the only one I can think of that needs variable nop sequences.
>

One could potentially see that for other variable-length-instruction
architectures, e.g. s390, m68k, or ARM/Thumb-2, as well.

For fixed-length-instruction architectures, well, it shouldn't be an issue.

-hpa

2010-06-18 03:45:42

by Tony Breeds

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

On Tue, Jun 15, 2010 at 11:44:40AM -0400, Mathieu Desnoyers wrote:

> It makes me wonder if anyone has had success building a gcc 4.5
> Intel-to-sparc64 cross-compiler? Usually, the crosstool-like suites are
> a few versions behind. I'm aware that this is not trivial, as
> cross-compilers have a tendency to refuse to build on certain
> occasions (such as being a recent, less-tested version). I'd recommend
> you focus on this as a first step before resubmitting.

I'm building (and uploading to kernel.org[1]) 4.4.4 and 4.5.0 toolchains.
They're untested on anything other than the system I built them on.

Give it 24 hours and then try 'em.

Yours Tony
[1] http://kernel.org/pub/tools/crosstool/files/bin/${host_arch}/ [2]
[2] currently only i686 and x86_64; powerpc64 is coming.

2010-06-18 15:18:48

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 00/13] jump label v9

* Tony Breeds ([email protected]) wrote:
> On Tue, Jun 15, 2010 at 11:44:40AM -0400, Mathieu Desnoyers wrote:
>
> > It makes me wonder if anyone has had success building a gcc 4.5
> > Intel-to-sparc64 cross-compiler? Usually, the crosstool-like suites are
> > a few versions behind. I'm aware that this is not trivial, as
> > cross-compilers have a tendency to refuse to build on certain
> > occasions (such as being a recent, less-tested version). I'd recommend
> > you focus on this as a first step before resubmitting.
>
> I'm building (and uploading to kernel.org[1]) 4.4.4 and 4.5.0 toolchains.
> They're untested on anything other than the system I built them on.

Cool !!!

Thanks a lot Tony.

Mathieu

>
> Give it 24 hours and then try 'em.
>
> Yours Tony
> [1] http://kernel.org/pub/tools/crosstool/files/bin/${host_arch}/ [2]
> [2] currently only i686 and x86_64; powerpc64 is coming.

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2010-06-19 03:24:52

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 06/13] jump label v9: move ftrace_dyn_arch_init to common code

On Wed, 2010-06-09 at 17:39 -0400, Jason Baron wrote:
> Move Steve's code for finding the best 5-byte no-op from ftrace.c to alternative.c.
> The idea is that other consumers (in this case jump label) want to make use of
> that code. I've created a global, 'char ideal_nop[5]', that is set up during
> setup_arch and can then be used.
>
> Signed-off-by: Jason Baron <[email protected]>

> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 0d88ce9..1a1cc47 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2749,20 +2749,9 @@ extern unsigned long __stop_mcount_loc[];
>
> void __init ftrace_init(void)
> {
> - unsigned long count, addr, flags;
> + unsigned long count;
> int ret;
>
> - /* Keep the ftrace pointer to the stub */
> - addr = (unsigned long)ftrace_stub;
> -
> - local_irq_save(flags);
> - ftrace_dyn_arch_init(&addr);
> - local_irq_restore(flags);
> -
> - /* ftrace_dyn_arch_init places the return code in addr */
> - if (addr)
> - goto failed;
> -
> count = __stop_mcount_loc - __start_mcount_loc;
>
> ret = ftrace_dyn_table_alloc(count);

Um, you just removed the arch-specific setup call for dynamic
ftrace. Although, I'm thinking this may not be needed, since all archs
currently have just a stub. The exception is ARM, which seems to
call an "ftrace_mcount_set()" that git grep cannot find.

Thus, if you remove this, then remove it from all archs.

-- Steve