2011-03-09 21:11:44

by Jason Baron

[permalink] [raw]
Subject: [PATCH 0/2] jump label: update for .39

Hi,

Re-fresh of updates against latest -tip tree.

I've tried to split this update up somewhat, but I've only succeeded to split
out the dynamic debug bits. The interface changes and re-write are quite
intertwined.

I believe this update should address all the comments from the previous posting
except for Mathieu's request for a section of jump label pointers that point to
the jump label structures (since the compiler might leave gaps in the jump label
structures). I've got a prototype patch to address this issue but its somewhat
invasive, and thus I'd like to leave it as a follow-up item. I have to date,
not seen this issue in practice.

thanks,

-Jason


Jason Baron (2):
jump label: introduce static_branch() interface
dynamic debug: add jump label support

arch/mips/include/asm/jump_label.h | 22 +-
arch/mips/kernel/jump_label.c | 2 +-
arch/sparc/include/asm/jump_label.h | 25 +-
arch/x86/include/asm/alternative.h | 3 +-
arch/x86/include/asm/jump_label.h | 26 +-
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/jump_label.c | 2 +-
arch/x86/kernel/module.c | 1 +
include/asm-generic/vmlinux.lds.h | 14 +-
include/linux/dynamic_debug.h | 44 +++-
include/linux/jump_label.h | 86 ++++---
include/linux/jump_label_ref.h | 44 ---
include/linux/perf_event.h | 26 +-
include/linux/tracepoint.h | 22 +-
kernel/jump_label.c | 537 ++++++++++++++---------------------
kernel/perf_event.c | 4 +-
kernel/tracepoint.c | 23 +-
lib/dynamic_debug.c | 4 +-
18 files changed, 391 insertions(+), 496 deletions(-)
delete mode 100644 include/linux/jump_label_ref.h


2011-03-09 21:11:53

by Jason Baron

[permalink] [raw]
Subject: [PATCH 1/2] jump label: introduce static_branch() interface

Introduce:

static __always_inline bool static_branch(struct jump_label_key *key);

instead of the old JUMP_LABEL(key, label) macro.

In this way, jump labels become really easy to use:

Define:

struct jump_label_key jump_key;

Can be used as:

if (static_branch(&jump_key))
do unlikely code

enable/disale via:

jump_label_inc(&jump_key);
jump_label_dec(&jump_key);

that's it!

For the jump labels disabled case, the static_branch() becomes an
atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
atomic_dec() operations. We show testing results for this change below.

Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.

Since we now require a 'struct jump_label_key *key', we can store a pointer into
the jump table addresses. In this way, we can enable/disable jump labels, in
basically constant time. This change allows us to completely remove the previous
hashtable scheme. Thanks to Peter Zijlstra for this re-write.

Testing:

I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
configurations, where tracepoints were disabled.

jump label configured in
avg: 815.6

jump label *not* configured in (using atomic reads)
avg: 800.1

jump label *not* configured in (regular reads)
avg: 803.4

thanks,

-Jason


Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Jason Baron <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Tested-by: David Daney <[email protected]>
---
arch/mips/include/asm/jump_label.h | 22 +-
arch/mips/kernel/jump_label.c | 2 +-
arch/sparc/include/asm/jump_label.h | 25 +-
arch/x86/include/asm/alternative.h | 3 +-
arch/x86/include/asm/jump_label.h | 26 +-
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/jump_label.c | 2 +-
arch/x86/kernel/module.c | 1 +
include/asm-generic/vmlinux.lds.h | 14 +-
include/linux/dynamic_debug.h | 2 -
include/linux/jump_label.h | 86 ++++---
include/linux/jump_label_ref.h | 44 ---
include/linux/perf_event.h | 26 +-
include/linux/tracepoint.h | 22 +-
kernel/jump_label.c | 537 ++++++++++++++---------------------
kernel/perf_event.c | 4 +-
kernel/tracepoint.c | 23 +-
17 files changed, 352 insertions(+), 489 deletions(-)
delete mode 100644 include/linux/jump_label_ref.h

diff --git a/arch/mips/include/asm/jump_label.h b/arch/mips/include/asm/jump_label.h
index 7622ccf..1881b31 100644
--- a/arch/mips/include/asm/jump_label.h
+++ b/arch/mips/include/asm/jump_label.h
@@ -20,16 +20,18 @@
#define WORD_INSN ".word"
#endif

-#define JUMP_LABEL(key, label) \
- do { \
- asm goto("1:\tnop\n\t" \
- "nop\n\t" \
- ".pushsection __jump_table, \"a\"\n\t" \
- WORD_INSN " 1b, %l[" #label "], %0\n\t" \
- ".popsection\n\t" \
- : : "i" (key) : : label); \
- } while (0)
-
+static __always_inline bool arch_static_branch(struct jump_label_key *key)
+{
+ asm goto("1:\tnop\n\t"
+ "nop\n\t"
+ ".pushsection __jump_table, \"aw\"\n\t"
+ WORD_INSN " 1b, %l[l_yes], %0\n\t"
+ ".popsection\n\t"
+ : : "i" (key) : : l_yes);
+ return false;
+l_yes:
+ return true;
+}

#endif /* __KERNEL__ */

diff --git a/arch/mips/kernel/jump_label.c b/arch/mips/kernel/jump_label.c
index 6001610..09ac7ca 100644
--- a/arch/mips/kernel/jump_label.c
+++ b/arch/mips/kernel/jump_label.c
@@ -6,11 +6,11 @@
* Copyright (c) 2010 Cavium Networks, Inc.
*/

-#include <linux/jump_label.h>
#include <linux/kernel.h>
#include <linux/memory.h>
#include <linux/mutex.h>
#include <linux/types.h>
+#include <linux/jump_label.h>
#include <linux/cpu.h>

#include <asm/cacheflush.h>
diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h
index 427d468..fc73a82 100644
--- a/arch/sparc/include/asm/jump_label.h
+++ b/arch/sparc/include/asm/jump_label.h
@@ -7,17 +7,20 @@

#define JUMP_LABEL_NOP_SIZE 4

-#define JUMP_LABEL(key, label) \
- do { \
- asm goto("1:\n\t" \
- "nop\n\t" \
- "nop\n\t" \
- ".pushsection __jump_table, \"a\"\n\t"\
- ".align 4\n\t" \
- ".word 1b, %l[" #label "], %c0\n\t" \
- ".popsection \n\t" \
- : : "i" (key) : : label);\
- } while (0)
+static __always_inline bool arch_static_branch(struct jump_label_key *key)
+{
+ asm goto("1:\n\t"
+ "nop\n\t"
+ "nop\n\t"
+ ".pushsection __jump_table, \"aw\"\n\t"
+ ".align 4\n\t"
+ ".word 1b, %l[l_yes], %c0\n\t"
+ ".popsection \n\t"
+ : : "i" (key) : : l_yes);
+ return false;
+l_yes:
+ return true;
+}

#endif /* __KERNEL__ */

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 13009d1..8cdd1e2 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -4,7 +4,6 @@
#include <linux/types.h>
#include <linux/stddef.h>
#include <linux/stringify.h>
-#include <linux/jump_label.h>
#include <asm/asm.h>

/*
@@ -191,7 +190,7 @@ extern void *text_poke(void *addr, const void *opcode, size_t len);
extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
extern void text_poke_smp_batch(struct text_poke_param *params, int n);

-#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
+#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_JUMP_LABEL)
#define IDEAL_NOP_SIZE_5 5
extern unsigned char ideal_nop5[IDEAL_NOP_SIZE_5];
extern void arch_init_ideal_nop5(void);
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 574dbc2..f217cee 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -5,20 +5,24 @@

#include <linux/types.h>
#include <asm/nops.h>
+#include <asm/asm.h>

#define JUMP_LABEL_NOP_SIZE 5

-# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
-
-# define JUMP_LABEL(key, label) \
- do { \
- asm goto("1:" \
- JUMP_LABEL_INITIAL_NOP \
- ".pushsection __jump_table, \"aw\" \n\t"\
- _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
- ".popsection \n\t" \
- : : "i" (key) : : label); \
- } while (0)
+#define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+
+static __always_inline bool arch_static_branch(struct jump_label_key *key)
+{
+ asm goto("1:"
+ JUMP_LABEL_INITIAL_NOP
+ ".pushsection __jump_table, \"aw\" \n\t"
+ _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
+ ".popsection \n\t"
+ : : "i" (key) : : l_yes);
+ return false;
+l_yes:
+ return true;
+}

#endif /* __KERNEL__ */

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7038b95..cb1f834 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -674,7 +674,7 @@ void __kprobes text_poke_smp_batch(struct text_poke_param *params, int n)
__stop_machine(stop_machine_text_poke, (void *)&tpp, NULL);
}

-#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
+#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_JUMP_LABEL)

#ifdef CONFIG_X86_64
unsigned char ideal_nop5[5] = { 0x66, 0x66, 0x66, 0x66, 0x90 };
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index 961b6b3..dfa4c3c 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -4,13 +4,13 @@
* Copyright (C) 2009 Jason Baron <[email protected]>
*
*/
-#include <linux/jump_label.h>
#include <linux/memory.h>
#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/list.h>
#include <linux/jhash.h>
#include <linux/cpu.h>
+#include <linux/jump_label.h>
#include <asm/kprobes.h>
#include <asm/alternative.h>

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index ab23f1a..52f256f 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -24,6 +24,7 @@
#include <linux/bug.h>
#include <linux/mm.h>
#include <linux/gfp.h>
+#include <linux/jump_label.h>

#include <asm/system.h>
#include <asm/page.h>
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index fe77e33..0ab5310 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -170,6 +170,10 @@
STRUCT_ALIGN(); \
*(__tracepoints) \
/* implement dynamic printk debug */ \
+ . = ALIGN(8); \
+ VMLINUX_SYMBOL(__start___jump_table) = .; \
+ *(__jump_table) \
+ VMLINUX_SYMBOL(__stop___jump_table) = .; \
. = ALIGN(8); \
VMLINUX_SYMBOL(__start___verbose) = .; \
*(__verbose) \
@@ -228,8 +232,6 @@
\
BUG_TABLE \
\
- JUMP_TABLE \
- \
/* PCI quirks */ \
.pci_fixup : AT(ADDR(.pci_fixup) - LOAD_OFFSET) { \
VMLINUX_SYMBOL(__start_pci_fixups_early) = .; \
@@ -583,14 +585,6 @@
#define BUG_TABLE
#endif

-#define JUMP_TABLE \
- . = ALIGN(8); \
- __jump_table : AT(ADDR(__jump_table) - LOAD_OFFSET) { \
- VMLINUX_SYMBOL(__start___jump_table) = .; \
- *(__jump_table) \
- VMLINUX_SYMBOL(__stop___jump_table) = .; \
- }
-
#ifdef CONFIG_PM_TRACE
#define TRACEDATA \
. = ALIGN(4); \
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 1c70028..8906d52 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -1,8 +1,6 @@
#ifndef _DYNAMIC_DEBUG_H
#define _DYNAMIC_DEBUG_H

-#include <linux/jump_label.h>
-
/* dynamic_printk_enabled, and dynamic_printk_enabled2 are bitmasks in which
* bit n is set to 1 if any modname hashes into the bucket n, 0 otherwise. They
* use independent hash functions, to reduce the chance of false positives.
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 7880f18..7c3889b 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -2,19 +2,39 @@
#define _LINUX_JUMP_LABEL_H

#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
+
+struct jump_label_key {
+ atomic_t enabled;
+ struct jump_entry *entries;
+#ifdef CONFIG_MODULES
+ struct jump_label_mod *next;
+#endif
+};
+
# include <asm/jump_label.h>
# define HAVE_JUMP_LABEL
#endif

enum jump_label_type {
+ JUMP_LABEL_DISABLE = 0,
JUMP_LABEL_ENABLE,
- JUMP_LABEL_DISABLE
};

struct module;

#ifdef HAVE_JUMP_LABEL

+#ifdef CONFIG_MODULES
+#define JUMP_LABEL_INIT {{ 0 }, NULL, NULL}
+#else
+#define JUMP_LABEL_INIT {{ 0 }, NULL}
+#endif
+
+static __always_inline bool static_branch(struct jump_label_key *key)
+{
+ return arch_static_branch(key);
+}
+
extern struct jump_entry __start___jump_table[];
extern struct jump_entry __stop___jump_table[];

@@ -23,37 +43,37 @@ extern void jump_label_unlock(void);
extern void arch_jump_label_transform(struct jump_entry *entry,
enum jump_label_type type);
extern void arch_jump_label_text_poke_early(jump_label_t addr);
-extern void jump_label_update(unsigned long key, enum jump_label_type type);
-extern void jump_label_apply_nops(struct module *mod);
extern int jump_label_text_reserved(void *start, void *end);
+extern void jump_label_inc(struct jump_label_key *key);
+extern void jump_label_dec(struct jump_label_key *key);
+extern bool jump_label_enabled(struct jump_label_key *key);
+extern void jump_label_apply_nops(struct module *mod);

-#define jump_label_enable(key) \
- jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE);
+#else

-#define jump_label_disable(key) \
- jump_label_update((unsigned long)key, JUMP_LABEL_DISABLE);
+#include <asm/atomic.h>

-#else
+#define JUMP_LABEL_INIT {ATOMIC_INIT(0)}

-#define JUMP_LABEL(key, label) \
-do { \
- if (unlikely(*key)) \
- goto label; \
-} while (0)
+struct jump_label_key {
+ atomic_t enabled;
+};

-#define jump_label_enable(cond_var) \
-do { \
- *(cond_var) = 1; \
-} while (0)
+static __always_inline bool static_branch(struct jump_label_key *key)
+{
+ if (unlikely(atomic_read(&key->enabled)))
+ return true;
+ return false;
+}

-#define jump_label_disable(cond_var) \
-do { \
- *(cond_var) = 0; \
-} while (0)
+static inline void jump_label_inc(struct jump_label_key *key)
+{
+ atomic_inc(&key->enabled);
+}

-static inline int jump_label_apply_nops(struct module *mod)
+static inline void jump_label_dec(struct jump_label_key *key)
{
- return 0;
+ atomic_dec(&key->enabled);
}

static inline int jump_label_text_reserved(void *start, void *end)
@@ -64,16 +84,16 @@ static inline int jump_label_text_reserved(void *start, void *end)
static inline void jump_label_lock(void) {}
static inline void jump_label_unlock(void) {}

-#endif
+static inline bool jump_label_enabled(struct jump_label_key *key)
+{
+ return !!atomic_read(&key->enabled);
+}

-#define COND_STMT(key, stmt) \
-do { \
- __label__ jl_enabled; \
- JUMP_LABEL(key, jl_enabled); \
- if (0) { \
-jl_enabled: \
- stmt; \
- } \
-} while (0)
+static inline int jump_label_apply_nops(struct module *mod)
+{
+ return 0;
+}
+
+#endif

#endif
diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h
deleted file mode 100644
index e5d012a..0000000
--- a/include/linux/jump_label_ref.h
+++ /dev/null
@@ -1,44 +0,0 @@
-#ifndef _LINUX_JUMP_LABEL_REF_H
-#define _LINUX_JUMP_LABEL_REF_H
-
-#include <linux/jump_label.h>
-#include <asm/atomic.h>
-
-#ifdef HAVE_JUMP_LABEL
-
-static inline void jump_label_inc(atomic_t *key)
-{
- if (atomic_add_return(1, key) == 1)
- jump_label_enable(key);
-}
-
-static inline void jump_label_dec(atomic_t *key)
-{
- if (atomic_dec_and_test(key))
- jump_label_disable(key);
-}
-
-#else /* !HAVE_JUMP_LABEL */
-
-static inline void jump_label_inc(atomic_t *key)
-{
- atomic_inc(key);
-}
-
-static inline void jump_label_dec(atomic_t *key)
-{
- atomic_dec(key);
-}
-
-#undef JUMP_LABEL
-#define JUMP_LABEL(key, label) \
-do { \
- if (unlikely(__builtin_choose_expr( \
- __builtin_types_compatible_p(typeof(key), atomic_t *), \
- atomic_read((atomic_t *)(key)), *(key)))) \
- goto label; \
-} while (0)
-
-#endif /* HAVE_JUMP_LABEL */
-
-#endif /* _LINUX_JUMP_LABEL_REF_H */
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 614615b..95be5de 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -505,7 +505,7 @@ struct perf_guest_info_callbacks {
#include <linux/ftrace.h>
#include <linux/cpu.h>
#include <linux/irq_work.h>
-#include <linux/jump_label_ref.h>
+#include <linux/jump_label.h>
#include <asm/atomic.h>
#include <asm/local.h>

@@ -1036,7 +1036,7 @@ static inline int is_software_event(struct perf_event *event)
return event->pmu->task_ctx_nr == perf_sw_context;
}

-extern atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX];
+extern struct jump_label_key perf_swevent_enabled[PERF_COUNT_SW_MAX];

extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64);

@@ -1065,22 +1065,21 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
{
struct pt_regs hot_regs;

- JUMP_LABEL(&perf_swevent_enabled[event_id], have_event);
- return;
-
-have_event:
- if (!regs) {
- perf_fetch_caller_regs(&hot_regs);
- regs = &hot_regs;
+ if (static_branch(&perf_swevent_enabled[event_id])) {
+ if (!regs) {
+ perf_fetch_caller_regs(&hot_regs);
+ regs = &hot_regs;
+ }
+ __perf_sw_event(event_id, nr, nmi, regs, addr);
}
- __perf_sw_event(event_id, nr, nmi, regs, addr);
}

-extern atomic_t perf_sched_events;
+extern struct jump_label_key perf_sched_events;

static inline void perf_event_task_sched_in(struct task_struct *task)
{
- COND_STMT(&perf_sched_events, __perf_event_task_sched_in(task));
+ if (static_branch(&perf_sched_events))
+ __perf_event_task_sched_in(task);
}

static inline
@@ -1088,7 +1087,8 @@ void perf_event_task_sched_out(struct task_struct *task, struct task_struct *nex
{
perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0);

- COND_STMT(&perf_sched_events, __perf_event_task_sched_out(task, next));
+ if (static_branch(&perf_sched_events))
+ __perf_event_task_sched_out(task, next);
}

extern void perf_event_mmap(struct vm_area_struct *vma);
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 97c84a5..d530a44 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -29,7 +29,7 @@ struct tracepoint_func {

struct tracepoint {
const char *name; /* Tracepoint name */
- int state; /* State. */
+ struct jump_label_key key;
void (*regfunc)(void);
void (*unregfunc)(void);
struct tracepoint_func __rcu *funcs;
@@ -146,9 +146,7 @@ void tracepoint_update_probe_range(struct tracepoint * const *begin,
extern struct tracepoint __tracepoint_##name; \
static inline void trace_##name(proto) \
{ \
- JUMP_LABEL(&__tracepoint_##name.state, do_trace); \
- return; \
-do_trace: \
+ if (static_branch(&__tracepoint_##name.key)) \
__DO_TRACE(&__tracepoint_##name, \
TP_PROTO(data_proto), \
TP_ARGS(data_args), \
@@ -176,14 +174,14 @@ do_trace: \
* structures, so we create an array of pointers that will be used for iteration
* on the tracepoints.
*/
-#define DEFINE_TRACE_FN(name, reg, unreg) \
- static const char __tpstrtab_##name[] \
- __attribute__((section("__tracepoints_strings"))) = #name; \
- struct tracepoint __tracepoint_##name \
- __attribute__((section("__tracepoints"))) = \
- { __tpstrtab_##name, 0, reg, unreg, NULL }; \
- static struct tracepoint * const __tracepoint_ptr_##name __used \
- __attribute__((section("__tracepoints_ptrs"))) = \
+#define DEFINE_TRACE_FN(name, reg, unreg) \
+ static const char __tpstrtab_##name[] \
+ __attribute__((section("__tracepoints_strings"))) = #name; \
+ struct tracepoint __tracepoint_##name \
+ __attribute__((section("__tracepoints"))) = \
+ { __tpstrtab_##name, JUMP_LABEL_INIT, reg, unreg, NULL };\
+ static struct tracepoint * const __tracepoint_ptr_##name __used \
+ __attribute__((section("__tracepoints_ptrs"))) = \
&__tracepoint_##name;

#define DEFINE_TRACE(name) \
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 3b79bd9..b122097 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -2,43 +2,23 @@
* jump label support
*
* Copyright (C) 2009 Jason Baron <[email protected]>
+ * Copyright (C) 2011 Peter Zijlstra <[email protected]>
*
*/
-#include <linux/jump_label.h>
#include <linux/memory.h>
#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/list.h>
-#include <linux/jhash.h>
#include <linux/slab.h>
#include <linux/sort.h>
#include <linux/err.h>
+#include <linux/jump_label.h>

#ifdef HAVE_JUMP_LABEL

-#define JUMP_LABEL_HASH_BITS 6
-#define JUMP_LABEL_TABLE_SIZE (1 << JUMP_LABEL_HASH_BITS)
-static struct hlist_head jump_label_table[JUMP_LABEL_TABLE_SIZE];
-
/* mutex to protect coming/going of the the jump_label table */
static DEFINE_MUTEX(jump_label_mutex);

-struct jump_label_entry {
- struct hlist_node hlist;
- struct jump_entry *table;
- int nr_entries;
- /* hang modules off here */
- struct hlist_head modules;
- unsigned long key;
-};
-
-struct jump_label_module_entry {
- struct hlist_node hlist;
- struct jump_entry *table;
- int nr_entries;
- struct module *mod;
-};
-
void jump_label_lock(void)
{
mutex_lock(&jump_label_mutex);
@@ -49,6 +29,11 @@ void jump_label_unlock(void)
mutex_unlock(&jump_label_mutex);
}

+bool jump_label_enabled(struct jump_label_key *key)
+{
+ return !!atomic_read(&key->enabled);
+}
+
static int jump_label_cmp(const void *a, const void *b)
{
const struct jump_entry *jea = a;
@@ -64,7 +49,7 @@ static int jump_label_cmp(const void *a, const void *b)
}

static void
-sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
+jump_label_sort_entries(struct jump_entry *start, struct jump_entry *stop)
{
unsigned long size;

@@ -73,118 +58,25 @@ sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
sort(start, size, sizeof(struct jump_entry), jump_label_cmp, NULL);
}

-static struct jump_label_entry *get_jump_label_entry(jump_label_t key)
-{
- struct hlist_head *head;
- struct hlist_node *node;
- struct jump_label_entry *e;
- u32 hash = jhash((void *)&key, sizeof(jump_label_t), 0);
-
- head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
- hlist_for_each_entry(e, node, head, hlist) {
- if (key == e->key)
- return e;
- }
- return NULL;
-}
+static void jump_label_update(struct jump_label_key *key, int enable);

-static struct jump_label_entry *
-add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table)
+void jump_label_inc(struct jump_label_key *key)
{
- struct hlist_head *head;
- struct jump_label_entry *e;
- u32 hash;
-
- e = get_jump_label_entry(key);
- if (e)
- return ERR_PTR(-EEXIST);
-
- e = kmalloc(sizeof(struct jump_label_entry), GFP_KERNEL);
- if (!e)
- return ERR_PTR(-ENOMEM);
-
- hash = jhash((void *)&key, sizeof(jump_label_t), 0);
- head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
- e->key = key;
- e->table = table;
- e->nr_entries = nr_entries;
- INIT_HLIST_HEAD(&(e->modules));
- hlist_add_head(&e->hlist, head);
- return e;
-}
+ if (atomic_inc_not_zero(&key->enabled))
+ return;

-static int
-build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
-{
- struct jump_entry *iter, *iter_begin;
- struct jump_label_entry *entry;
- int count;
-
- sort_jump_label_entries(start, stop);
- iter = start;
- while (iter < stop) {
- entry = get_jump_label_entry(iter->key);
- if (!entry) {
- iter_begin = iter;
- count = 0;
- while ((iter < stop) &&
- (iter->key == iter_begin->key)) {
- iter++;
- count++;
- }
- entry = add_jump_label_entry(iter_begin->key,
- count, iter_begin);
- if (IS_ERR(entry))
- return PTR_ERR(entry);
- } else {
- WARN_ONCE(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");
- return -1;
- }
- }
- return 0;
+ jump_label_lock();
+ if (atomic_add_return(1, &key->enabled) == 1)
+ jump_label_update(key, JUMP_LABEL_ENABLE);
+ jump_label_unlock();
}

-/***
- * jump_label_update - update jump label text
- * @key - key value associated with a a jump label
- * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE
- *
- * Will enable/disable the jump for jump label @key, depending on the
- * value of @type.
- *
- */
-
-void jump_label_update(unsigned long key, enum jump_label_type type)
+void jump_label_dec(struct jump_label_key *key)
{
- struct jump_entry *iter;
- struct jump_label_entry *entry;
- struct hlist_node *module_node;
- struct jump_label_module_entry *e_module;
- int count;
+ if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex))
+ return;

- jump_label_lock();
- entry = get_jump_label_entry((jump_label_t)key);
- if (entry) {
- count = entry->nr_entries;
- iter = entry->table;
- while (count--) {
- if (kernel_text_address(iter->code))
- arch_jump_label_transform(iter, type);
- iter++;
- }
- /* eanble/disable jump labels in modules */
- hlist_for_each_entry(e_module, module_node, &(entry->modules),
- hlist) {
- count = e_module->nr_entries;
- iter = e_module->table;
- while (count--) {
- if (iter->key &&
- kernel_text_address(iter->code))
- arch_jump_label_transform(iter, type);
- iter++;
- }
- }
- }
+ jump_label_update(key, JUMP_LABEL_DISABLE);
jump_label_unlock();
}

@@ -197,77 +89,33 @@ static int addr_conflict(struct jump_entry *entry, void *start, void *end)
return 0;
}

-#ifdef CONFIG_MODULES
-
-static int module_conflict(void *start, void *end)
+static int __jump_label_text_reserved(struct jump_entry *iter_start,
+ struct jump_entry *iter_stop, void *start, void *end)
{
- struct hlist_head *head;
- struct hlist_node *node, *node_next, *module_node, *module_node_next;
- struct jump_label_entry *e;
- struct jump_label_module_entry *e_module;
struct jump_entry *iter;
- int i, count;
- int conflict = 0;
-
- for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
- head = &jump_label_table[i];
- hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
- hlist_for_each_entry_safe(e_module, module_node,
- module_node_next,
- &(e->modules), hlist) {
- count = e_module->nr_entries;
- iter = e_module->table;
- while (count--) {
- if (addr_conflict(iter, start, end)) {
- conflict = 1;
- goto out;
- }
- iter++;
- }
- }
- }
- }
-out:
- return conflict;
-}
-
-#endif
-
-/***
- * jump_label_text_reserved - check if addr range is reserved
- * @start: start text addr
- * @end: end text addr
- *
- * checks if the text addr located between @start and @end
- * overlaps with any of the jump label patch addresses. Code
- * that wants to modify kernel text should first verify that
- * it does not overlap with any of the jump label addresses.
- * Caller must hold jump_label_mutex.
- *
- * returns 1 if there is an overlap, 0 otherwise
- */
-int jump_label_text_reserved(void *start, void *end)
-{
- struct jump_entry *iter;
- struct jump_entry *iter_start = __start___jump_table;
- struct jump_entry *iter_stop = __start___jump_table;
- int conflict = 0;

iter = iter_start;
while (iter < iter_stop) {
- if (addr_conflict(iter, start, end)) {
- conflict = 1;
- goto out;
- }
+ if (addr_conflict(iter, start, end))
+ return 1;
iter++;
}

- /* now check modules */
-#ifdef CONFIG_MODULES
- conflict = module_conflict(start, end);
-#endif
-out:
- return conflict;
+ return 0;
+}
+
+static void __jump_label_update(struct jump_label_key *key,
+ struct jump_entry *entry, int enable)
+{
+ for (; entry->key == (jump_label_t)(unsigned long)key; entry++) {
+ /*
+ * entry->code set to 0 invalidates module init text sections
+ * kernel_text_address() verifies we are not in core kernel
+ * init code, see jump_label_invalidate_module_init().
+ */
+ if (entry->code && kernel_text_address(entry->code))
+ arch_jump_label_transform(entry, enable);
+ }
}

/*
@@ -277,145 +125,172 @@ void __weak arch_jump_label_text_poke_early(jump_label_t addr)
{
}

-static __init int init_jump_label(void)
+static __init int jump_label_init(void)
{
- int ret;
struct jump_entry *iter_start = __start___jump_table;
struct jump_entry *iter_stop = __stop___jump_table;
+ struct jump_label_key *key = NULL;
struct jump_entry *iter;

jump_label_lock();
- ret = build_jump_label_hashtable(__start___jump_table,
- __stop___jump_table);
- iter = iter_start;
- while (iter < iter_stop) {
+ jump_label_sort_entries(iter_start, iter_stop);
+
+ for (iter = iter_start; iter < iter_stop; iter++) {
arch_jump_label_text_poke_early(iter->code);
- iter++;
+ if (iter->key == (jump_label_t)(unsigned long)key)
+ continue;
+
+ key = (struct jump_label_key *)(unsigned long)iter->key;
+ atomic_set(&key->enabled, 0);
+ key->entries = iter;
+#ifdef CONFIG_MODULES
+ key->next = NULL;
+#endif
}
jump_label_unlock();
- return ret;
+
+ return 0;
}
-early_initcall(init_jump_label);
+early_initcall(jump_label_init);

#ifdef CONFIG_MODULES

-static struct jump_label_module_entry *
-add_jump_label_module_entry(struct jump_label_entry *entry,
- struct jump_entry *iter_begin,
- int count, struct module *mod)
-{
- struct jump_label_module_entry *e;
-
- e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL);
- if (!e)
- return ERR_PTR(-ENOMEM);
- e->mod = mod;
- e->nr_entries = count;
- e->table = iter_begin;
- hlist_add_head(&e->hlist, &entry->modules);
- return e;
-}
+struct jump_label_mod {
+ struct jump_label_mod *next;
+ struct jump_entry *entries;
+ struct module *mod;
+};

-static int add_jump_label_module(struct module *mod)
+static int __jump_label_mod_text_reserved(void *start, void *end)
{
- struct jump_entry *iter, *iter_begin;
- struct jump_label_entry *entry;
- struct jump_label_module_entry *module_entry;
- int count;
+ struct module *mod;

- /* if the module doesn't have jump label entries, just return */
- if (!mod->num_jump_entries)
+ mod = __module_text_address((unsigned long)start);
+ if (!mod)
return 0;

- sort_jump_label_entries(mod->jump_entries,
- mod->jump_entries + mod->num_jump_entries);
- iter = mod->jump_entries;
- while (iter < mod->jump_entries + mod->num_jump_entries) {
- entry = get_jump_label_entry(iter->key);
- iter_begin = iter;
- count = 0;
- while ((iter < mod->jump_entries + mod->num_jump_entries) &&
- (iter->key == iter_begin->key)) {
- iter++;
- count++;
- }
- if (!entry) {
- entry = add_jump_label_entry(iter_begin->key, 0, NULL);
- if (IS_ERR(entry))
- return PTR_ERR(entry);
- }
- module_entry = add_jump_label_module_entry(entry, iter_begin,
- count, mod);
- if (IS_ERR(module_entry))
- return PTR_ERR(module_entry);
+ WARN_ON_ONCE(__module_text_address((unsigned long)end) != mod);
+
+ return __jump_label_text_reserved(mod->jump_entries,
+ mod->jump_entries + mod->num_jump_entries,
+ start, end);
+}
+
+static void __jump_label_mod_update(struct jump_label_key *key, int enable)
+{
+ struct jump_label_mod *mod = key->next;
+
+ while (mod) {
+ __jump_label_update(key, mod->entries, enable);
+ mod = mod->next;
}
- return 0;
}

-static void remove_jump_label_module(struct module *mod)
+/***
+ * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop()
+ * @mod: module to patch
+ *
+ * Allow for run-time selection of the optimal nops. Before the module
+ * loads patch these with arch_get_jump_label_nop(), which is specified by
+ * the arch specific jump label code.
+ */
+void jump_label_apply_nops(struct module *mod)
{
- struct hlist_head *head;
- struct hlist_node *node, *node_next, *module_node, *module_node_next;
- struct jump_label_entry *e;
- struct jump_label_module_entry *e_module;
- int i;
+ struct jump_entry *iter_start = mod->jump_entries;
+ struct jump_entry *iter_stop = iter_start + mod->num_jump_entries;
+ struct jump_entry *iter;

/* if the module doesn't have jump label entries, just return */
- if (!mod->num_jump_entries)
+ if (iter_start == iter_stop)
return;

- for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
- head = &jump_label_table[i];
- hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
- hlist_for_each_entry_safe(e_module, module_node,
- module_node_next,
- &(e->modules), hlist) {
- if (e_module->mod == mod) {
- hlist_del(&e_module->hlist);
- kfree(e_module);
- }
- }
- if (hlist_empty(&e->modules) && (e->nr_entries == 0)) {
- hlist_del(&e->hlist);
- kfree(e);
- }
+ jump_label_sort_entries(iter_start, iter_stop);
+
+ for (iter = iter_start; iter < iter_stop; iter++)
+ arch_jump_label_text_poke_early(iter->code);
+}
+
+static int jump_label_add_module(struct module *mod)
+{
+ struct jump_entry *iter_start = mod->jump_entries;
+ struct jump_entry *iter_stop = iter_start + mod->num_jump_entries;
+ struct jump_entry *iter;
+ struct jump_label_key *key = NULL;
+ struct jump_label_mod *jlm;
+
+ for (iter = iter_start; iter < iter_stop; iter++) {
+ if (iter->key == (jump_label_t)(unsigned long)key)
+ continue;
+
+ key = (struct jump_label_key *)(unsigned long)iter->key;
+
+ if (__module_address(iter->key) == mod) {
+ atomic_set(&key->enabled, 0);
+ key->entries = iter;
+ key->next = NULL;
+ continue;
}
+
+ jlm = kzalloc(sizeof(struct jump_label_mod), GFP_KERNEL);
+ if (!jlm)
+ return -ENOMEM;
+
+ jlm->mod = mod;
+ jlm->entries = iter;
+ jlm->next = key->next;
+ key->next = jlm;
+
+ if (jump_label_enabled(key))
+ __jump_label_update(key, iter, JUMP_LABEL_ENABLE);
}
+
+ return 0;
}

-static void remove_jump_label_module_init(struct module *mod)
+static void jump_label_del_module(struct module *mod)
{
- struct hlist_head *head;
- struct hlist_node *node, *node_next, *module_node, *module_node_next;
- struct jump_label_entry *e;
- struct jump_label_module_entry *e_module;
+ struct jump_entry *iter_start = mod->jump_entries;
+ struct jump_entry *iter_stop = iter_start + mod->num_jump_entries;
struct jump_entry *iter;
- int i, count;
+ struct jump_label_key *key = NULL;
+ struct jump_label_mod *jlm, **prev;

- /* if the module doesn't have jump label entries, just return */
- if (!mod->num_jump_entries)
- return;
+ for (iter = iter_start; iter < iter_stop; iter++) {
+ if (iter->key == (jump_label_t)(unsigned long)key)
+ continue;
+
+ key = (struct jump_label_key *)(unsigned long)iter->key;
+
+ if (__module_address(iter->key) == mod)
+ continue;
+
+ prev = &key->next;
+ jlm = key->next;
+
+ while (jlm && jlm->mod != mod) {
+ prev = &jlm->next;
+ jlm = jlm->next;
+ }

- for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
- head = &jump_label_table[i];
- hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
- hlist_for_each_entry_safe(e_module, module_node,
- module_node_next,
- &(e->modules), hlist) {
- if (e_module->mod != mod)
- continue;
- count = e_module->nr_entries;
- iter = e_module->table;
- while (count--) {
- if (within_module_init(iter->code, mod))
- iter->key = 0;
- iter++;
- }
- }
+ if (jlm) {
+ *prev = jlm->next;
+ kfree(jlm);
}
}
}

+static void jump_label_invalidate_module_init(struct module *mod)
+{
+ struct jump_entry *iter_start = mod->jump_entries;
+ struct jump_entry *iter_stop = iter_start + mod->num_jump_entries;
+ struct jump_entry *iter;
+
+ for (iter = iter_start; iter < iter_stop; iter++) {
+ if (within_module_init(iter->code, mod))
+ iter->code = 0;
+ }
+}
+
static int
jump_label_module_notify(struct notifier_block *self, unsigned long val,
void *data)
@@ -426,59 +301,77 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
switch (val) {
case MODULE_STATE_COMING:
jump_label_lock();
- ret = add_jump_label_module(mod);
+ ret = jump_label_add_module(mod);
if (ret)
- remove_jump_label_module(mod);
+ jump_label_del_module(mod);
jump_label_unlock();
break;
case MODULE_STATE_GOING:
jump_label_lock();
- remove_jump_label_module(mod);
+ jump_label_del_module(mod);
jump_label_unlock();
break;
case MODULE_STATE_LIVE:
jump_label_lock();
- remove_jump_label_module_init(mod);
+ jump_label_invalidate_module_init(mod);
jump_label_unlock();
break;
}
- return ret;
-}
-
-/***
- * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop()
- * @mod: module to patch
- *
- * Allow for run-time selection of the optimal nops. Before the module
- * loads patch these with arch_get_jump_label_nop(), which is specified by
- * the arch specific jump label code.
- */
-void jump_label_apply_nops(struct module *mod)
-{
- struct jump_entry *iter;
-
- /* if the module doesn't have jump label entries, just return */
- if (!mod->num_jump_entries)
- return;

- iter = mod->jump_entries;
- while (iter < mod->jump_entries + mod->num_jump_entries) {
- arch_jump_label_text_poke_early(iter->code);
- iter++;
- }
+ return notifier_from_errno(ret);
}

struct notifier_block jump_label_module_nb = {
.notifier_call = jump_label_module_notify,
- .priority = 0,
+ .priority = 1, /* higher than tracepoints */
};

-static __init int init_jump_label_module(void)
+static __init int jump_label_init_module(void)
{
return register_module_notifier(&jump_label_module_nb);
}
-early_initcall(init_jump_label_module);
+early_initcall(jump_label_init_module);

#endif /* CONFIG_MODULES */

+/***
+ * jump_label_text_reserved - check if addr range is reserved
+ * @start: start text addr
+ * @end: end text addr
+ *
+ * checks if the text addr located between @start and @end
+ * overlaps with any of the jump label patch addresses. Code
+ * that wants to modify kernel text should first verify that
+ * it does not overlap with any of the jump label addresses.
+ * Caller must hold jump_label_mutex.
+ *
+ * returns 1 if there is an overlap, 0 otherwise
+ */
+int jump_label_text_reserved(void *start, void *end)
+{
+ int ret = __jump_label_text_reserved(__start___jump_table,
+ __stop___jump_table, start, end);
+
+ if (ret)
+ return ret;
+
+#ifdef CONFIG_MODULES
+ ret = __jump_label_mod_text_reserved(start, end);
+#endif
+ return ret;
+}
+
+static void jump_label_update(struct jump_label_key *key, int enable)
+{
+ struct jump_entry *entry = key->entries;
+
+ /* if there are no users, entry can be NULL */
+ if (entry)
+ __jump_label_update(key, entry, enable);
+
+#ifdef CONFIG_MODULES
+ __jump_label_mod_update(key, enable);
+#endif
+}
+
#endif
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index ed253aa..d6073df 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -125,7 +125,7 @@ enum event_type_t {
* perf_sched_events : >0 events exist
* perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
*/
-atomic_t perf_sched_events __read_mostly;
+struct jump_label_key perf_sched_events __read_mostly;
static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);

static atomic_t nr_mmap_events __read_mostly;
@@ -5406,7 +5406,7 @@ fail:
return err;
}

-atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX];
+struct jump_label_key perf_swevent_enabled[PERF_COUNT_SW_MAX];

static void sw_perf_event_destroy(struct perf_event *event)
{
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 68187af..b219f14 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -251,9 +251,9 @@ static void set_tracepoint(struct tracepoint_entry **entry,
{
WARN_ON(strcmp((*entry)->name, elem->name) != 0);

- if (elem->regfunc && !elem->state && active)
+ if (elem->regfunc && !jump_label_enabled(&elem->key) && active)
elem->regfunc();
- else if (elem->unregfunc && elem->state && !active)
+ else if (elem->unregfunc && jump_label_enabled(&elem->key) && !active)
elem->unregfunc();

/*
@@ -264,13 +264,10 @@ static void set_tracepoint(struct tracepoint_entry **entry,
* is used.
*/
rcu_assign_pointer(elem->funcs, (*entry)->funcs);
- if (!elem->state && active) {
- jump_label_enable(&elem->state);
- elem->state = active;
- } else if (elem->state && !active) {
- jump_label_disable(&elem->state);
- elem->state = active;
- }
+ if (active && !jump_label_enabled(&elem->key))
+ jump_label_inc(&elem->key);
+ else if (!active && jump_label_enabled(&elem->key))
+ jump_label_dec(&elem->key);
}

/*
@@ -281,13 +278,11 @@ static void set_tracepoint(struct tracepoint_entry **entry,
*/
static void disable_tracepoint(struct tracepoint *elem)
{
- if (elem->unregfunc && elem->state)
+ if (elem->unregfunc && jump_label_enabled(&elem->key))
elem->unregfunc();

- if (elem->state) {
- jump_label_disable(&elem->state);
- elem->state = 0;
- }
+ if (jump_label_enabled(&elem->key))
+ jump_label_dec(&elem->key);
rcu_assign_pointer(elem->funcs, NULL);
}

--
1.7.1

2011-03-09 21:11:42

by Jason Baron

[permalink] [raw]
Subject: [PATCH 2/2] dynamic debug: add jump label support

Since jump labels make use of atomic_read(), and dynamic_debug.h
gets included by kernel.h, the inclusion of atomic.h in
dynamic_debug.h (to get atomic_read()), would cause a circular
dependency since atomic.h requires kernel.h. Thus, for dynamic
debug I am explicitly including jump_label.h when jump labels are
configured, and making use of a regular reads (non-atomic), when
jump labels are disabled.

Signed-off-by: Jason Baron <[email protected]>
---
include/linux/dynamic_debug.h | 46 ++++++++++++++++++++++++++++++++++------
lib/dynamic_debug.c | 4 +-
2 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 8906d52..455cb3e 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -1,6 +1,26 @@
#ifndef _DYNAMIC_DEBUG_H
#define _DYNAMIC_DEBUG_H

+/*
+ * When jump labels are disabled, they make use of atomic_read(). Since
+ * dynamic_debug.h gets included by kernel.h, the inclusion of jump_label.h by
+ * dynamic_debug.h and thus atomic.h (to get atomic_read()), would cause a
+ * circular dependency since atomic.h requires kernel.h. Thus, for dynamic debug
+ * we are explicitly including jump_label.h when jump labels are configured, and
+ * making use of a regular reads (non-atomic), when jump labels are disabled.
+ */
+#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
+#include <linux/jump_label.h>
+#endif
+
+#ifdef HAVE_JUMP_LABEL
+ #define dd_enable(dp) do { jump_label_inc(&dp->enabled); } while (0)
+ #define dd_disable(dp) do { jump_label_dec(&dp->enabled); } while (0)
+#else
+ #define dd_enable(dp) do { dp->enabled = 1; } while (0)
+ #define dd_disable(dp) do { dp->enabled = 0; } while (0)
+#endif
+
/* dynamic_printk_enabled, and dynamic_printk_enabled2 are bitmasks in which
* bit n is set to 1 if any modname hashes into the bucket n, 0 otherwise. They
* use independent hash functions, to reduce the chance of false positives.
@@ -31,7 +51,11 @@ struct _ddebug {
#define _DPRINTK_FLAGS_PRINT (1<<0) /* printk() a message using the format */
#define _DPRINTK_FLAGS_DEFAULT 0
unsigned int flags:8;
- char enabled;
+#ifdef HAVE_JUMP_LABEL
+ struct jump_label_key enabled;
+#else
+ int enabled;
+#endif
} __attribute__((aligned(8)));


@@ -41,14 +65,22 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n,
#if defined(CONFIG_DYNAMIC_DEBUG)
extern int ddebug_remove_module(const char *mod_name);

+#ifdef HAVE_JUMP_LABEL
+# define DDEBUG_INIT JUMP_LABEL_INIT
+# define DDEBUG_BRANCH(enabled) static_branch(&enabled)
+#else
+# define DDEBUG_INIT 0
+# define DDEBUG_BRANCH(enabled) unlikely(enabled)
+#endif
+
#define dynamic_pr_debug(fmt, ...) do { \
static struct _ddebug descriptor \
__used \
__attribute__((section("__verbose"), aligned(8))) = \
{ KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
- _DPRINTK_FLAGS_DEFAULT }; \
- if (unlikely(descriptor.enabled)) \
- printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
+ _DPRINTK_FLAGS_DEFAULT, DDEBUG_INIT }; \
+ if (DDEBUG_BRANCH(descriptor.enabled)) \
+ printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
} while (0)


@@ -57,9 +89,9 @@ extern int ddebug_remove_module(const char *mod_name);
__used \
__attribute__((section("__verbose"), aligned(8))) = \
{ KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
- _DPRINTK_FLAGS_DEFAULT }; \
- if (unlikely(descriptor.enabled)) \
- dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \
+ _DPRINTK_FLAGS_DEFAULT, DDEBUG_INIT }; \
+ if (DDEBUG_BRANCH(descriptor.enabled)) \
+ dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \
} while (0)

#else
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b335acb..5973c53 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -142,9 +142,9 @@ static void ddebug_change(const struct ddebug_query *query,
dt->num_enabled++;
dp->flags = newflags;
if (newflags)
- dp->enabled = 1;
+ dd_enable(dp);
else
- dp->enabled = 0;
+ dd_disable(dp);
if (verbose)
printk(KERN_INFO
"ddebug: changed %s:%d [%s]%s %s\n",
--
1.7.1

2011-03-10 03:36:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
> Hi,
>
> Re-fresh of updates against latest -tip tree.

Thanks Jason,

I started looking at them, I should have comments tomorrow (if I have
any comments ;)

>
> I've tried to split this update up somewhat, but I've only succeeded to split
> out the dynamic debug bits. The interface changes and re-write are quite
> intertwined.
>
> I believe this update should address all the comments from the previous posting
> except for Mathieu's request for a section of jump label pointers that point to
> the jump label structures (since the compiler might leave gaps in the jump label
> structures).

The jump label structures is a list of 3 pointers, correct? I doubt that
gcc would place any holes in it as they are all aligned by natural word
size.

> I've got a prototype patch to address this issue but its somewhat
> invasive, and thus I'd like to leave it as a follow-up item. I have to date,
> not seen this issue in practice.
>
> thanks,
>
> -Jason
>
>
> Jason Baron (2):
> jump label: introduce static_branch() interface
> dynamic debug: add jump label support
>
> arch/mips/include/asm/jump_label.h | 22 +-
> arch/mips/kernel/jump_label.c | 2 +-

Could we get Ralf's acked-by for the mips code.

> arch/sparc/include/asm/jump_label.h | 25 +-

Could we get David Miller's Acked-by for sparc.

Thanks,

-- Steve

> arch/x86/include/asm/alternative.h | 3 +-
> arch/x86/include/asm/jump_label.h | 26 +-
> arch/x86/kernel/alternative.c | 2 +-
> arch/x86/kernel/jump_label.c | 2 +-
> arch/x86/kernel/module.c | 1 +
> include/asm-generic/vmlinux.lds.h | 14 +-
> include/linux/dynamic_debug.h | 44 +++-
> include/linux/jump_label.h | 86 ++++---
> include/linux/jump_label_ref.h | 44 ---
> include/linux/perf_event.h | 26 +-
> include/linux/tracepoint.h | 22 +-
> kernel/jump_label.c | 537 ++++++++++++++---------------------
> kernel/perf_event.c | 4 +-
> kernel/tracepoint.c | 23 +-
> lib/dynamic_debug.c | 4 +-
> 18 files changed, 391 insertions(+), 496 deletions(-)
> delete mode 100644 include/linux/jump_label_ref.h

2011-03-10 14:11:46

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
> > Hi,
> >
> > Re-fresh of updates against latest -tip tree.
>
> Thanks Jason,
>
> I started looking at them, I should have comments tomorrow (if I have
> any comments ;)
>
> >
> > I've tried to split this update up somewhat, but I've only succeeded to split
> > out the dynamic debug bits. The interface changes and re-write are quite
> > intertwined.
> >
> > I believe this update should address all the comments from the previous posting
> > except for Mathieu's request for a section of jump label pointers that point to
> > the jump label structures (since the compiler might leave gaps in the jump label
> > structures).
>
> The jump label structures is a list of 3 pointers, correct? I doubt that
> gcc would place any holes in it as they are all aligned by natural word
> size.
>

Hi Steven,

Can you explain what would prevent gcc from aligning these 3 pointers
(total of 24 bytes on 64-bit architectures) on 32-bytes ? Also, could
you point out what would refrain the linker from aligning the start of
object sections on the next 32-bytes (thus power of two) address
multiple ?

I think we need to be a bit more strict in our interpretation of what
guarantee gcc/ld provide and don't provide with respect to section and
structure alignment.

As it stands now, the section alignment of jump labels looks half-broken
on most architectures, and this *is* a big deal. I would really like to
see a patch for this (it can be a separate patch) going in for .39.

Thank you,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 14:48:00

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Wed, Mar 09, 2011 at 10:36:31PM -0500, Steven Rostedt wrote:
> > Hi,
> >
> > Re-fresh of updates against latest -tip tree.
>
> Thanks Jason,
>
> I started looking at them, I should have comments tomorrow (if I have
> any comments ;)
>
> >
> > I've tried to split this update up somewhat, but I've only succeeded to split
> > out the dynamic debug bits. The interface changes and re-write are quite
> > intertwined.
> >
> > I believe this update should address all the comments from the previous posting
> > except for Mathieu's request for a section of jump label pointers that point to
> > the jump label structures (since the compiler might leave gaps in the jump label
> > structures).
>
> The jump label structures is a list of 3 pointers, correct? I doubt that
> gcc would place any holes in it as they are all aligned by natural word
> size.
>

yes:

<address to patch>
<address to jump to (when enabled)>
<key value (which is a pointer address)>

> > I've got a prototype patch to address this issue but its somewhat
> > invasive, and thus I'd like to leave it as a follow-up item. I have to date,
> > not seen this issue in practice.
> >
> > thanks,
> >
> > -Jason
> >
> >
> > Jason Baron (2):
> > jump label: introduce static_branch() interface
> > dynamic debug: add jump label support
> >
> > arch/mips/include/asm/jump_label.h | 22 +-
> > arch/mips/kernel/jump_label.c | 2 +-
>
> Could we get Ralf's acked-by for the mips code.
>
> > arch/sparc/include/asm/jump_label.h | 25 +-
>
> Could we get David Miller's Acked-by for sparc.
>
> Thanks,
>
> -- Steve
>

Ok, I'll ping them.

thanks,

-Jason

2011-03-10 15:38:29

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 09:11 -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt ([email protected]) wrote:
> > On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
> > > Hi,
> > >
> > > Re-fresh of updates against latest -tip tree.
> >
> > Thanks Jason,
> >
> > I started looking at them, I should have comments tomorrow (if I have
> > any comments ;)
> >
> > >
> > > I've tried to split this update up somewhat, but I've only succeeded to split
> > > out the dynamic debug bits. The interface changes and re-write are quite
> > > intertwined.
> > >
> > > I believe this update should address all the comments from the previous posting
> > > except for Mathieu's request for a section of jump label pointers that point to
> > > the jump label structures (since the compiler might leave gaps in the jump label
> > > structures).
> >
> > The jump label structures is a list of 3 pointers, correct? I doubt that
> > gcc would place any holes in it as they are all aligned by natural word
> > size.
> >
>
> Hi Steven,
>
> Can you explain what would prevent gcc from aligning these 3 pointers
> (total of 24 bytes on 64-bit architectures) on 32-bytes ? Also, could
> you point out what would refrain the linker from aligning the start of
> object sections on the next 32-bytes (thus power of two) address
> multiple ?

Maybe it would be just easier to add another long ;)

Seriously, it would. Then it would be 32 bytes on 64bit and 16 bytes on
32bit. Then I guess we can have our guarantee without doing a large
change to have this indirect pointer and still waste sizeof(long) bytes
in having it.

Just insert a long "Reserved" word.

-- Steve

>
> I think we need to be a bit more strict in our interpretation of what
> guarantee gcc/ld provide and don't provide with respect to section and
> structure alignment.
>
> As it stands now, the section alignment of jump labels looks half-broken
> on most architectures, and this *is* a big deal. I would really like to
> see a patch for this (it can be a separate patch) going in for .39.
>
> Thank you,
>
> Mathieu
>

2011-03-10 16:41:41

by Jan Glauber

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Wed, Mar 09, 2011 at 03:47:24PM -0500, Jason Baron wrote:
> Hi,
>
> Re-fresh of updates against latest -tip tree.

Hi Jason,

here are the s390 bits, tested successfully on top of your patches
so it would be nice to have this in .39 too.

The s390 support depends on the JUMP_TABLE being in the RW section
which is not the case in .38 and leads to a protection exception
there.

--Jan

Subject: [PATCH] jump_label: Add s390 support
From: Jan Glauber <[email protected]>

Implement the architecture backend for jump label support on s390.

For a shared kernel booted from a NSS silently disable jump labels
because the NSS is read-only. Therefore jump labels will be disabled
in a shared kernel and can't be activated.

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Jan Glauber <[email protected]>
---
arch/s390/Kconfig | 1
arch/s390/include/asm/jump_label.h | 34 +++++++++++++++++++++
arch/s390/kernel/Makefile | 2 -
arch/s390/kernel/jump_label.c | 59 +++++++++++++++++++++++++++++++++++++
arch/s390/kernel/module.c | 1
5 files changed, 96 insertions(+), 1 deletion(-)

--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -87,6 +87,7 @@ config S390
select HAVE_KERNEL_LZO
select HAVE_GET_USER_PAGES_FAST
select HAVE_ARCH_MUTEX_CPU_RELAX
+ select HAVE_ARCH_JUMP_LABEL if !MARCH_G5
select ARCH_INLINE_SPIN_TRYLOCK
select ARCH_INLINE_SPIN_TRYLOCK_BH
select ARCH_INLINE_SPIN_LOCK
--- /dev/null
+++ b/arch/s390/include/asm/jump_label.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_S390_JUMP_LABEL_H
+#define _ASM_S390_JUMP_LABEL_H
+
+#include <linux/types.h>
+
+#define JUMP_LABEL_NOP_SIZE 6
+
+#ifdef CONFIG_64BIT
+#define ASM_PTR ".quad"
+#else
+#define ASM_PTR ".long"
+#endif
+
+static __always_inline bool arch_static_branch(struct jump_label_key *key)
+{
+ asm goto("0: brcl 0,0\n"
+ ".pushsection __jump_table, \"aw\"\n"
+ ASM_PTR " 0b, %l[label], %0\n"
+ ".popsection\n"
+ : : "X" (key) : : label);
+ return false;
+label:
+ return true;
+}
+
+typedef unsigned long jump_label_t;
+
+struct jump_entry {
+ jump_label_t code;
+ jump_label_t target;
+ jump_label_t key;
+};
+
+#endif
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -23,7 +23,7 @@ CFLAGS_sysinfo.o += -Iinclude/math-emu -
obj-y := bitmap.o traps.o time.o process.o base.o early.o setup.o \
processor.o sys_s390.o ptrace.o signal.o cpcmd.o ebcdic.o \
s390_ext.o debug.o irq.o ipl.o dis.o diag.o mem_detect.o \
- vdso.o vtime.o sysinfo.o nmi.o sclp.o
+ vdso.o vtime.o sysinfo.o nmi.o sclp.o jump_label.o

obj-y += $(if $(CONFIG_64BIT),entry64.o,entry.o)
obj-y += $(if $(CONFIG_64BIT),reipl64.o,reipl.o)
--- /dev/null
+++ b/arch/s390/kernel/jump_label.c
@@ -0,0 +1,59 @@
+/*
+ * Jump label s390 support
+ *
+ * Copyright IBM Corp. 2011
+ * Author(s): Jan Glauber <[email protected]>
+ */
+#include <linux/module.h>
+#include <linux/uaccess.h>
+#include <linux/stop_machine.h>
+#include <linux/jump_label.h>
+#include <asm/ipl.h>
+
+#ifdef HAVE_JUMP_LABEL
+
+struct insn {
+ u16 opcode;
+ s32 offset;
+} __packed;
+
+struct insn_args {
+ unsigned long *target;
+ struct insn *insn;
+ ssize_t size;
+};
+
+static int __arch_jump_label_transform(void *data)
+{
+ struct insn_args *args = data;
+ int rc;
+
+ rc = probe_kernel_write(args->target, args->insn, args->size);
+ WARN_ON_ONCE(rc < 0);
+ return 0;
+}
+
+void arch_jump_label_transform(struct jump_entry *entry,
+ enum jump_label_type type)
+{
+ struct insn_args args;
+ struct insn insn;
+
+ if (type == JUMP_LABEL_ENABLE) {
+ /* brcl 15,offset */
+ insn.opcode = 0xc0f4;
+ insn.offset = (entry->target - entry->code) >> 1;
+ } else {
+ /* brcl 0,0 */
+ insn.opcode = 0xc004;
+ insn.offset = 0;
+ }
+
+ args.target = (void *) entry->code;
+ args.insn = &insn;
+ args.size = JUMP_LABEL_NOP_SIZE;
+
+ stop_machine(__arch_jump_label_transform, &args, NULL);
+}
+
+#endif
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -407,6 +407,7 @@ int module_finalize(const Elf_Ehdr *hdr,
{
vfree(me->arch.syminfo);
me->arch.syminfo = NULL;
+ jump_label_apply_nops(me);
return 0;
}

2011-03-10 17:27:53

by David Daney

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On 03/10/2011 07:38 AM, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 09:11 -0500, Mathieu Desnoyers wrote:
>> * Steven Rostedt ([email protected]) wrote:
>>> On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
>>>> Hi,
>>>>
>>>> Re-fresh of updates against latest -tip tree.
>>>
>>> Thanks Jason,
>>>
>>> I started looking at them, I should have comments tomorrow (if I have
>>> any comments ;)
>>>
>>>>
>>>> I've tried to split this update up somewhat, but I've only succeeded to split
>>>> out the dynamic debug bits. The interface changes and re-write are quite
>>>> intertwined.
>>>>
>>>> I believe this update should address all the comments from the previous posting
>>>> except for Mathieu's request for a section of jump label pointers that point to
>>>> the jump label structures (since the compiler might leave gaps in the jump label
>>>> structures).
>>>
>>> The jump label structures is a list of 3 pointers, correct? I doubt that
>>> gcc would place any holes in it as they are all aligned by natural word
>>> size.
>>>
>>
>> Hi Steven,
>>
>> Can you explain what would prevent gcc from aligning these 3 pointers
>> (total of 24 bytes on 64-bit architectures) on 32-bytes ?

I can:

http://www.x86-64.org/documentation/abi.pdf Section 3.1.2:

Aggregates and Unions

Structures and unions assume the alignment of their most strictly
aligned component. Each member is assigned to the lowest
available offset with the appropriate alignment. The size of any
object is always a multiple of the object‘s alignment.

An array uses the same alignment as its elements, except that a
local or global array variable of length at least 16 bytes or a C99
variable-length array variable always has alignment of at least 16
bytes.

Structure and union objects can require padding to meet size and
alignment constraints. The contents of any padding is undefined.

I don't think it is explicitly stated, but it is also true that the size
is the smallest value that meets the above constraints.


>> Also, could
>> you point out what would refrain the linker from aligning the start of
>> object sections on the next 32-bytes (thus power of two) address
>> multiple ?
>

The rules of the ABI are quite specific. It would be a toolchain bug if
this were messed up.



> Maybe it would be just easier to add another long ;)

Maybe we should audit all the data structures in the entire kernel and
add manual padding to power of 2 boundaries.

>
> Seriously, it would. Then it would be 32 bytes on 64bit and 16 bytes on
> 32bit. Then I guess we can have our guarantee without doing a large
> change to have this indirect pointer and still waste sizeof(long) bytes
> in having it.
>
> Just insert a long "Reserved" word.
>

I disagree. Wasting memory to work around non-existent hypothetical
bugs seems wrong to me.

David Daney

2011-03-10 18:04:08

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 09:27 -0800, David Daney wrote:
> On 03/10/2011 07:38 AM, Steven Rostedt wrote:

> >> Can you explain what would prevent gcc from aligning these 3 pointers
> >> (total of 24 bytes on 64-bit architectures) on 32-bytes ?
>
> I can:
>
> http://www.x86-64.org/documentation/abi.pdf Section 3.1.2:
>
> Aggregates and Unions

Note, we are not dealing with C or arrays, but with inline assembly, and
the linker.

+static __always_inline bool arch_static_branch(struct jump_label_key *key)
+{
+ asm goto("1:\tnop\n\t"
+ "nop\n\t"
+ ".pushsection __jump_table, \"aw\"\n\t"
+ WORD_INSN " 1b, %l[l_yes], %0\n\t"
+ ".popsection\n\t"
+ : : "i" (key) : : l_yes);
+ return false;
+l_yes:
+ return true;
+}


That push/pop section part creates the structure we are talking about.
It's made up of three pointers. The address of the nop, the address of
the label l_yes and the address of the key.

Now its up to the linker to decide where to place that element. Can we
guarantee that it will always be on an 8 byte boundery?
Hmm, I wonder if we could add a .ALIGN sizeof(long) before that?

Now if we have two object files where there's a list of these jump
labels, and then when the linker concatenates them we have something
like:

.long f1-a, .long f1-b, .long f1-c
[ the above is 24 bytes ] so
.long [ pad 8 bytes]
.long f2-a, .long f2-b, .long f2-c ...

Where f1 is object file 1 and f2 is object file 2. File 1 has a jump
label table that holds a total of 24 bytes, and when the linker added
the next jump label it padded it with 8 bytes into that section. The
question remains, is that OK for the linker to do that, even though we
specified in vmlinux.ld:

/* implement dynamic printk debug */ \
+ . = ALIGN(8); \
+ VMLINUX_SYMBOL(__start___jump_table) = .; \
+ *(__jump_table) \
+ VMLINUX_SYMBOL(__stop___jump_table) = .; \

But then again, maybe it will break on 32 bit, where the above file 1
would have a total of 12 bytes, it may pad it with 4 bytes to keep that
8 byte alignment.


> Structures and unions assume the alignment of their most strictly
> aligned component. Each member is assigned to the lowest
> available offset with the appropriate alignment. The size of any
> object is always a multiple of the object‘s alignment.
>
> An array uses the same alignment as its elements, except that a
> local or global array variable of length at least 16 bytes or a C99
> variable-length array variable always has alignment of at least 16
> bytes.
>
> Structure and union objects can require padding to meet size and
> alignment constraints. The contents of any padding is undefined.
>
> I don't think it is explicitly stated, but it is also true that the size
> is the smallest value that meets the above constraints.

Could be true, but gcc has no idea that this data is an array. It's
really up to the linker.

>
>
> >> Also, could
> >> you point out what would refrain the linker from aligning the start of
> >> object sections on the next 32-bytes (thus power of two) address
> >> multiple ?
> >
>
> The rules of the ABI are quite specific. It would be a toolchain bug if
> this were messed up.
>
>
>
> > Maybe it would be just easier to add another long ;)
>
> Maybe we should audit all the data structures in the entire kernel and
> add manual padding to power of 2 boundaries.

We are not worried about normal C data structures, we are worried about
data structures that are created by inline assembly and the linker. As
we did have a bug with the trace_events code. But that dealt with a
structure that was not strictly naturally word aligned. It had "int" as
well as pointers.

>
> >
> > Seriously, it would. Then it would be 32 bytes on 64bit and 16 bytes on
> > 32bit. Then I guess we can have our guarantee without doing a large
> > change to have this indirect pointer and still waste sizeof(long) bytes
> > in having it.
> >
> > Just insert a long "Reserved" word.
> >
>
> I disagree. Wasting memory to work around non-existent hypothetical
> bugs seems wrong to me.

The linker may never cause the issue. I haven't seen any problems with
things that were naturally word aligned. But then, all the places that
we do this has been naturally word aligned as well as a power of 2
(extables for example).

Thus, if we do "waste" space, I rather just add the 'Reserved' word and
which makes it a power of 2 and be done with it.

-- Steve

2011-03-10 18:21:16

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, Mar 10, 2011 at 01:04:01PM -0500, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 09:27 -0800, David Daney wrote:
> > On 03/10/2011 07:38 AM, Steven Rostedt wrote:
>
> > >> Can you explain what would prevent gcc from aligning these 3 pointers
> > >> (total of 24 bytes on 64-bit architectures) on 32-bytes ?
> >
> > I can:
> >
> > http://www.x86-64.org/documentation/abi.pdf Section 3.1.2:
> >
> > Aggregates and Unions
>
> Note, we are not dealing with C or arrays, but with inline assembly, and
> the linker.
>
> +static __always_inline bool arch_static_branch(struct jump_label_key *key)
> +{
> + asm goto("1:\tnop\n\t"
> + "nop\n\t"
> + ".pushsection __jump_table, \"aw\"\n\t"
> + WORD_INSN " 1b, %l[l_yes], %0\n\t"
> + ".popsection\n\t"
> + : : "i" (key) : : l_yes);
> + return false;
> +l_yes:
> + return true;
> +}
>
>
> That push/pop section part creates the structure we are talking about.
> It's made up of three pointers. The address of the nop, the address of
> the label l_yes and the address of the key.
>
> Now its up to the linker to decide where to place that element. Can we
> guarantee that it will always be on an 8 byte boundery?
> Hmm, I wonder if we could add a .ALIGN sizeof(long) before that?
>

we can. Sparc does it, see: arch/sparc/include/asm/jump_label.h.

So I guess it would be .align 8 for 64-bit and .align 4 for 32-bit...

-Jason

2011-03-10 18:35:48

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 13:20 -0500, Jason Baron wrote:
> On Thu, Mar 10, 2011 at 01:04:01PM -0500, Steven Rostedt wrote:
> Now its up to the linker to decide where to place that element. Can we
> > guarantee that it will always be on an 8 byte boundery?
> > Hmm, I wonder if we could add a .ALIGN sizeof(long) before that?
> >
>
> we can. Sparc does it, see: arch/sparc/include/asm/jump_label.h.
>
> So I guess it would be .align 8 for 64-bit and .align 4 for 32-bit...

Now what about the vmlinux.ld? That has a align 8. Is that a one time
short (align to 8 bytes here), or will it carry through aligning the
rest of the section. If not, perhaps we may be OK.

-- Steve

2011-03-10 18:47:17

by David Daney

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On 03/10/2011 10:35 AM, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 13:20 -0500, Jason Baron wrote:
>> On Thu, Mar 10, 2011 at 01:04:01PM -0500, Steven Rostedt wrote:
>> Now its up to the linker to decide where to place that element. Can we
>>> guarantee that it will always be on an 8 byte boundery?
>>> Hmm, I wonder if we could add a .ALIGN sizeof(long) before that?
>>>
>>
>> we can. Sparc does it, see: arch/sparc/include/asm/jump_label.h.
>>
>> So I guess it would be .align 8 for 64-bit and .align 4 for 32-bit...
>
> Now what about the vmlinux.ld? That has a align 8. Is that a one time
> short (align to 8 bytes here), or will it carry through aligning the
> rest of the section. If not, perhaps we may be OK.
>

That is the alignment of the start of the section in the *output*
object. The linker will respect the requested alignment of the various
__jump_table sections in the *input* objects. So adding an .align 8
before emitting data into the __jump_table section would not hurt (It
could even help.)

I empirically determined that if no .align is specified, the requested
alignment defaults to 4 for x86_64.

The alignment requested by the assembler will have to satisfy *all* the
requested alignments, so manually forcing everything to .align 8 (or
.align 4 for 32-bit) should ensure that the linker doesn't put in any holes.

David Daney

2011-03-10 18:54:03

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:

> The alignment requested by the assembler will have to satisfy *all* the
> requested alignments, so manually forcing everything to .align 8 (or
> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.

I would agree with the assessment although, I don't know that it is
documented anywhere that this is what happens. As the previous "bug"
with the trace_events was solved by me adding .align(4) everywhere, I
would think that .align(sizeof(long)) would work here too.

It may be a good ideal to force this alignment, and not add wasted
space. If anything, if this (hypothetical) bug appears, it will most
likely show up as a crash on boot up. I'm not too concerned about it.

-- Steve

2011-03-10 18:58:03

by David Daney

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On 03/10/2011 10:53 AM, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
>
>> The alignment requested by the assembler will have to satisfy *all* the
>> requested alignments, so manually forcing everything to .align 8 (or
>> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
>
> I would agree with the assessment although, I don't know that it is
> documented anywhere that this is what happens. As the previous "bug"
> with the trace_events was solved by me adding .align(4) everywhere, I
> would think that .align(sizeof(long)) would work here too.
>
> It may be a good ideal to force this alignment, and not add wasted
> space. If anything, if this (hypothetical) bug appears, it will most
> likely show up as a crash on boot up. I'm not too concerned about it.
>

If the linker put in gratuitous holes, things like __ex_table would
break too.


David Daney

2011-03-10 19:25:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 10:57 -0800, David Daney wrote:
> On 03/10/2011 10:53 AM, Steven Rostedt wrote:
> > On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
> >
> >> The alignment requested by the assembler will have to satisfy *all* the
> >> requested alignments, so manually forcing everything to .align 8 (or
> >> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
> >
> > I would agree with the assessment although, I don't know that it is
> > documented anywhere that this is what happens. As the previous "bug"
> > with the trace_events was solved by me adding .align(4) everywhere, I
> > would think that .align(sizeof(long)) would work here too.
> >
> > It may be a good ideal to force this alignment, and not add wasted
> > space. If anything, if this (hypothetical) bug appears, it will most
> > likely show up as a crash on boot up. I'm not too concerned about it.
> >
>
> If the linker put in gratuitous holes, things like __ex_table would
> break too.

Again, as I have said (although I said extable not __ex_table), there
seems to be no problem when the data is of a power of 2 as well. As
__ex_table size is a power of 2. We just don't know if the linker will
add holes when the size is something other than power of 2.

-- Steve

2011-03-10 19:45:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 14:25 -0500, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 10:57 -0800, David Daney wrote:
> > On 03/10/2011 10:53 AM, Steven Rostedt wrote:
> > > On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
> > >
> > >> The alignment requested by the assembler will have to satisfy *all* the
> > >> requested alignments, so manually forcing everything to .align 8 (or
> > >> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
> > >
> > > I would agree with the assessment although, I don't know that it is
> > > documented anywhere that this is what happens. As the previous "bug"
> > > with the trace_events was solved by me adding .align(4) everywhere, I
> > > would think that .align(sizeof(long)) would work here too.
> > >
> > > It may be a good ideal to force this alignment, and not add wasted
> > > space. If anything, if this (hypothetical) bug appears, it will most
> > > likely show up as a crash on boot up. I'm not too concerned about it.
> > >
> >
> > If the linker put in gratuitous holes, things like __ex_table would
> > break too.
>
> Again, as I have said (although I said extable not __ex_table), there
> seems to be no problem when the data is of a power of 2 as well. As
> __ex_table size is a power of 2. We just don't know if the linker will
> add holes when the size is something other than power of 2.

Anyway, I think the best thing for now is to have Jason add
the .align(sizeof(long)) in the inline assembly for all locations and be
done with it.

-- Steve

2011-03-10 19:54:32

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, Mar 10, 2011 at 02:45:29PM -0500, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 14:25 -0500, Steven Rostedt wrote:
> > On Thu, 2011-03-10 at 10:57 -0800, David Daney wrote:
> > > On 03/10/2011 10:53 AM, Steven Rostedt wrote:
> > > > On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
> > > >
> > > >> The alignment requested by the assembler will have to satisfy *all* the
> > > >> requested alignments, so manually forcing everything to .align 8 (or
> > > >> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
> > > >
> > > > I would agree with the assessment although, I don't know that it is
> > > > documented anywhere that this is what happens. As the previous "bug"
> > > > with the trace_events was solved by me adding .align(4) everywhere, I
> > > > would think that .align(sizeof(long)) would work here too.
> > > >
> > > > It may be a good ideal to force this alignment, and not add wasted
> > > > space. If anything, if this (hypothetical) bug appears, it will most
> > > > likely show up as a crash on boot up. I'm not too concerned about it.
> > > >
> > >
> > > If the linker put in gratuitous holes, things like __ex_table would
> > > break too.
> >
> > Again, as I have said (although I said extable not __ex_table), there
> > seems to be no problem when the data is of a power of 2 as well. As
> > __ex_table size is a power of 2. We just don't know if the linker will
> > add holes when the size is something other than power of 2.
>
> Anyway, I think the best thing for now is to have Jason add
> the .align(sizeof(long)) in the inline assembly for all locations and be
> done with it.
>

agreed. something like the following is all that's needed for x86. Sparc
already has this, so mips just needs something similar. Steve, should I
re-post the entire series. Or can I just post this patch separately?

thanks,

-Jason


diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index f217cee..57f31ff 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -8,6 +8,7 @@
#include <asm/asm.h>

#define JUMP_LABEL_NOP_SIZE 5
+#define BYTES_PER_LONG (BITS_PER_LONG / 8)

#define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"

@@ -16,9 +17,10 @@ static __always_inline bool arch_static_branch(struct jump_label_key *key)
asm goto("1:"
JUMP_LABEL_INITIAL_NOP
".pushsection __jump_table, \"aw\" \n\t"
- _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
+ ".align %c0\n\t"
+ _ASM_PTR "1b, %l[l_yes], %c1 \n\t"
".popsection \n\t"
- : : "i" (key) : : l_yes);
+ : : "i" BYTES_PER_LONG, "i" (key) : : l_yes);
return false;
l_yes:
return true;

2011-03-10 20:01:56

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 14:53 -0500, Jason Baron wrote:

> agreed. something like the following is all that's needed for x86. Sparc
> already has this, so mips just needs something similar. Steve, should I
> re-post the entire series. Or can I just post this patch separately?

Nah, I could take this as a PATCH 3/2. Just reply with your SoB.

I still want to analyze the rest of the patch set. I've been doing busy
work for now. I'll do that this afternoon.

Thanks,

-- Steve

>
> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> index f217cee..57f31ff 100644
> --- a/arch/x86/include/asm/jump_label.h
> +++ b/arch/x86/include/asm/jump_label.h
> @@ -8,6 +8,7 @@
> #include <asm/asm.h>
>
> #define JUMP_LABEL_NOP_SIZE 5
> +#define BYTES_PER_LONG (BITS_PER_LONG / 8)
>
> #define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
>
> @@ -16,9 +17,10 @@ static __always_inline bool arch_static_branch(struct jump_label_key *key)
> asm goto("1:"
> JUMP_LABEL_INITIAL_NOP
> ".pushsection __jump_table, \"aw\" \n\t"
> - _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
> + ".align %c0\n\t"
> + _ASM_PTR "1b, %l[l_yes], %c1 \n\t"
> ".popsection \n\t"
> - : : "i" (key) : : l_yes);
> + : : "i" BYTES_PER_LONG, "i" (key) : : l_yes);
> return false;
> l_yes:
> return true;

2011-03-10 20:54:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:

> arch/mips/include/asm/jump_label.h | 22 +-
> arch/mips/kernel/jump_label.c | 2 +-
> arch/sparc/include/asm/jump_label.h | 25 +-
> arch/x86/include/asm/alternative.h | 3 +-
> arch/x86/include/asm/jump_label.h | 26 +-
> arch/x86/kernel/alternative.c | 2 +-
> arch/x86/kernel/jump_label.c | 2 +-
> arch/x86/kernel/module.c | 1 +
> include/asm-generic/vmlinux.lds.h | 14 +-
> include/linux/dynamic_debug.h | 2 -
> include/linux/jump_label.h | 86 ++++---
> include/linux/jump_label_ref.h | 44 ---
> include/linux/perf_event.h | 26 +-
> include/linux/tracepoint.h | 22 +-
> kernel/jump_label.c | 537 ++++++++++++++---------------------
> kernel/perf_event.c | 4 +-
> kernel/tracepoint.c | 23 +-
> 17 files changed, 352 insertions(+), 489 deletions(-)
> delete mode 100644 include/linux/jump_label_ref.h
>


> diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h
> index 427d468..fc73a82 100644
> --- a/arch/sparc/include/asm/jump_label.h
> +++ b/arch/sparc/include/asm/jump_label.h
> @@ -7,17 +7,20 @@
>
> #define JUMP_LABEL_NOP_SIZE 4
>
> -#define JUMP_LABEL(key, label) \
> - do { \
> - asm goto("1:\n\t" \
> - "nop\n\t" \
> - "nop\n\t" \
> - ".pushsection __jump_table, \"a\"\n\t"\
> - ".align 4\n\t" \
> - ".word 1b, %l[" #label "], %c0\n\t" \
> - ".popsection \n\t" \
> - : : "i" (key) : : label);\
> - } while (0)
> +static __always_inline bool arch_static_branch(struct jump_label_key *key)
> +{
> + asm goto("1:\n\t"
> + "nop\n\t"
> + "nop\n\t"
> + ".pushsection __jump_table, \"aw\"\n\t"
> + ".align 4\n\t"
> + ".word 1b, %l[l_yes], %c0\n\t"
> + ".popsection \n\t"
> + : : "i" (key) : : l_yes);
> + return false;
> +l_yes:
> + return true;
> +}
>
> #endif /* __KERNEL__ */
>

Hi David,

Can I get your Acked-by for this patch.

Thanks,

-- Steve

2011-03-10 20:56:59

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:

> arch/mips/include/asm/jump_label.h | 22 +-
> arch/mips/kernel/jump_label.c | 2 +-
> arch/sparc/include/asm/jump_label.h | 25 +-
> arch/x86/include/asm/alternative.h | 3 +-
> arch/x86/include/asm/jump_label.h | 26 +-
> arch/x86/kernel/alternative.c | 2 +-
> arch/x86/kernel/jump_label.c | 2 +-
> arch/x86/kernel/module.c | 1 +
> include/asm-generic/vmlinux.lds.h | 14 +-
> include/linux/dynamic_debug.h | 2 -
> include/linux/jump_label.h | 86 ++++---
> include/linux/jump_label_ref.h | 44 ---
> include/linux/perf_event.h | 26 +-
> include/linux/tracepoint.h | 22 +-
> kernel/jump_label.c | 537 ++++++++++++++---------------------
> kernel/perf_event.c | 4 +-
> kernel/tracepoint.c | 23 +-
> 17 files changed, 352 insertions(+), 489 deletions(-)
> delete mode 100644 include/linux/jump_label_ref.h
>
> diff --git a/arch/mips/include/asm/jump_label.h b/arch/mips/include/asm/jump_label.h
> index 7622ccf..1881b31 100644
> --- a/arch/mips/include/asm/jump_label.h
> +++ b/arch/mips/include/asm/jump_label.h
> @@ -20,16 +20,18 @@
> #define WORD_INSN ".word"
> #endif
>
> -#define JUMP_LABEL(key, label) \
> - do { \
> - asm goto("1:\tnop\n\t" \
> - "nop\n\t" \
> - ".pushsection __jump_table, \"a\"\n\t" \
> - WORD_INSN " 1b, %l[" #label "], %0\n\t" \
> - ".popsection\n\t" \
> - : : "i" (key) : : label); \
> - } while (0)
> -
> +static __always_inline bool arch_static_branch(struct jump_label_key *key)
> +{
> + asm goto("1:\tnop\n\t"
> + "nop\n\t"
> + ".pushsection __jump_table, \"aw\"\n\t"
> + WORD_INSN " 1b, %l[l_yes], %0\n\t"
> + ".popsection\n\t"
> + : : "i" (key) : : l_yes);
> + return false;
> +l_yes:
> + return true;
> +}
>
> #endif /* __KERNEL__ */
>
> diff --git a/arch/mips/kernel/jump_label.c b/arch/mips/kernel/jump_label.c
> index 6001610..09ac7ca 100644
> --- a/arch/mips/kernel/jump_label.c
> +++ b/arch/mips/kernel/jump_label.c
> @@ -6,11 +6,11 @@
> * Copyright (c) 2010 Cavium Networks, Inc.
> */
>
> -#include <linux/jump_label.h>
> #include <linux/kernel.h>
> #include <linux/memory.h>
> #include <linux/mutex.h>
> #include <linux/types.h>
> +#include <linux/jump_label.h>
> #include <linux/cpu.h>
>
> #include <asm/cacheflush.h>

Hi Ralf,

Can I get your Acked-by on this patch.

Thanks,

-- Steve

2011-03-10 21:01:39

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:

> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> index 961b6b3..dfa4c3c 100644
> --- a/arch/x86/kernel/jump_label.c
> +++ b/arch/x86/kernel/jump_label.c
> @@ -4,13 +4,13 @@
> * Copyright (C) 2009 Jason Baron <[email protected]>
> *
> */
> -#include <linux/jump_label.h>
> #include <linux/memory.h>
> #include <linux/uaccess.h>
> #include <linux/module.h>
> #include <linux/list.h>
> #include <linux/jhash.h>
> #include <linux/cpu.h>
> +#include <linux/jump_label.h>
> #include <asm/kprobes.h>
> #include <asm/alternative.h>
>
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c

Jason,

I'm curious, is there a reason for this header change? Clean up? Maybe
it could go into a separate patch. If anything, just to make this one
slightly smaller.


-- Steve

2011-03-10 21:11:11

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* David Daney ([email protected]) wrote:
> On 03/10/2011 07:38 AM, Steven Rostedt wrote:
>> On Thu, 2011-03-10 at 09:11 -0500, Mathieu Desnoyers wrote:
>>> * Steven Rostedt ([email protected]) wrote:
>>>> On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
>>>>> Hi,
>>>>>
>>>>> Re-fresh of updates against latest -tip tree.
>>>>
>>>> Thanks Jason,
>>>>
>>>> I started looking at them, I should have comments tomorrow (if I have
>>>> any comments ;)
>>>>
>>>>>
>>>>> I've tried to split this update up somewhat, but I've only succeeded to split
>>>>> out the dynamic debug bits. The interface changes and re-write are quite
>>>>> intertwined.
>>>>>
>>>>> I believe this update should address all the comments from the previous posting
>>>>> except for Mathieu's request for a section of jump label pointers that point to
>>>>> the jump label structures (since the compiler might leave gaps in the jump label
>>>>> structures).
>>>>
>>>> The jump label structures is a list of 3 pointers, correct? I doubt that
>>>> gcc would place any holes in it as they are all aligned by natural word
>>>> size.
>>>>
>>>
>>> Hi Steven,
>>>
>>> Can you explain what would prevent gcc from aligning these 3 pointers
>>> (total of 24 bytes on 64-bit architectures) on 32-bytes ?
>
> I can:
>
> http://www.x86-64.org/documentation/abi.pdf Section 3.1.2:
>
> Aggregates and Unions
>
> Structures and unions assume the alignment of their most strictly
> aligned component. Each member is assigned to the lowest
> available offset with the appropriate alignment. The size of any
> object is always a multiple of the object‘s alignment.
>
> An array uses the same alignment as its elements, except that a
> local or global array variable of length at least 16 bytes or a C99
> variable-length array variable always has alignment of at least 16
> bytes.
>
> Structure and union objects can require padding to meet size and
> alignment constraints. The contents of any padding is undefined.
>
> I don't think it is explicitly stated, but it is also true that the size
> is the smallest value that meets the above constraints.

Your quote seems to apply in this case because:

a) the "structure" is defined in assembly, so the compiler is not
deciding the alignment of the structure.

b) the C code that iterates on the section array in jump_label.c see an
array of structures, which imply that gcc does not have no freedom in
terms of up-aligning these structures.

The problem we faced with trace event and tracepoints was caused by the
fact that we did define static structures from within C code, and the
compiler mistakenly believed that it was the only one holding references
to these structures, so it could up-align them to larger multiples as it
wants, because they were each independent objects.

>
>
>>> Also, could
>>> you point out what would refrain the linker from aligning the start of
>>> object sections on the next 32-bytes (thus power of two) address
>>> multiple ?
>>
>
> The rules of the ABI are quite specific. It would be a toolchain bug if
> this were messed up.

The trace event and tracepoint code are doing static structure
declarations into specific sections for the linker to create array
bounds with, and this last part sadly falls outside of the compiler
knowledge, and therefore might be outside of the ABI scope because the
compiler believes that these objects are local to the object and never
ever exported. Note that up-aligning on larger multiples still meets the
local alignment requirements -- it's just not valid to e.g. create an
array that assumes that these statically defined structures are next to
another without any extra padding.

>
>
>
>> Maybe it would be just easier to add another long ;)
>
> Maybe we should audit all the data structures in the entire kernel and
> add manual padding to power of 2 boundaries.

Only required when we play games with sections outside of the compiler
knowledge, which is really a handful of cases.

>
>>
>> Seriously, it would. Then it would be 32 bytes on 64bit and 16 bytes on
>> 32bit. Then I guess we can have our guarantee without doing a large
>> change to have this indirect pointer and still waste sizeof(long) bytes
>> in having it.
>>
>> Just insert a long "Reserved" word.
>>
>
> I disagree. Wasting memory to work around non-existent hypothetical
> bugs seems wrong to me.

This object layout problem has been me with tracepoints in the past, and
has bitter Steven with trace event recently. It is therefore very real.

Thanks,

Mathieu

>
> David Daney

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 21:14:39

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Thu, 2011-03-10 at 09:11 -0500, Mathieu Desnoyers wrote:
> > * Steven Rostedt ([email protected]) wrote:
> > > On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
> > > > Hi,
> > > >
> > > > Re-fresh of updates against latest -tip tree.
> > >
> > > Thanks Jason,
> > >
> > > I started looking at them, I should have comments tomorrow (if I have
> > > any comments ;)
> > >
> > > >
> > > > I've tried to split this update up somewhat, but I've only succeeded to split
> > > > out the dynamic debug bits. The interface changes and re-write are quite
> > > > intertwined.
> > > >
> > > > I believe this update should address all the comments from the previous posting
> > > > except for Mathieu's request for a section of jump label pointers that point to
> > > > the jump label structures (since the compiler might leave gaps in the jump label
> > > > structures).
> > >
> > > The jump label structures is a list of 3 pointers, correct? I doubt that
> > > gcc would place any holes in it as they are all aligned by natural word
> > > size.
> > >
> >
> > Hi Steven,
> >
> > Can you explain what would prevent gcc from aligning these 3 pointers
> > (total of 24 bytes on 64-bit architectures) on 32-bytes ? Also, could
> > you point out what would refrain the linker from aligning the start of
> > object sections on the next 32-bytes (thus power of two) address
> > multiple ?
>
> Maybe it would be just easier to add another long ;)
>
> Seriously, it would. Then it would be 32 bytes on 64bit and 16 bytes on
> 32bit. Then I guess we can have our guarantee without doing a large
> change to have this indirect pointer and still waste sizeof(long) bytes
> in having it.
>
> Just insert a long "Reserved" word.

I agree that this solution can work, but it's only because the
"object" definition is done in assembly in this case (and not in C, like
we did for trace event and tracepoints). Padding to power of 2 multiples
should make the linker happy. There should be a nice comment beside
these padding elements though.

Thanks,

Mathieu

>
> -- Steve
>
> >
> > I think we need to be a bit more strict in our interpretation of what
> > guarantee gcc/ld provide and don't provide with respect to section and
> > structure alignment.
> >
> > As it stands now, the section alignment of jump labels looks half-broken
> > on most architectures, and this *is* a big deal. I would really like to
> > see a patch for this (it can be a separate patch) going in for .39.
> >
> > Thank you,
> >
> > Mathieu
> >
>
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 21:18:38

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Thu, Mar 10, 2011 at 04:01:35PM -0500, Steven Rostedt wrote:
> On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
>
> > diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> > index 961b6b3..dfa4c3c 100644
> > --- a/arch/x86/kernel/jump_label.c
> > +++ b/arch/x86/kernel/jump_label.c
> > @@ -4,13 +4,13 @@
> > * Copyright (C) 2009 Jason Baron <[email protected]>
> > *
> > */
> > -#include <linux/jump_label.h>
> > #include <linux/memory.h>
> > #include <linux/uaccess.h>
> > #include <linux/module.h>
> > #include <linux/list.h>
> > #include <linux/jhash.h>
> > #include <linux/cpu.h>
> > +#include <linux/jump_label.h>
> > #include <asm/kprobes.h>
> > #include <asm/alternative.h>
> >
> > diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
>
> Jason,
>
> I'm curious, is there a reason for this header change? Clean up? Maybe
> it could go into a separate patch. If anything, just to make this one
> slightly smaller.
>

hmm...I did that to get types.h for the new atomic_t usage.

However, now that I look at it again its bogus. The right fix is to
explicitly include <linux/types.h> from jump_label.h.

-Jason

2011-03-10 21:22:21

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Thu, 2011-03-10 at 14:25 -0500, Steven Rostedt wrote:
> > On Thu, 2011-03-10 at 10:57 -0800, David Daney wrote:
> > > On 03/10/2011 10:53 AM, Steven Rostedt wrote:
> > > > On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
> > > >
> > > >> The alignment requested by the assembler will have to satisfy *all* the
> > > >> requested alignments, so manually forcing everything to .align 8 (or
> > > >> .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
> > > >
> > > > I would agree with the assessment although, I don't know that it is
> > > > documented anywhere that this is what happens. As the previous "bug"
> > > > with the trace_events was solved by me adding .align(4) everywhere, I
> > > > would think that .align(sizeof(long)) would work here too.
> > > >
> > > > It may be a good ideal to force this alignment, and not add wasted
> > > > space. If anything, if this (hypothetical) bug appears, it will most
> > > > likely show up as a crash on boot up. I'm not too concerned about it.
> > > >
> > >
> > > If the linker put in gratuitous holes, things like __ex_table would
> > > break too.
> >
> > Again, as I have said (although I said extable not __ex_table), there
> > seems to be no problem when the data is of a power of 2 as well. As
> > __ex_table size is a power of 2. We just don't know if the linker will
> > add holes when the size is something other than power of 2.
>
> Anyway, I think the best thing for now is to have Jason add
> the .align(sizeof(long)) in the inline assembly for all locations and be
> done with it.

You seem to be contradicting yourself here. I'm concerned about having
"structures" of a size not power of two. Can we simply either

- Add a padding element at the end
or
- use .align 4*sizeof(long) at the beginning

to make sure the linker won't put any holes when it puts objects
together ?

Thanks,

Mathieu

>
> -- Steve
>
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 21:34:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 16:14 -0500, Mathieu Desnoyers wrote:

> > Just insert a long "Reserved" word.
>
> I agree that this solution can work, but it's only because the
> "object" definition is done in assembly in this case (and not in C, like
> we did for trace event and tracepoints). Padding to power of 2 multiples
> should make the linker happy. There should be a nice comment beside
> these padding elements though.

Rereading what David and even what you wrote just now, I don't think
this is even needed. As you said. The issue with us is that we had
defined structs in C as static which lost all bets. Not to mention,
these structures were not natural word aligned.

The linker should not be adding holes more than natural word alignment.
Why waste space?

-- Steve

2011-03-10 21:40:03

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Thu, 2011-03-10 at 10:47 -0800, David Daney wrote:
>
> > The alignment requested by the assembler will have to satisfy *all* the
> > requested alignments, so manually forcing everything to .align 8 (or
> > .align 4 for 32-bit) should ensure that the linker doesn't put in any holes.
>
> I would agree with the assessment although, I don't know that it is
> documented anywhere that this is what happens. As the previous "bug"
> with the trace_events was solved by me adding .align(4) everywhere, I
> would think that .align(sizeof(long)) would work here too.
>
> It may be a good ideal to force this alignment, and not add wasted
> space. If anything, if this (hypothetical) bug appears, it will most
> likely show up as a crash on boot up. I'm not too concerned about it.

How can you be so sure it will trigger a crash on boot up ?

The sorting phase only compare key values, so NULL pointers will be
thought as valid. Following that, there is the initial no-op'ing phrase
that might crash (on the architectures using it). Then a NULL code
pointer is considered as the discarded "init" section. A NULL key will
just be a non-match, and thus skipped.

So I can very much see scenarios where this bug would silently skip jump
labels without a crash. This is what I am concerned about. Using the
approach that "a crash will happen" as a safety net to tell us that we
missed something seems very risky to me.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 21:42:23

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:

> > Anyway, I think the best thing for now is to have Jason add
> > the .align(sizeof(long)) in the inline assembly for all locations and be
> > done with it.
>
> You seem to be contradicting yourself here. I'm concerned about having
> "structures" of a size not power of two. Can we simply either

But we don't have structures. We have data that has been allocated in
assembly. Come to think of it, it may be best to keep these as
".align 4".


>
> - Add a padding element at the end
> or
> - use .align 4*sizeof(long) at the beginning
>
> to make sure the linker won't put any holes when it puts objects
> together ?
>

The linker should be dumb and not trying to "optimize", because it has
no idea what the content is. If anything, it should try to compact
things as best as possible, with the exception of keeping things
naturally word aligned. If you added even ".align(4)" on a 64bit system,
the linker should be trying to keep everything packed.

If I get time, I could look at the linker code to see exactly what it
does, but adding holes into sections that are naturally word align seems
more like a bug in the linker than a problem that we need to deal with.

The only issue I could fathom, is if gcc added its own padding in a
section. That is, when it created the __jump_table section with one
element, it added another 4/8 bytes to make the section size a power of
two. Maybe that is a true issue, maybe not. It would seems stupid to do
so IMHO, because when you get to bigger numbers, the aligning a power of
2 can get much bigger. But perhaps it does it for small power of 2s?

-- Steve


2011-03-10 22:02:14

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Thu, 2011-03-10 at 16:14 -0500, Mathieu Desnoyers wrote:
>
> > > Just insert a long "Reserved" word.
> >
> > I agree that this solution can work, but it's only because the
> > "object" definition is done in assembly in this case (and not in C, like
> > we did for trace event and tracepoints). Padding to power of 2 multiples
> > should make the linker happy. There should be a nice comment beside
> > these padding elements though.
>
> Rereading what David and even what you wrote just now, I don't think
> this is even needed. As you said. The issue with us is that we had
> defined structs in C as static which lost all bets. Not to mention,
> these structures were not natural word aligned.
>
> The linker should not be adding holes more than natural word alignment.
> Why waste space?

Here is what I am concerned about (maybe wongly, we'll see):

if we take a few objects, chosen arbitrarily, which will end up being
linked together, e.g.

in kernel/

Sections:
Idx Name Size VMA LMA File off Algn

objdump -h *.o |grep jump
28 __jump_table 000007e0 0000000000000000 0000000000000000 000ba614 2**0
5 __jump_table 00000030 0000000000000000 0000000000000000 0000077c 2**0
6 __jump_table 00000048 0000000000000000 0000000000000000 000027d8 2**0
9 __jump_table 00000048 0000000000000000 0000000000000000 00002788 2**0
3 __jump_table 000000c0 0000000000000000 0000000000000000 00000f60 2**0
3 __jump_table 00000048 0000000000000000 0000000000000000 0000092c 2**0
6 __jump_table 00000018 0000000000000000 0000000000000000 000008db 2**0
8 __jump_table 00000018 0000000000000000 0000000000000000 00002744 2**0
5 __jump_table 00000030 0000000000000000 0000000000000000 000008ea 2**0
6 __jump_table 00000078 0000000000000000 0000000000000000 000066ad 2**0
7 __jump_table 00000018 0000000000000000 0000000000000000 000023f0 2**0
13 __jump_table 00000120 0000000000000000 0000000000000000 00010c9a 2**0
7 __jump_table 00000060 0000000000000000 0000000000000000 00005126 2**0
5 __jump_table 00000138 0000000000000000 0000000000000000 000020f4 2**0
6 __jump_table 00000180 0000000000000000 0000000000000000 000045c6 2**0
10 __jump_table 00000078 0000000000000000 0000000000000000 00005060 2**0

We see here that the section alignment is 2**0 for each object (that is
for a 2.6.38-rc7-tip kernel). Now as long as the alignment stays like
this, it's OK, because the linker won't add padding between the
sections. So my question is: is there any guarantee that the linker will
keep this alignment to 2**0, or is there a possibility that it bumps it
to an higher value ?

For instance, if we have two objects linked together in a first linking
phase, thus generating a resulting object that uses the 8-byte alignment
specified by the linker script, and then we have a second link phase
that uses this intermediate object and links it into the kernel, this
might add such a whole on a 32-bit architecture, no ? (e.g. see the
e1000e Makefile, it seems to use this 2-steps method when built as =Y).

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 22:11:19

by David Daney

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On 03/10/2011 01:42 PM, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:
>
>>> Anyway, I think the best thing for now is to have Jason add
>>> the .align(sizeof(long)) in the inline assembly for all locations and be
>>> done with it.
>>
>> You seem to be contradicting yourself here. I'm concerned about having
>> "structures" of a size not power of two. Can we simply either
>
> But we don't have structures. We have data that has been allocated in
> assembly. Come to think of it, it may be best to keep these as
> ".align 4".
>
>
>>
>> - Add a padding element at the end
>> or
>> - use .align 4*sizeof(long) at the beginning
>>
>> to make sure the linker won't put any holes when it puts objects
>> together ?
>>
>
> The linker should be dumb and not trying to "optimize", because it has
> no idea what the content is. If anything, it should try to compact
> things as best as possible, with the exception of keeping things
> naturally word aligned. If you added even ".align(4)" on a 64bit system,
> the linker should be trying to keep everything packed.
>
> If I get time, I could look at the linker code to see exactly what it
> does, but adding holes into sections that are naturally word align seems
> more like a bug in the linker than a problem that we need to deal with.
>
> The only issue I could fathom, is if gcc added its own padding in a
> section. That is, when it created the __jump_table section with one
> element, it added another 4/8 bytes to make the section size a power of
> two. Maybe that is a true issue, maybe not. It would seems stupid to do
> so IMHO, because when you get to bigger numbers, the aligning a power of
> 2 can get much bigger. But perhaps it does it for small power of 2s?
>

GCC on x86_64 likes to align its data with .align 16:
-------------------------------
$ cat jl.c

struct foo {
long a;
long b;
long c;
};

struct foo bar = {1,2,3};

$ gcc -O3 -S jl.c
$ cat jl.s
.file "jl.c"
.globl bar
.data
.align 16
.type bar, @object
.size bar, 24
bar:
.quad 1
.quad 2
.quad 3
.ident "GCC: (GNU) 4.4.4 20100630 (Red Hat 4.4.4-10)"
.section .note.GNU-stack,"",@progbits
----------------------------------

But that shouldn't matter because we only emit data to the __jump_table
section from asm().

GCC is getting a reference to that table (array of structures really)
from a global variable, I don't see how it can violate the ABI in this case.


David Daney

2011-03-10 22:24:07

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 14:11 -0800, David Daney wrote:

> But that shouldn't matter because we only emit data to the __jump_table
> section from asm().
>
> GCC is getting a reference to that table (array of structures really)
> from a global variable, I don't see how it can violate the ABI in this case.

Right, actually gcc should be completely out of the loop, as it has no
idea about that section we created. That's done in the assembly phase,
and the linker should not be adding holes.

-- Steve

2011-03-10 22:48:50

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Steven Rostedt ([email protected]) wrote:
> On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:
>
> > > Anyway, I think the best thing for now is to have Jason add
> > > the .align(sizeof(long)) in the inline assembly for all locations and be
> > > done with it.
> >
> > You seem to be contradicting yourself here. I'm concerned about having
> > "structures" of a size not power of two. Can we simply either
>
> But we don't have structures. We have data that has been allocated in
> assembly. Come to think of it, it may be best to keep these as
> ".align 4".

The .align 4 is certainly not the right answer, because it will trigger
unaligned accesses on some 64-bit architectures, as we have faced with
trace event.

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-03-10 23:16:17

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 2011-03-10 at 17:48 -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt ([email protected]) wrote:
> > On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:
> >
> > > > Anyway, I think the best thing for now is to have Jason add
> > > > the .align(sizeof(long)) in the inline assembly for all locations and be
> > > > done with it.
> > >
> > > You seem to be contradicting yourself here. I'm concerned about having
> > > "structures" of a size not power of two. Can we simply either
> >
> > But we don't have structures. We have data that has been allocated in
> > assembly. Come to think of it, it may be best to keep these as
> > ".align 4".
>
> The .align 4 is certainly not the right answer, because it will trigger
> unaligned accesses on some 64-bit architectures, as we have faced with
> trace event.

Will it? Seems that sparc does this regardless.

Remember, this is 3 natural word sized objects, and vmlinux.ld starts
the section off with .ALIGN 8, hence, the section is already 8 byte
aligned, all objects within this section are 8 bytes (for 64bit archs,
and 4 bytes for 32bit), why would saying ".align 4" cause the linker to
add holes to make it 4 byte aligned when it is already 8 byte aligned.
The ".align 4" should work with both 32bit and 64bit archs.

-- Steve

2011-03-10 23:25:08

by David Daney

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On 03/10/2011 03:16 PM, Steven Rostedt wrote:
> On Thu, 2011-03-10 at 17:48 -0500, Mathieu Desnoyers wrote:
>> * Steven Rostedt ([email protected]) wrote:
>>> On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:
>>>
>>>>> Anyway, I think the best thing for now is to have Jason add
>>>>> the .align(sizeof(long)) in the inline assembly for all locations and be
>>>>> done with it.
>>>>
>>>> You seem to be contradicting yourself here. I'm concerned about having
>>>> "structures" of a size not power of two. Can we simply either
>>>
>>> But we don't have structures. We have data that has been allocated in
>>> assembly. Come to think of it, it may be best to keep these as
>>> ".align 4".
>>
>> The .align 4 is certainly not the right answer, because it will trigger
>> unaligned accesses on some 64-bit architectures, as we have faced with
>> trace event.
>
> Will it? Seems that sparc does this regardless.
>
> Remember, this is 3 natural word sized objects, and vmlinux.ld starts
> the section off with .ALIGN 8, hence, the section is already 8 byte
> aligned, all objects within this section are 8 bytes (for 64bit archs,
> and 4 bytes for 32bit), why would saying ".align 4" cause the linker to
> add holes to make it 4 byte aligned when it is already 8 byte aligned.
> The ".align 4" should work with both 32bit and 64bit archs.
>

It should work, but it hurts my eyes to see the source code forcing a
64-bit word to 32-bit alignment.

David Daney

2011-03-10 23:33:13

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 10 Mar 2011, David Daney wrote:

> On 03/10/2011 03:16 PM, Steven Rostedt wrote:
> > On Thu, 2011-03-10 at 17:48 -0500, Mathieu Desnoyers wrote:
> > > * Steven Rostedt ([email protected]) wrote:
> > > > On Thu, 2011-03-10 at 16:22 -0500, Mathieu Desnoyers wrote:
> > > >
> > > > > > Anyway, I think the best thing for now is to have Jason add
> > > > > > the .align(sizeof(long)) in the inline assembly for all locations
> > > > > > and be
> > > > > > done with it.
> > > > >
> > > > > You seem to be contradicting yourself here. I'm concerned about having
> > > > > "structures" of a size not power of two. Can we simply either
> > > >
> > > > But we don't have structures. We have data that has been allocated in
> > > > assembly. Come to think of it, it may be best to keep these as
> > > > ".align 4".
> > >
> > > The .align 4 is certainly not the right answer, because it will trigger
> > > unaligned accesses on some 64-bit architectures, as we have faced with
> > > trace event.
> >
> > Will it? Seems that sparc does this regardless.
> >
> > Remember, this is 3 natural word sized objects, and vmlinux.ld starts
> > the section off with .ALIGN 8, hence, the section is already 8 byte
> > aligned, all objects within this section are 8 bytes (for 64bit archs,
> > and 4 bytes for 32bit), why would saying ".align 4" cause the linker to
> > add holes to make it 4 byte aligned when it is already 8 byte aligned.
> > The ".align 4" should work with both 32bit and 64bit archs.
> >
>
> It should work, but it hurts my eyes to see the source code forcing a 64-bit
> word to 32-bit alignment.

We solved such stuff with macros in other places already.

#ifdef 64bit
# define BLA 8
#else
# define BLA 4
#endif

and then use

.aling BLA

Where is the problem?

Thanks,

tglx

2011-03-10 23:43:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Fri, 2011-03-11 at 00:32 +0100, Thomas Gleixner wrote:

> > It should work, but it hurts my eyes to see the source code forcing a 64-bit
> > word to 32-bit alignment.
>
> We solved such stuff with macros in other places already.
>
> #ifdef 64bit
> # define BLA 8
> #else
> # define BLA 4
> #endif
>
> and then use
>
> .aling BLA
>
> Where is the problem?

That's basically the solution that Jason came up with, and what I
expected. What caught my eye was that David Miller had .align 4 for this
on sparc which is the one arch that is truly picky about such things. As
I understood more about this, I see that would work for both archs.

That doesn't mean I'm against the .align BLA solution.

-- Steve

2011-03-10 23:52:22

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, 10 Mar 2011, Steven Rostedt wrote:
> On Fri, 2011-03-11 at 00:32 +0100, Thomas Gleixner wrote:
>
> > > It should work, but it hurts my eyes to see the source code forcing a 64-bit
> > > word to 32-bit alignment.
> >
> > We solved such stuff with macros in other places already.
> >
> > #ifdef 64bit
> > # define BLA 8
> > #else
> > # define BLA 4
> > #endif
> >
> > and then use
> >
> > .aling BLA
> >
> > Where is the problem?
>
> That's basically the solution that Jason came up with, and what I
> expected. What caught my eye was that David Miller had .align 4 for this
> on sparc which is the one arch that is truly picky about such things. As
> I understood more about this, I see that would work for both archs.

I doubt that sparc will barf when we make this 64/32bit aware. If it
does, we have some bigger fish to fry.

> That doesn't mean I'm against the .align BLA solution.

As hpa pointed out in some unreadable mail, we have _ASM_ALIGN on x86
already exactly for this purpose.

Thanks,

tglx

2011-03-11 00:39:17

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Thu, Mar 10, 2011 at 04:22:14PM -0500, Mathieu Desnoyers wrote:

> You seem to be contradicting yourself here. I'm concerned about having
> "structures" of a size not power of two. Can we simply either
>
> - Add a padding element at the end
> or
> - use .align 4*sizeof(long) at the beginning
>
> to make sure the linker won't put any holes when it puts objects
> together ?

It may only be a technicality but but careful with .align. On some
architectures .align x will align to a multiple of x; on other
architectures to 2^x. Another reason to stick to C wherever possible.

Ralf

2011-03-11 01:19:07

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

On Fri, 2011-03-11 at 01:38 +0100, Ralf Baechle wrote:
> On Thu, Mar 10, 2011 at 04:22:14PM -0500, Mathieu Desnoyers wrote:
>
> > You seem to be contradicting yourself here. I'm concerned about having
> > "structures" of a size not power of two. Can we simply either
> >
> > - Add a padding element at the end
> > or
> > - use .align 4*sizeof(long) at the beginning
> >
> > to make sure the linker won't put any holes when it puts objects
> > together ?
>
> It may only be a technicality but but careful with .align. On some
> architectures .align x will align to a multiple of x; on other
> architectures to 2^x. Another reason to stick to C wherever possible.

.balign is saner:

http://sourceware.org/binutils/docs-2.21/as/Balign.html

cheers


Attachments:
signature.asc (198.00 B)
This is a digitally signed message part

2011-03-11 02:02:57

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Thu, Mar 10, 2011 at 04:18:01PM -0500, Jason Baron wrote:

> > I'm curious, is there a reason for this header change? Clean up? Maybe
> > it could go into a separate patch. If anything, just to make this one
> > slightly smaller.
> >
>
> hmm...I did that to get types.h for the new atomic_t usage.
>
> However, now that I look at it again its bogus. The right fix is to
> explicitly include <linux/types.h> from jump_label.h.

And <linux/compiler.h> for unlikely().

Ralf

2011-03-11 02:05:57

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Thu, Mar 10, 2011 at 03:56:56PM -0500, Steven Rostedt wrote:

> On Wed, 2011-03-09 at 15:47 -0500, Jason Baron wrote:
>
> > arch/mips/include/asm/jump_label.h | 22 +-
> > arch/mips/kernel/jump_label.c | 2 +-
> > arch/sparc/include/asm/jump_label.h | 25 +-
> > arch/x86/include/asm/alternative.h | 3 +-
> > arch/x86/include/asm/jump_label.h | 26 +-
> > arch/x86/kernel/alternative.c | 2 +-
> > arch/x86/kernel/jump_label.c | 2 +-
> > arch/x86/kernel/module.c | 1 +
> > include/asm-generic/vmlinux.lds.h | 14 +-
> > include/linux/dynamic_debug.h | 2 -
> > include/linux/jump_label.h | 86 ++++---
> > include/linux/jump_label_ref.h | 44 ---
> > include/linux/perf_event.h | 26 +-
> > include/linux/tracepoint.h | 22 +-
> > kernel/jump_label.c | 537 ++++++++++++++---------------------
> > kernel/perf_event.c | 4 +-
> > kernel/tracepoint.c | 23 +-
> > 17 files changed, 352 insertions(+), 489 deletions(-)
> > delete mode 100644 include/linux/jump_label_ref.h
> >
> > diff --git a/arch/mips/include/asm/jump_label.h b/arch/mips/include/asm/jump_label.h
> > index 7622ccf..1881b31 100644
> > --- a/arch/mips/include/asm/jump_label.h
> > +++ b/arch/mips/include/asm/jump_label.h
> > @@ -20,16 +20,18 @@
> > #define WORD_INSN ".word"
> > #endif
> >
> > -#define JUMP_LABEL(key, label) \
> > - do { \
> > - asm goto("1:\tnop\n\t" \
> > - "nop\n\t" \
> > - ".pushsection __jump_table, \"a\"\n\t" \
> > - WORD_INSN " 1b, %l[" #label "], %0\n\t" \
> > - ".popsection\n\t" \
> > - : : "i" (key) : : label); \
> > - } while (0)
> > -
> > +static __always_inline bool arch_static_branch(struct jump_label_key *key)
> > +{
> > + asm goto("1:\tnop\n\t"
> > + "nop\n\t"
> > + ".pushsection __jump_table, \"aw\"\n\t"
> > + WORD_INSN " 1b, %l[l_yes], %0\n\t"
> > + ".popsection\n\t"
> > + : : "i" (key) : : l_yes);
> > + return false;
> > +l_yes:
> > + return true;
> > +}
> >
> > #endif /* __KERNEL__ */
> >
> > diff --git a/arch/mips/kernel/jump_label.c b/arch/mips/kernel/jump_label.c
> > index 6001610..09ac7ca 100644
> > --- a/arch/mips/kernel/jump_label.c
> > +++ b/arch/mips/kernel/jump_label.c
> > @@ -6,11 +6,11 @@
> > * Copyright (c) 2010 Cavium Networks, Inc.
> > */
> >
> > -#include <linux/jump_label.h>
> > #include <linux/kernel.h>
> > #include <linux/memory.h>
> > #include <linux/mutex.h>
> > #include <linux/types.h>
> > +#include <linux/jump_label.h>
> > #include <linux/cpu.h>
> >
> > #include <asm/cacheflush.h>
>
> Hi Ralf,
>
> Can I get your Acked-by on this patch.

You could have gotten it quicker if I had been on cc ...

I'm happy with the change to arch/mips/include/asm/jump_label.h and
assuming that the change to arch/mips/kernel/jump_label.c will be dropped

Acked-by: Ralf Baechle <[email protected]>

Ralf

2011-03-11 02:16:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/2] jump label: introduce static_branch() interface

On Fri, 2011-03-11 at 03:05 +0100, Ralf Baechle wrote:

> I'm happy with the change to arch/mips/include/asm/jump_label.h and
> assuming that the change to arch/mips/kernel/jump_label.c will be dropped

Noted.

>
> Acked-by: Ralf Baechle <[email protected]>

Jason,

Could you respin the patchset and send it out again with the comments
that were made. And please Cc Ralf on any patch that touches the mips
arch.

Thanks,

-- Steve

2011-03-11 02:39:17

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 0/2] jump label: update for .39

* Ralf Baechle ([email protected]) wrote:
> On Thu, Mar 10, 2011 at 04:22:14PM -0500, Mathieu Desnoyers wrote:
>
> > You seem to be contradicting yourself here. I'm concerned about having
> > "structures" of a size not power of two. Can we simply either
> >
> > - Add a padding element at the end
> > or
> > - use .align 4*sizeof(long) at the beginning
> >
> > to make sure the linker won't put any holes when it puts objects
> > together ?
>
> It may only be a technicality but but careful with .align. On some
> architectures .align x will align to a multiple of x; on other
> architectures to 2^x. Another reason to stick to C wherever possible.

Thanks for the reminder! Not sure if using C is practical in the case of
jump labels though. As Michael pointed out, using .balign might be
better here.

Mathieu

>
> Ralf

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com