Subject: [PATCH -tip v9 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalability efforts

Hi,
Here is version 9 of the NOKPROBE_SYMBOL series, including
bugfixes. This update addresses the issues pointed out in Steven's
review of v8 (thank you!).

Blacklist improvements
======================
Currently, kprobes uses the __kprobes annotation and an internal
symbol-name based blacklist to prohibit probing on some functions,
because probing those functions may cause an infinite recursive
loop via int3/debug exceptions.
However, the current mechanisms have several problems, especially
from the viewpoint of code maintenance:
- __kprobes is easily mistaken to mean that the function is
  used by kprobes, although it actually means "no kprobe
  on it".
- __kprobes moves functions to a different section, which
  is not good for cache optimization.
- the symbol-name based blacklist is fragile, since symbol
  names can easily be changed without anyone noticing.
- it doesn't support functions in modules at all.

Thus, I decided to introduce a new NOKPROBE_SYMBOL macro for building
an integrated kprobe blacklist.

The new macro stores the addresses of the given symbols in the
_kprobe_blacklist section, and the blacklist is initialized from that
address list at boot time.
This also applies to modules: when a module is loaded, kprobes
automatically finds the blacklisted symbols in the module's
_kprobe_blacklist section.
This series also replaces all __kprobes annotations in x86 and
generic code with NOKPROBE_SYMBOL().
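
For illustration, the mechanism can be sketched roughly like this (the
actual macro is introduced by patch 04/26 and is not quoted in this
mail, so the exact attributes and the _kbl_addr_ prefix below are only
an assumption of how it may look):
----
/*
 * Record the address of "fname" in the _kprobe_blacklist section.
 * The kprobes core walks this section at boot time (and at module
 * load time for modules) to build the integrated blacklist.
 */
#define __NOKPROBE_SYMBOL(fname)				\
static unsigned long __used					\
	__attribute__((__section__("_kprobe_blacklist")))	\
	_kbl_addr_##fname = (unsigned long)fname;
#define NOKPROBE_SYMBOL(fname)	__NOKPROBE_SYMBOL(fname)
----
The vmlinux linker script (and the module loader) only has to collect
the _kprobe_blacklist section so that the kprobes core can iterate
over it; see include/asm-generic/vmlinux.lds.h and kernel/module.c in
the diffstat below.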

The new blacklist still supports the old-style __kprobes annotation by
decoding .kprobes.text if it exists, because __kprobes is still used
by arch-dependent code other than x86.

Scalability effort
==================
This series fixes not only the "qualitative" bugs which can crash the
kernel, but also "quantitative" issues with a massive number of
kprobes. Thus we can now run a stress test that puts kprobes on all
(non-blacklisted) kernel functions and enables all of them.
To set kprobes on all kernel functions, run the script below.
----
#!/bin/sh
TRACE_DIR=/sys/kernel/debug/tracing/
echo > $TRACE_DIR/kprobe_events
grep -iw t /proc/kallsyms | tr -d . | \
awk 'BEGIN{i=0};{print("p:"$3"_"i, "0x"$1); i++}' | \
while read l; do echo $l >> $TRACE_DIR/kprobe_events ; done
----
Since it doesn't check the blacklist at all, you'll see many write
errors, but that's not a problem :).

Note that a performance issue still remains in the kprobe tracer if
you trace all functions. Since a few ftrace functions are called
inside the kprobe tracer even when tracing is shut off (tracing_on
= 0), enabling kprobe events on those functions causes a severe
performance impact (it is safe, but you'll see the system slow down
and no events recorded, because they are just ignored).
To find those functions, you can use the third column of
(debugfs)/tracing/kprobe_profile as shown below, which gives the
number of missed (ignored) hits for each event. Events that have a
small number in the 2nd column and a large number in the 3rd column
may cause the slowdown.
----
# sort -rnk 3 (debugfs)/tracing/kprobe_profile | head
ftrace_cmp_recs_4907 264950231 33648874543
ring_buffer_lock_reserve_5087 0 4802719935
trace_buffer_lock_reserve_5199 0 4385319303
trace_event_buffer_lock_reserve_5200 0 4379968153
ftrace_location_range_4918 18944015 2407616669
bsearch_17098 18979815 2407579741
ftrace_location_4972 18927061 2406723128
ftrace_int3_handler_1211 18926980 2406303531
poke_int3_handler_199 18448012 1403516611
inat_get_opcode_attribute_16941 0 12715314
----

I'd recommend enabling the events on such functions only after all
other events have been enabled; that keeps their performance impact
to a minimum.

To enable kprobes on all kernel functions, run the script below.
----
#!/bin/sh
TRACE_DIR=/sys/kernel/debug/tracing
echo "Disable tracing to remove tracing overhead"
echo 0 > $TRACE_DIR/tracing_on

BADS="ftrace_cmp_recs ring_buffer_lock_reserve trace_buffer_lock_reserve trace_event_buffer_lock_reserve ftrace_location_range bsearch ftrace_location ftrace_int3_handler poke_int3_handler inat_get_opcode_attribute"
HIDES=
for i in $BADS; do HIDES=$HIDES" --hide=$i*"; done

SDATE=`date +%s`
echo "Enabling trace events: start at $SDATE"

cd $TRACE_DIR/events/kprobes/
for i in `ls $HIDES` ; do echo 1 > $i/enable; done
for j in $BADS; do for i in `ls -d $j*`;do echo 1 > $i/enable; done; done

EDATE=`date +%s`
TIME=`expr $EDATE - $SDATE`
echo "Elapsed time: $TIME"
----
Note: systemtap users probably don't need to worry about the bad
symbols above, since systemtap has its own logic to avoid probing
itself.

Result
======
These bad symbols were also enabled, after all other events had been
enabled. It took 2254 sec (without any intervals) to enable 37222
probes, and at that point perf top showed the result below:
----
Samples: 10K of event 'cycles', Event count (approx.): 270565996
+ 16.39% [kernel] [k] native_load_idt
+ 11.17% [kernel] [k] int3
- 7.91% [kernel] [k] 0x00007fffa018e8e0
- 0xffffffffa018d8e0
59.09% trace_event_buffer_lock_reserve
kprobe_trace_func
kprobe_dispatcher
+ 40.45% trace_event_buffer_lock_reserve
----
0x00007fffa018e8e0 is probably the trampoline buffer of an optimized
probe on trace_event_buffer_lock_reserve. native_load_idt and int3
are also called from normal kprobes.
This means that, at least in my environment, kprobes now passes the
stress test, and even with probes on all available functions the
system only slows down by about 50%.

Changes from v8:
- Add Steven's Reviewed-by tag to some patches (Thank you!)
- [1/26] Add WARN_ON() to check kprobe_status
- [4/26] Fix the documentation and a comment
- [8/26] Update the patch description
- [12/26] small style fix
- [14/26] Fix line-break style issues
- [16/26] Fix line-break style issues

Changes from v7:
- [24/26] Enlarge hash table to 512 instead of 4096.
- Re-evaluate the performance improvements.

Changes from v6:
- Updated patches on the latest -tip.
- [1/26] Add patch: Fix page-fault handling logic on x86 kprobes
- [2/26] Add patch: Allow to handle reentered kprobe on singlestepping
- [9/26] Add new patch: Call exception_enter after kprobes handled
- [12/26] Allow probing fetch functions in trace_uprobe.c.
- [24/26] Add new patch: Enlarge kprobes hash table size
- [25/26] Add new patch: Kprobe cache for frequently accessed kprobes
- [26/26] Add new patch: Skip Ftrace hlist check with ftrace-based kprobe

Changes from v5:
- [2/22] Introduce nokprobe_inline macro
- [6/22] Prohibit probing on memset/memcpy
- [11/22] Allow probing on text_poke/hw_breakpoint
- [12/22] Use nokprobe_inline macro instead of __always_inline
- [14/22] Ditto.
- [21/22] Remove preempt disable/enable from kprobes/x86
- [22/22] Add emergency int3 recovery code

Thank you,

---

Masami Hiramatsu (26):
[BUGFIX]kprobes/x86: Fix page-fault handling logic
kprobes/x86: Allow to handle reentered kprobe on singlestepping
kprobes: Prohibit probing on .entry.text code
kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
[BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*
[BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt
[BUGFIX] x86: Prohibit probing on thunk functions and restore
kprobes/x86: Call exception handlers directly from do_int3/do_debug
x86: Call exception_enter after kprobes handled
kprobes/x86: Allow probe on some kprobe preparation functions
kprobes: Allow probe on some kprobe functions
ftrace/*probes: Allow probing on some functions
x86: Allow kprobes on text_poke/hw_breakpoint
x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation
kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes
ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace
notifier: Use NOKPROBE_SYMBOL macro in notifier
sched: Use NOKPROBE_SYMBOL macro in sched
kprobes: Show blacklist entries via debugfs
kprobes: Support blacklist functions in module
kprobes: Use NOKPROBE_SYMBOL() in sample modules
kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text
kprobes/x86: Remove unneeded preempt_disable/enable in interrupt handlers
kprobes: Enlarge hash table to 512 entries
kprobes: Introduce kprobe cache to reduce cache misshits
ftrace: Introduce FTRACE_OPS_FL_SELF_FILTER for ftrace-kprobe


Documentation/kprobes.txt | 24 +
arch/Kconfig | 10
arch/x86/include/asm/asm.h | 7
arch/x86/include/asm/kprobes.h | 2
arch/x86/include/asm/traps.h | 2
arch/x86/kernel/alternative.c | 3
arch/x86/kernel/apic/hw_nmi.c | 3
arch/x86/kernel/cpu/common.c | 4
arch/x86/kernel/cpu/perf_event.c | 3
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 3
arch/x86/kernel/dumpstack.c | 9
arch/x86/kernel/entry_32.S | 33 --
arch/x86/kernel/entry_64.S | 20 -
arch/x86/kernel/hw_breakpoint.c | 5
arch/x86/kernel/kprobes/core.c | 165 ++++----
arch/x86/kernel/kprobes/ftrace.c | 19 +
arch/x86/kernel/kprobes/opt.c | 32 +-
arch/x86/kernel/kvm.c | 4
arch/x86/kernel/nmi.c | 18 +
arch/x86/kernel/paravirt.c | 6
arch/x86/kernel/traps.c | 35 +-
arch/x86/lib/thunk_32.S | 3
arch/x86/lib/thunk_64.S | 3
arch/x86/mm/fault.c | 29 +
include/asm-generic/vmlinux.lds.h | 9
include/linux/compiler.h | 2
include/linux/ftrace.h | 3
include/linux/kprobes.h | 23 +
include/linux/module.h | 5
kernel/kprobes.c | 607 +++++++++++++++++++++---------
kernel/module.c | 6
kernel/notifier.c | 22 +
kernel/sched/core.c | 7
kernel/trace/ftrace.c | 3
kernel/trace/trace_event_perf.c | 5
kernel/trace/trace_kprobe.c | 71 ++--
kernel/trace/trace_probe.c | 65 ++-
kernel/trace/trace_probe.h | 15 -
kernel/trace/trace_uprobe.c | 20 -
samples/kprobes/jprobe_example.c | 1
samples/kprobes/kprobe_example.c | 3
samples/kprobes/kretprobe_example.c | 2
42 files changed, 830 insertions(+), 481 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]


Subject: [PATCH -tip v9 01/26] [BUGFIX]kprobes/x86: Fix page-fault handling logic

The current kprobes in-kernel page fault handler doesn't
expect that its single-stepping can be interrupted by
an NMI handler which may itself cause a page fault (e.g.
perf with callback tracing).
In that case, the page fault is handled by kprobes, which
mistakenly assumes the fault was caused by the
single-stepping code and tries to recover the IP address
to the probed address.
But the truth is that the page fault was caused by the
NMI handler, and do_page_fault fails to handle the real
page fault because the IP address has been modified, which
causes kernel BUGs like the one below.

----
[ 2264.726905] BUG: unable to handle kernel NULL pointer
dereference at 0000000000000020
[ 2264.727190] IP: [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40
[ 2264.727380] PGD cbcd067 PUD cbcc067 PMD 0
[ 2264.727529] Oops: 0000 [#1] SMP
[ 2264.727683] Modules linked in: ipt_MASQUERADE bnep bluetooth 6lowpan_iphc iptable_nat nf_nat_ipv4 nf_nat aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep
[ 2264.728391] CPU: 1 PID: 25094 Comm: perf Not tainted 3.14.0-rc1.badprobe+ #24
[ 2264.728592] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 2264.728747] task: ffff88003db9c210 ti: ffff88000caac000 task.ti: ffff88000caac000
[ 2264.728950] RIP: 0010:[<ffffffff813c46e0>] [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40
[ 2264.729163] RSP: 0018:ffff88003fd06bd0 EFLAGS: 00010246
[ 2264.729291] RAX: 0000000000000000 RBX: ffff88003fd06bf8 RCX: 0000000000000002
[ 2264.729472] RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff88003fd06bf8
[ 2264.729661] RBP: ffff88003fd06bd8 R08: 0000000000000030 R09: 0000000000000000
[ 2264.729789] R10: 000000000000001e R11: 0000000000000015 R12: ffff88000caadfd8
[ 2264.729789] R13: ffff88003d76bc00 R14: ffff88003db9c210 R15: 0000000000000020
[ 2264.729789] FS: 00007f398bbcc780(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[ 2264.729789] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2264.729789] CR2: 0000000000000020 CR3: 00000000204f2000 CR4: 00000000000007e0
[ 2264.729789] Stack:
[ 2264.729789] ffffffff813c5fd4 ffff88003fd06c30 ffffffff810183b0 ffff88003d76bc00
[ 2264.729789] ffff88003fd06ef8 0000000000000000 0000000000000000 ffff88003d76bc00
[ 2264.729789] 000000000000000c 0000000000052ce0 ffff88000956f800 ffff88000caadf58
[ 2264.729789] Call Trace:
[ 2264.729789] <NMI>
[ 2264.729789] [<ffffffff813c5fd4>] ? copy_from_user_nmi+0x64/0x70
[ 2264.729789] [<ffffffff810183b0>] perf_callchain_user+0xc0/0x220
[ 2264.729789] [<ffffffff8111cdb4>] perf_callchain+0x1c4/0x210
[ 2264.729789] [<ffffffff811193e3>] perf_prepare_sample+0x253/0x320
[ 2264.729789] [<ffffffff81119597>] __perf_event_overflow+0xe7/0x230
[ 2264.729789] [<ffffffff810177c8>] ? x86_perf_event_set_period+0xe8/0x150
[ 2264.729789] [<ffffffff8111a034>] perf_event_overflow+0x14/0x20
[ 2264.729789] [<ffffffff8101cf2d>] intel_pmu_handle_irq+0x1cd/0x400
[ 2264.729789] [<ffffffff81854c71>] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789] [<ffffffff813c46e0>] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789] [<ffffffff810161bb>] perf_event_nmi_handler+0x2b/0x50
[ 2264.729789] [<ffffffff81006ce8>] nmi_handle+0x88/0x180
[ 2264.729789] [<ffffffff813c46e0>] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789] [<ffffffff81006fca>] default_do_nmi+0x4a/0x140
[ 2264.729789] [<ffffffff81007168>] do_nmi+0xa8/0xe0
[ 2264.729789] [<ffffffff81856fe1>] end_repeat_nmi+0x1e/0x2e
[ 2264.729789] [<ffffffff813c46e0>] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789] [<ffffffff810376ac>] ? skip_prefixes+0x1c/0x40
[ 2264.729789] [<ffffffff813c4c80>] ? bad_get_user+0x17/0x17
[ 2264.729789] [<ffffffff81854c71>] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789] [<ffffffff81854c71>] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789] [<ffffffff81854c71>] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789] <<EOE>>
[ 2264.729789] <#DB> [<ffffffff813c46e0>] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789] [<ffffffff813c46e1>] ? copy_user_generic_string+0x1/0x40
[ 2264.729789] [<ffffffff810e4321>] ? ftrace_cmp_recs+0x1/0x30
[ 2264.729789] [<ffffffff813c4c85>] ? inat_get_opcode_attribute+0x5/0x20
[ 2264.729789] [<ffffffff813c4c85>] ? inat_get_opcode_attribute+0x5/0x20
[ 2264.729789] [<ffffffff810376ac>] ? skip_prefixes+0x1c/0x40
[ 2264.729789] [<ffffffff81038107>] resume_execution+0x37/0x1d0
[ 2264.729789] [<ffffffff810382df>] kprobe_debug_handler+0x3f/0xe0
[ 2264.729789] [<ffffffff8100305f>] do_debug+0x7f/0x1d0
[ 2264.729789] [<ffffffff81856afa>] debug+0x3a/0x50
[ 2264.729789] <<EOE>>
[ 2264.729789] [<ffffffff811aae38>] ? seq_read+0x88/0x390
[ 2264.729789] [<ffffffff81363bc4>] ? security_file_permission+0x84/0xa0
[ 2264.729789] [<ffffffff811ed58d>] proc_reg_read+0x3d/0x80
[ 2264.729789] [<ffffffff81187c7b>] vfs_read+0x9b/0x160
[ 2264.729789] [<ffffffff81188739>] SyS_read+0x49/0xa0
[ 2264.729789] [<ffffffff81854f99>] system_call_fastpath+0x16/0x1b
[ 2264.729789] Code: c9 75 ee 21 d2 74 10 89 d1 8a 06 88 07 48 ff c6 48 ff c7 ff
c9 75 f2 31 c0 0f 1f 00 c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 <cc> 1f 00
83 fa 08 72 27 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29
[ 2264.729789] RIP [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40
[ 2264.729789] RSP <ffff88003fd06bd0>
[ 2264.729789] CR2: 0000000000000020
[ 2264.729789] ---[ end trace 533fc16b4cc45447 ]---
[ 2264.729789] Kernel panic - not syncing: Fatal exception in interrupt
[ 2264.729789] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xf
fffffff80000000-0xffffffff9fffffff)
----

To handle this correctly, I fixed the kprobes fault
handler so that it checks whether the faulting IP address
points into its own single-step buffer, instead of relying
only on the current kprobe state.

Changes from v8:
- add WARN_ON() to check kprobe_status.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 79a3f96..137e873 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -897,9 +897,10 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

- switch (kcb->kprobe_status) {
- case KPROBE_HIT_SS:
- case KPROBE_REENTER:
+ if (unlikely(regs->ip == (unsigned long)cur->ainsn.insn)) {
+ /* This must happen on singlestepping */
+ WARN_ON(kcb->kprobe_status != KPROBE_HIT_SS &&
+ kcb->kprobe_status != KPROBE_REENTER);
/*
* We are here because the instruction being single
* stepped caused a page fault. We reset the current
@@ -914,9 +915,8 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
else
reset_current_kprobe();
preempt_enable_no_resched();
- break;
- case KPROBE_HIT_ACTIVE:
- case KPROBE_HIT_SSDONE:
+ } else if (kcb->kprobe_status == KPROBE_HIT_ACTIVE ||
+ kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/*
* We increment the nmissed count for accounting,
* we can also use npre/npostfault count for accounting
@@ -945,10 +945,8 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
* fixup routine could not handle it,
* Let do_page_fault() fix it.
*/
- break;
- default:
- break;
}
+
return 0;
}


Subject: [PATCH -tip v9 07/26] [BUGFIX] x86: Prohibit probing on thunk functions and restore

The thunk/restore functions are also used for irqs-off tracing etc.,
and they are involved in kprobe's exception handling.
Prohibit probing on them to avoid a kernel crash.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/lib/thunk_32.S | 3 ++-
arch/x86/lib/thunk_64.S | 3 +++
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/thunk_32.S b/arch/x86/lib/thunk_32.S
index 2930ae0..28f85c91 100644
--- a/arch/x86/lib/thunk_32.S
+++ b/arch/x86/lib/thunk_32.S
@@ -4,8 +4,8 @@
* (inspired by Andi Kleen's thunk_64.S)
* Subject to the GNU public license, v.2. No warranty of any kind.
*/
-
#include <linux/linkage.h>
+ #include <asm/asm.h>

#ifdef CONFIG_TRACE_IRQFLAGS
/* put return address in eax (arg1) */
@@ -22,6 +22,7 @@
popl %ecx
popl %eax
ret
+ _ASM_NOKPROBE(\name)
.endm

thunk_ra trace_hardirqs_on_thunk,trace_hardirqs_on_caller
diff --git a/arch/x86/lib/thunk_64.S b/arch/x86/lib/thunk_64.S
index a63efd6..92d9fea 100644
--- a/arch/x86/lib/thunk_64.S
+++ b/arch/x86/lib/thunk_64.S
@@ -8,6 +8,7 @@
#include <linux/linkage.h>
#include <asm/dwarf2.h>
#include <asm/calling.h>
+#include <asm/asm.h>

/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
@@ -25,6 +26,7 @@
call \func
jmp restore
CFI_ENDPROC
+ _ASM_NOKPROBE(\name)
.endm

#ifdef CONFIG_TRACE_IRQFLAGS
@@ -43,3 +45,4 @@ restore:
RESTORE_ARGS
ret
CFI_ENDPROC
+ _ASM_NOKPROBE(restore)

Subject: [PATCH -tip v9 05/26] [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*

Prohibit probing on debug_stack_reset and debug_stack_set_zero.
Since both functions are called from the TRACE_IRQS_ON/OFF_DEBUG
macros, which run in the int3 IST entry, probing them may cause a
soft lockup.

This happens when the kernel is built with CONFIG_DYNAMIC_FTRACE=y
and CONFIG_TRACE_IRQFLAGS=y.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Seiji Aguchi <[email protected]>
---
arch/x86/kernel/cpu/common.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a135239..5af696d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -8,6 +8,7 @@
#include <linux/delay.h>
#include <linux/sched.h>
#include <linux/init.h>
+#include <linux/kprobes.h>
#include <linux/kgdb.h>
#include <linux/smp.h>
#include <linux/io.h>
@@ -1160,6 +1161,7 @@ int is_debug_stack(unsigned long addr)
(addr <= __get_cpu_var(debug_stack_addr) &&
addr > (__get_cpu_var(debug_stack_addr) - DEBUG_STKSZ));
}
+NOKPROBE_SYMBOL(is_debug_stack);

DEFINE_PER_CPU(u32, debug_idt_ctr);

@@ -1168,6 +1170,7 @@ void debug_stack_set_zero(void)
this_cpu_inc(debug_idt_ctr);
load_current_idt();
}
+NOKPROBE_SYMBOL(debug_stack_set_zero);

void debug_stack_reset(void)
{
@@ -1176,6 +1179,7 @@ void debug_stack_reset(void)
if (this_cpu_dec_return(debug_idt_ctr) == 0)
load_current_idt();
}
+NOKPROBE_SYMBOL(debug_stack_reset);

#else /* CONFIG_X86_64 */


Subject: [PATCH -tip v9 10/26] kprobes/x86: Allow probe on some kprobe preparation functions

There is no need to prohibit probing on the functions
used in the preparation phase. They can safely be probed
because they are not invoked from the breakpoint/fault/debug
handlers, so there is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
can_boost
can_probe
can_optimize
is_IF_modifier
__copy_instruction
copy_optimized_instructions
arch_copy_kprobe
arch_prepare_kprobe
arch_arm_kprobe
arch_disarm_kprobe
arch_remove_kprobe
arch_trampoline_kprobe
arch_prepare_kprobe_ftrace
arch_prepare_optimized_kprobe
arch_check_optimized_kprobe
arch_within_optimized_kprobe
__arch_remove_optimized_kprobe
arch_remove_optimized_kprobe
arch_optimize_kprobes
arch_unoptimize_kprobe

I tested those functions by putting kprobes on all
instructions in them with the bash script I sent to
LKML. See below:

https://lkml.org/lkml/2014/3/27/33

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 20 ++++++++++----------
arch/x86/kernel/kprobes/ftrace.c | 2 +-
arch/x86/kernel/kprobes/opt.c | 24 ++++++++++++------------
3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 4157d648..147e65e1 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -159,7 +159,7 @@ static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
* Returns non-zero if opcode is boostable.
* RIP relative instructions are adjusted at copying time in 64 bits mode
*/
-int __kprobes can_boost(kprobe_opcode_t *opcodes)
+int can_boost(kprobe_opcode_t *opcodes)
{
kprobe_opcode_t opcode;
kprobe_opcode_t *orig_opcodes = opcodes;
@@ -260,7 +260,7 @@ unsigned long recover_probed_instruction(kprobe_opcode_t *buf, unsigned long add
}

/* Check if paddr is at an instruction boundary */
-static int __kprobes can_probe(unsigned long paddr)
+static int can_probe(unsigned long paddr)
{
unsigned long addr, __addr, offset = 0;
struct insn insn;
@@ -299,7 +299,7 @@ static int __kprobes can_probe(unsigned long paddr)
/*
* Returns non-zero if opcode modifies the interrupt flag.
*/
-static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
+static int is_IF_modifier(kprobe_opcode_t *insn)
{
/* Skip prefixes */
insn = skip_prefixes(insn);
@@ -322,7 +322,7 @@ static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
* If not, return null.
* Only applicable to 64-bit x86.
*/
-int __kprobes __copy_instruction(u8 *dest, u8 *src)
+int __copy_instruction(u8 *dest, u8 *src)
{
struct insn insn;
kprobe_opcode_t buf[MAX_INSN_SIZE];
@@ -365,7 +365,7 @@ int __kprobes __copy_instruction(u8 *dest, u8 *src)
return insn.length;
}

-static int __kprobes arch_copy_kprobe(struct kprobe *p)
+static int arch_copy_kprobe(struct kprobe *p)
{
int ret;

@@ -392,7 +392,7 @@ static int __kprobes arch_copy_kprobe(struct kprobe *p)
return 0;
}

-int __kprobes arch_prepare_kprobe(struct kprobe *p)
+int arch_prepare_kprobe(struct kprobe *p)
{
if (alternatives_text_reserved(p->addr, p->addr))
return -EINVAL;
@@ -407,17 +407,17 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return arch_copy_kprobe(p);
}

-void __kprobes arch_arm_kprobe(struct kprobe *p)
+void arch_arm_kprobe(struct kprobe *p)
{
text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
}

-void __kprobes arch_disarm_kprobe(struct kprobe *p)
+void arch_disarm_kprobe(struct kprobe *p)
{
text_poke(p->addr, &p->opcode, 1);
}

-void __kprobes arch_remove_kprobe(struct kprobe *p)
+void arch_remove_kprobe(struct kprobe *p)
{
if (p->ainsn.insn) {
free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
@@ -1060,7 +1060,7 @@ int __init arch_init_kprobes(void)
return 0;
}

-int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+int arch_trampoline_kprobe(struct kprobe *p)
{
return 0;
}
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 23ef5c5..dcaa131 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -85,7 +85,7 @@ end:
local_irq_restore(flags);
}

-int __kprobes arch_prepare_kprobe_ftrace(struct kprobe *p)
+int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
p->ainsn.insn = NULL;
p->ainsn.boostable = -1;
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 898160b..fba7fb0 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -77,7 +77,7 @@ found:
}

/* Insert a move instruction which sets a pointer to eax/rdi (1st arg). */
-static void __kprobes synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
+static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
{
#ifdef CONFIG_X86_64
*addr++ = 0x48;
@@ -169,7 +169,7 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_
local_irq_restore(flags);
}

-static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
+static int copy_optimized_instructions(u8 *dest, u8 *src)
{
int len = 0, ret;

@@ -189,7 +189,7 @@ static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
}

/* Check whether insn is indirect jump */
-static int __kprobes insn_is_indirect_jump(struct insn *insn)
+static int insn_is_indirect_jump(struct insn *insn)
{
return ((insn->opcode.bytes[0] == 0xff &&
(X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
@@ -224,7 +224,7 @@ static int insn_jump_into_range(struct insn *insn, unsigned long start, int len)
}

/* Decode whole function to ensure any instructions don't jump into target */
-static int __kprobes can_optimize(unsigned long paddr)
+static int can_optimize(unsigned long paddr)
{
unsigned long addr, size = 0, offset = 0;
struct insn insn;
@@ -275,7 +275,7 @@ static int __kprobes can_optimize(unsigned long paddr)
}

/* Check optimized_kprobe can actually be optimized. */
-int __kprobes arch_check_optimized_kprobe(struct optimized_kprobe *op)
+int arch_check_optimized_kprobe(struct optimized_kprobe *op)
{
int i;
struct kprobe *p;
@@ -290,15 +290,15 @@ int __kprobes arch_check_optimized_kprobe(struct optimized_kprobe *op)
}

/* Check the addr is within the optimized instructions. */
-int __kprobes
-arch_within_optimized_kprobe(struct optimized_kprobe *op, unsigned long addr)
+int arch_within_optimized_kprobe(struct optimized_kprobe *op,
+ unsigned long addr)
{
return ((unsigned long)op->kp.addr <= addr &&
(unsigned long)op->kp.addr + op->optinsn.size > addr);
}

/* Free optimized instruction slot */
-static __kprobes
+static
void __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
{
if (op->optinsn.insn) {
@@ -308,7 +308,7 @@ void __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
}
}

-void __kprobes arch_remove_optimized_kprobe(struct optimized_kprobe *op)
+void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
{
__arch_remove_optimized_kprobe(op, 1);
}
@@ -318,7 +318,7 @@ void __kprobes arch_remove_optimized_kprobe(struct optimized_kprobe *op)
* Target instructions MUST be relocatable (checked inside)
* This is called when new aggr(opt)probe is allocated or reused.
*/
-int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
+int arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
{
u8 *buf;
int ret;
@@ -372,7 +372,7 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
* Replace breakpoints (int3) with relative jumps.
* Caller must call with locking kprobe_mutex and text_mutex.
*/
-void __kprobes arch_optimize_kprobes(struct list_head *oplist)
+void arch_optimize_kprobes(struct list_head *oplist)
{
struct optimized_kprobe *op, *tmp;
u8 insn_buf[RELATIVEJUMP_SIZE];
@@ -398,7 +398,7 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
}

/* Replace a relative jump with a breakpoint (int3). */
-void __kprobes arch_unoptimize_kprobe(struct optimized_kprobe *op)
+void arch_unoptimize_kprobe(struct optimized_kprobe *op)
{
u8 insn_buf[RELATIVEJUMP_SIZE];


Subject: [PATCH -tip v9 13/26] x86: Allow kprobes on text_poke/hw_breakpoint

Allow kprobes on text_poke/hw_breakpoint because they
are not related to kprobes' critical int3-debug recursive
path at this moment.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/alternative.c | 3 +--
arch/x86/kernel/hw_breakpoint.c | 5 ++---
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index df94598..703130f 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -5,7 +5,6 @@
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/stringify.h>
-#include <linux/kprobes.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <linux/memory.h>
@@ -551,7 +550,7 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
*
* Note: Must be called under text_mutex.
*/
-void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+void *text_poke(void *addr, const void *opcode, size_t len)
{
unsigned long flags;
char *vaddr;
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index a67b47c..5f9cf20 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,7 +32,6 @@
#include <linux/irqflags.h>
#include <linux/notifier.h>
#include <linux/kallsyms.h>
-#include <linux/kprobes.h>
#include <linux/percpu.h>
#include <linux/kdebug.h>
#include <linux/kernel.h>
@@ -424,7 +423,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
* NOTIFY_STOP returned for all other cases
*
*/
-static int __kprobes hw_breakpoint_handler(struct die_args *args)
+static int hw_breakpoint_handler(struct die_args *args)
{
int i, cpu, rc = NOTIFY_STOP;
struct perf_event *bp;
@@ -511,7 +510,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
/*
* Handle debug exception notifications.
*/
-int __kprobes hw_breakpoint_exceptions_notify(
+int hw_breakpoint_exceptions_notify(
struct notifier_block *unused, unsigned long val, void *data)
{
if (val != DIE_DEBUG)

Subject: [PATCH -tip v9 12/26] ftrace/*probes: Allow probing on some functions

There is no need to prohibit probing on the functions
used for preparation, nor on the uprobe-only fetch functions.
They can safely be probed because they are not invoked
from kprobe's breakpoint/fault/debug handlers, so there
is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
update_bitfield_fetch_param
free_bitfield_fetch_param
kprobe_register
FETCH_FUNC_NAME(stack, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string_size) in trace_uprobe.c
FETCH_FUNC_NAME(file_offset, type) in trace_uprobe.c

Changes from v8:
- small style fix (Thanks to Steven Rostedt!)

Changes from v6:
- allow probing fetch functions in trace_uprobe.c

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
---
kernel/trace/trace_kprobe.c | 5 ++---
kernel/trace/trace_probe.c | 4 ++--
kernel/trace/trace_uprobe.c | 20 ++++++++++----------
3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 903ae28..aa5f0bf 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1196,9 +1196,8 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
* kprobe_trace_self_tests_init() does enable_trace_probe/disable_trace_probe
* lockless, but we can't race with this __init function.
*/
-static __kprobes
-int kprobe_register(struct ftrace_event_call *event,
- enum trace_reg type, void *data)
+static int kprobe_register(struct ftrace_event_call *event,
+ enum trace_reg type, void *data)
{
struct trace_kprobe *tk = (struct trace_kprobe *)event->data;
struct ftrace_event_file *file = data;
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 8364a42..d3a91e4 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -183,7 +183,7 @@ DEFINE_BASIC_FETCH_FUNCS(bitfield)
#define fetch_bitfield_string NULL
#define fetch_bitfield_string_size NULL

-static __kprobes void
+static void
update_bitfield_fetch_param(struct bitfield_fetch_param *data)
{
/*
@@ -196,7 +196,7 @@ update_bitfield_fetch_param(struct bitfield_fetch_param *data)
update_symbol_cache(data->orig.data);
}

-static __kprobes void
+static void
free_bitfield_fetch_param(struct bitfield_fetch_param *data)
{
/*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 930e514..42c28fd 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -108,8 +108,8 @@ static unsigned long get_user_stack_nth(struct pt_regs *regs, unsigned int n)
* Uprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
+ void *offset, void *dest) \
{ \
*(type *)dest = (type)get_user_stack_nth(regs, \
((unsigned long)offset)); \
@@ -120,8 +120,8 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
#define fetch_stack_string_size NULL

#define DEFINE_FETCH_memory(type) \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
- void *addr, void *dest) \
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+ void *addr, void *dest) \
{ \
type retval; \
void __user *vaddr = (void __force __user *) addr; \
@@ -136,8 +136,8 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
{
long ret;
u32 rloc = *(u32 *)dest;
@@ -158,8 +158,8 @@ static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
}
}

-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
{
int len;
void __user *vaddr = (void __force __user *) addr;
@@ -184,8 +184,8 @@ static unsigned long translate_user_vaddr(void *file_offset)
}

#define DEFINE_FETCH_file_offset(type) \
-static __kprobes void FETCH_FUNC_NAME(file_offset, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+static void FETCH_FUNC_NAME(file_offset, type)(struct pt_regs *regs, \
+ void *offset, void *dest)\
{ \
void *vaddr = (void *)translate_user_vaddr(offset); \
\

Subject: [PATCH -tip v9 14/26] x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation

Use the NOKPROBE_SYMBOL() macro instead of the __kprobes
annotation to protect functions from kprobes under arch/x86.

In some places the nokprobe_inline annotation is applied instead,
because NOKPROBE_SYMBOL() inhibits inlining by taking the symbol's
address.

This just folds a bunch of previous NOKPROBE_SYMBOL() cleanup
patches for x86 into one patch.
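
For reference, the intent of nokprobe_inline can be sketched as below
(the real definition is added to include/linux/compiler.h earlier in
this series and is not quoted here, so the exact form is an
assumption):
----
#ifdef CONFIG_KPROBES
/*
 * Force inlining so that the helper's body is absorbed into its
 * (already blacklisted) callers instead of existing as a separate,
 * probeable symbol.
 */
# define nokprobe_inline	__always_inline
#else
# define nokprobe_inline	inline
#endif
----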

Changes from v8:
- Fix line-break style issues.

Changes from v7:
- Use nokprobe_inline instead of __always_inline.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Seiji Aguchi <[email protected]>
---
arch/x86/include/asm/traps.h | 2 -
arch/x86/kernel/apic/hw_nmi.c | 3 +
arch/x86/kernel/cpu/perf_event.c | 3 +
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 3 +
arch/x86/kernel/dumpstack.c | 9 ++--
arch/x86/kernel/kprobes/core.c | 77 +++++++++++++++++++-----------
arch/x86/kernel/kprobes/ftrace.c | 15 ++++--
arch/x86/kernel/kprobes/opt.c | 8 ++-
arch/x86/kernel/kvm.c | 4 +-
arch/x86/kernel/nmi.c | 18 +++++--
arch/x86/kernel/traps.c | 20 +++++---
arch/x86/mm/fault.c | 29 +++++++----
12 files changed, 122 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 58d66fe..ca32508 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -68,7 +68,7 @@ dotraplinkage void do_segment_not_present(struct pt_regs *, long);
dotraplinkage void do_stack_segment(struct pt_regs *, long);
#ifdef CONFIG_X86_64
dotraplinkage void do_double_fault(struct pt_regs *, long);
-asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *);
+asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
#endif
dotraplinkage void do_general_protection(struct pt_regs *, long);
dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index a698d71..73eb5b3 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -60,7 +60,7 @@ void arch_trigger_all_cpu_backtrace(void)
smp_mb__after_clear_bit();
}

-static int __kprobes
+static int
arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
int cpu;
@@ -80,6 +80,7 @@ arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)

return NMI_DONE;
}
+NOKPROBE_SYMBOL(arch_trigger_all_cpu_backtrace_handler);

static int __init register_trigger_all_cpu_backtrace(void)
{
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ae407f7..5fc8771 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1292,7 +1292,7 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}

-static int __kprobes
+static int
perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{
u64 start_clock;
@@ -1310,6 +1310,7 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)

return ret;
}
+NOKPROBE_SYMBOL(perf_event_nmi_handler);

struct event_constraint emptyconstraint;
struct event_constraint unconstrained;
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 4c36bbe..cbb1be3e 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -593,7 +593,7 @@ out:
return 1;
}

-static int __kprobes
+static int
perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{
int handled = 0;
@@ -606,6 +606,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)

return handled;
}
+NOKPROBE_SYMBOL(perf_ibs_nmi_handler);

static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
{
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index d9c12d3..b74ebc7 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -200,7 +200,7 @@ static arch_spinlock_t die_lock = __ARCH_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;

-unsigned __kprobes long oops_begin(void)
+unsigned long oops_begin(void)
{
int cpu;
unsigned long flags;
@@ -223,8 +223,9 @@ unsigned __kprobes long oops_begin(void)
return flags;
}
EXPORT_SYMBOL_GPL(oops_begin);
+NOKPROBE_SYMBOL(oops_begin);

-void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
+void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
if (regs && kexec_should_crash(current))
crash_kexec(regs);
@@ -247,8 +248,9 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
panic("Fatal exception");
do_exit(signr);
}
+NOKPROBE_SYMBOL(oops_end);

-int __kprobes __die(const char *str, struct pt_regs *regs, long err)
+int __die(const char *str, struct pt_regs *regs, long err)
{
#ifdef CONFIG_X86_32
unsigned short ss;
@@ -291,6 +293,7 @@ int __kprobes __die(const char *str, struct pt_regs *regs, long err)
#endif
return 0;
}
+NOKPROBE_SYMBOL(__die);

/*
* This is gone through when something in the kernel has done something bad
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 147e65e1..1a92130 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -112,7 +112,8 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {

const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);

-static void __kprobes __synthesize_relative_insn(void *from, void *to, u8 op)
+static nokprobe_inline void
+__synthesize_relative_insn(void *from, void *to, u8 op)
{
struct __arch_relative_insn {
u8 op;
@@ -125,21 +126,23 @@ static void __kprobes __synthesize_relative_insn(void *from, void *to, u8 op)
}

/* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
-void __kprobes synthesize_reljump(void *from, void *to)
+void synthesize_reljump(void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVEJUMP_OPCODE);
}
+NOKPROBE_SYMBOL(synthesize_reljump);

/* Insert a call instruction at address 'from', which calls address 'to'.*/
-void __kprobes synthesize_relcall(void *from, void *to)
+void synthesize_relcall(void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVECALL_OPCODE);
}
+NOKPROBE_SYMBOL(synthesize_relcall);

/*
* Skip the prefixes of the instruction.
*/
-static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
+static kprobe_opcode_t *skip_prefixes(kprobe_opcode_t *insn)
{
insn_attr_t attr;

@@ -154,6 +157,7 @@ static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
#endif
return insn;
}
+NOKPROBE_SYMBOL(skip_prefixes);

/*
* Returns non-zero if opcode is boostable.
@@ -425,7 +429,8 @@ void arch_remove_kprobe(struct kprobe *p)
}
}

-static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+save_previous_kprobe(struct kprobe_ctlblk *kcb)
{
kcb->prev_kprobe.kp = kprobe_running();
kcb->prev_kprobe.status = kcb->kprobe_status;
@@ -433,7 +438,8 @@ static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
kcb->prev_kprobe.saved_flags = kcb->kprobe_saved_flags;
}

-static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+restore_previous_kprobe(struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
kcb->kprobe_status = kcb->prev_kprobe.status;
@@ -441,8 +447,9 @@ static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
kcb->kprobe_saved_flags = kcb->prev_kprobe.saved_flags;
}

-static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, p);
kcb->kprobe_saved_flags = kcb->kprobe_old_flags
@@ -451,7 +458,7 @@ static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
kcb->kprobe_saved_flags &= ~X86_EFLAGS_IF;
}

-static void __kprobes clear_btf(void)
+static nokprobe_inline void clear_btf(void)
{
if (test_thread_flag(TIF_BLOCKSTEP)) {
unsigned long debugctl = get_debugctlmsr();
@@ -461,7 +468,7 @@ static void __kprobes clear_btf(void)
}
}

-static void __kprobes restore_btf(void)
+static nokprobe_inline void restore_btf(void)
{
if (test_thread_flag(TIF_BLOCKSTEP)) {
unsigned long debugctl = get_debugctlmsr();
@@ -471,8 +478,7 @@ static void __kprobes restore_btf(void)
}
}

-void __kprobes
-arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
+void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
{
unsigned long *sara = stack_addr(regs);

@@ -481,9 +487,10 @@ arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
}
+NOKPROBE_SYMBOL(arch_prepare_kretprobe);

-static void __kprobes
-setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb, int reenter)
+static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb, int reenter)
{
if (setup_detour_execution(p, regs, reenter))
return;
@@ -519,14 +526,15 @@ setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *k
else
regs->ip = (unsigned long)p->ainsn.insn;
}
+NOKPROBE_SYMBOL(setup_singlestep);

/*
* We have reentered the kprobe_handler(), since another probe was hit while
* within the handler. We save the original kprobes variables and just single
* step on the instruction of the new probe without calling any user handlers.
*/
-static int __kprobes
-reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+static int reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
switch (kcb->kprobe_status) {
case KPROBE_HIT_SSDONE:
@@ -554,12 +562,13 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb

return 1;
}
+NOKPROBE_SYMBOL(reenter_kprobe);

/*
* Interrupts are disabled on entry as trap3 is an interrupt gate and they
* remain disabled throughout this function.
*/
-int __kprobes kprobe_int3_handler(struct pt_regs *regs)
+int kprobe_int3_handler(struct pt_regs *regs)
{
kprobe_opcode_t *addr;
struct kprobe *p;
@@ -622,12 +631,13 @@ int __kprobes kprobe_int3_handler(struct pt_regs *regs)
preempt_enable_no_resched();
return 0;
}
+NOKPROBE_SYMBOL(kprobe_int3_handler);

/*
* When a retprobed function returns, this code saves registers and
* calls trampoline_handler() runs, which calls the kretprobe's handler.
*/
-static void __used __kprobes kretprobe_trampoline_holder(void)
+static void __used kretprobe_trampoline_holder(void)
{
asm volatile (
".global kretprobe_trampoline\n"
@@ -658,11 +668,13 @@ static void __used __kprobes kretprobe_trampoline_holder(void)
#endif
" ret\n");
}
+NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
+NOKPROBE_SYMBOL(kretprobe_trampoline);

/*
* Called from kretprobe_trampoline
*/
-__visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
+__visible __used void *trampoline_handler(struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
@@ -748,6 +760,7 @@ __visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
}
return (void *)orig_ret_address;
}
+NOKPROBE_SYMBOL(trampoline_handler);

/*
* Called after single-stepping. p->addr is the address of the
@@ -776,8 +789,8 @@ __visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
* jump instruction after the copied instruction, that jumps to the next
* instruction after the probepoint.
*/
-static void __kprobes
-resume_execution(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+static void resume_execution(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
unsigned long *tos = stack_addr(regs);
unsigned long copy_ip = (unsigned long)p->ainsn.insn;
@@ -852,12 +865,13 @@ resume_execution(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *k
no_change:
restore_btf();
}
+NOKPROBE_SYMBOL(resume_execution);

/*
* Interrupts are disabled on entry as trap1 is an interrupt gate and they
* remain disabled throughout this function.
*/
-int __kprobes kprobe_debug_handler(struct pt_regs *regs)
+int kprobe_debug_handler(struct pt_regs *regs)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -892,8 +906,9 @@ out:

return 1;
}
+NOKPROBE_SYMBOL(kprobe_debug_handler);

-int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
+int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -950,12 +965,13 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)

return 0;
}
+NOKPROBE_SYMBOL(kprobe_fault_handler);

/*
* Wrapper routine for handling exceptions.
*/
-int __kprobes
-kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data)
+int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val,
+ void *data)
{
struct die_args *args = data;
int ret = NOTIFY_DONE;
@@ -975,8 +991,9 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
}
return ret;
}
+NOKPROBE_SYMBOL(kprobe_exceptions_notify);

-int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
+int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct jprobe *jp = container_of(p, struct jprobe, kp);
unsigned long addr;
@@ -1000,8 +1017,9 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
regs->ip = (unsigned long)(jp->entry);
return 1;
}
+NOKPROBE_SYMBOL(setjmp_pre_handler);

-void __kprobes jprobe_return(void)
+void jprobe_return(void)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

@@ -1017,8 +1035,10 @@ void __kprobes jprobe_return(void)
" nop \n"::"b"
(kcb->jprobe_saved_sp):"memory");
}
+NOKPROBE_SYMBOL(jprobe_return);
+NOKPROBE_SYMBOL(jprobe_return_end);

-int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
+int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
u8 *addr = (u8 *) (regs->ip - 1);
@@ -1046,6 +1066,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
}
+NOKPROBE_SYMBOL(longjmp_break_handler);

bool arch_within_kprobe_blacklist(unsigned long addr)
{
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index dcaa131..717b02a 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -25,8 +25,9 @@

#include "common.h"

-static int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+static nokprobe_inline
+int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
/*
* Emulate singlestep (and also recover regs->ip)
@@ -41,18 +42,19 @@ static int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
return 1;
}

-int __kprobes skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
if (kprobe_ftrace(p))
return __skip_singlestep(p, regs, kcb);
else
return 0;
}
+NOKPROBE_SYMBOL(skip_singlestep);

/* Ftrace callback handler for kprobes */
-void __kprobes kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
- struct ftrace_ops *ops, struct pt_regs *regs)
+void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
+ struct ftrace_ops *ops, struct pt_regs *regs)
{
struct kprobe *p;
struct kprobe_ctlblk *kcb;
@@ -84,6 +86,7 @@ void __kprobes kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
end:
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(kprobe_ftrace_handler);

int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index fba7fb0..f304773 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -138,7 +138,8 @@ asm (
#define INT3_SIZE sizeof(kprobe_opcode_t)

/* Optimized kprobe call back function: called from optinsn */
-static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
+static void
+optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
unsigned long flags;
@@ -168,6 +169,7 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(optimized_callback);

static int copy_optimized_instructions(u8 *dest, u8 *src)
{
@@ -424,8 +426,7 @@ extern void arch_unoptimize_kprobes(struct list_head *oplist,
}
}

-int __kprobes
-setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
+int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
{
struct optimized_kprobe *op;

@@ -441,3 +442,4 @@ setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
}
return 0;
}
+NOKPROBE_SYMBOL(setup_detour_execution);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 0331cb3..d81abcb 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -251,8 +251,9 @@ u32 kvm_read_and_reset_pf_reason(void)
return reason;
}
EXPORT_SYMBOL_GPL(kvm_read_and_reset_pf_reason);
+NOKPROBE_SYMBOL(kvm_read_and_reset_pf_reason);

-dotraplinkage void __kprobes
+dotraplinkage void
do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
{
enum ctx_state prev_state;
@@ -276,6 +277,7 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
break;
}
}
+NOKPROBE_SYMBOL(do_async_page_fault);

static void __init paravirt_ops_setup(void)
{
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index b4872b99..c3e985d 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -110,7 +110,7 @@ static void nmi_max_handler(struct irq_work *w)
a->handler, whole_msecs, decimal_msecs);
}

-static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b)
+static int nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b)
{
struct nmi_desc *desc = nmi_to_desc(type);
struct nmiaction *a;
@@ -146,6 +146,7 @@ static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2
/* return total number of NMI events handled */
return handled;
}
+NOKPROBE_SYMBOL(nmi_handle);

int __register_nmi_handler(unsigned int type, struct nmiaction *action)
{
@@ -208,7 +209,7 @@ void unregister_nmi_handler(unsigned int type, const char *name)
}
EXPORT_SYMBOL_GPL(unregister_nmi_handler);

-static __kprobes void
+static void
pci_serr_error(unsigned char reason, struct pt_regs *regs)
{
/* check to see if anyone registered against these types of errors */
@@ -238,8 +239,9 @@ pci_serr_error(unsigned char reason, struct pt_regs *regs)
reason = (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_SERR;
outb(reason, NMI_REASON_PORT);
}
+NOKPROBE_SYMBOL(pci_serr_error);

-static __kprobes void
+static void
io_check_error(unsigned char reason, struct pt_regs *regs)
{
unsigned long i;
@@ -269,8 +271,9 @@ io_check_error(unsigned char reason, struct pt_regs *regs)
reason &= ~NMI_REASON_CLEAR_IOCHK;
outb(reason, NMI_REASON_PORT);
}
+NOKPROBE_SYMBOL(io_check_error);

-static __kprobes void
+static void
unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
{
int handled;
@@ -298,11 +301,12 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)

pr_emerg("Dazed and confused, but trying to continue\n");
}
+NOKPROBE_SYMBOL(unknown_nmi_error);

static DEFINE_PER_CPU(bool, swallow_nmi);
static DEFINE_PER_CPU(unsigned long, last_nmi_rip);

-static __kprobes void default_do_nmi(struct pt_regs *regs)
+static void default_do_nmi(struct pt_regs *regs)
{
unsigned char reason = 0;
int handled;
@@ -401,6 +405,7 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
else
unknown_nmi_error(reason, regs);
}
+NOKPROBE_SYMBOL(default_do_nmi);

/*
* NMIs can hit breakpoints which will cause it to lose its
@@ -520,7 +525,7 @@ static inline void nmi_nesting_postprocess(void)
}
#endif

-dotraplinkage notrace __kprobes void
+dotraplinkage notrace void
do_nmi(struct pt_regs *regs, long error_code)
{
nmi_nesting_preprocess(regs);
@@ -537,6 +542,7 @@ do_nmi(struct pt_regs *regs, long error_code)
/* On i386, may loop back to preprocess */
nmi_nesting_postprocess();
}
+NOKPROBE_SYMBOL(do_nmi);

void stop_nmi(void)
{
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index ba9abe9..3c8ae7d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -106,7 +106,7 @@ static inline void preempt_conditional_cli(struct pt_regs *regs)
preempt_count_dec();
}

-static int __kprobes
+static nokprobe_inline int
do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
struct pt_regs *regs, long error_code)
{
@@ -136,7 +136,7 @@ do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
return -1;
}

-static void __kprobes
+static void
do_trap(int trapnr, int signr, char *str, struct pt_regs *regs,
long error_code, siginfo_t *info)
{
@@ -173,6 +173,7 @@ do_trap(int trapnr, int signr, char *str, struct pt_regs *regs,
else
force_sig(signr, tsk);
}
+NOKPROBE_SYMBOL(do_trap);

#define DO_ERROR(trapnr, signr, str, name) \
dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
@@ -263,7 +264,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
}
#endif

-dotraplinkage void __kprobes
+dotraplinkage void
do_general_protection(struct pt_regs *regs, long error_code)
{
struct task_struct *tsk;
@@ -309,9 +310,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_general_protection);

/* May run on IST stack. */
-dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_code)
+dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
{
enum ctx_state prev_state;

@@ -355,6 +357,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_int3);

#ifdef CONFIG_X86_64
/*
@@ -362,7 +365,7 @@ exit:
* for scheduling or signal handling. The actual stack switch is done in
* entry.S
*/
-asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
+asmlinkage struct pt_regs *sync_regs(struct pt_regs *eregs)
{
struct pt_regs *regs = eregs;
/* Did already sync */
@@ -381,6 +384,7 @@ asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
*regs = *eregs;
return regs;
}
+NOKPROBE_SYMBOL(sync_regs);
#endif

/*
@@ -407,7 +411,7 @@ asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
*
* May run on IST stack.
*/
-dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
+dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
{
struct task_struct *tsk = current;
enum ctx_state prev_state;
@@ -491,6 +495,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_debug);

/*
* Note that we play around with the 'TS' bit in an attempt to get
@@ -662,7 +667,7 @@ void math_state_restore(void)
}
EXPORT_SYMBOL_GPL(math_state_restore);

-dotraplinkage void __kprobes
+dotraplinkage void
do_device_not_available(struct pt_regs *regs, long error_code)
{
enum ctx_state prev_state;
@@ -688,6 +693,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
#endif
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_device_not_available);

#ifdef CONFIG_X86_32
dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 8e57229..f83bd0d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -8,7 +8,7 @@
#include <linux/kdebug.h> /* oops_begin/end, ... */
#include <linux/module.h> /* search_exception_table */
#include <linux/bootmem.h> /* max_low_pfn */
-#include <linux/kprobes.h> /* __kprobes, ... */
+#include <linux/kprobes.h> /* NOKPROBE_SYMBOL, ... */
#include <linux/mmiotrace.h> /* kmmio_handler, ... */
#include <linux/perf_event.h> /* perf_sw_event */
#include <linux/hugetlb.h> /* hstate_index_to_shift */
@@ -45,7 +45,7 @@ enum x86_pf_error_code {
* Returns 0 if mmiotrace is disabled, or if the fault is not
* handled by mmiotrace:
*/
-static inline int __kprobes
+static nokprobe_inline int
kmmio_fault(struct pt_regs *regs, unsigned long addr)
{
if (unlikely(is_kmmio_active()))
@@ -54,7 +54,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr)
return 0;
}

-static inline int __kprobes kprobes_fault(struct pt_regs *regs)
+static nokprobe_inline int kprobes_fault(struct pt_regs *regs)
{
int ret = 0;

@@ -261,7 +261,7 @@ void vmalloc_sync_all(void)
*
* Handle a fault on the vmalloc or module mapping area
*/
-static noinline __kprobes int vmalloc_fault(unsigned long address)
+static noinline int vmalloc_fault(unsigned long address)
{
unsigned long pgd_paddr;
pmd_t *pmd_k;
@@ -291,6 +291,7 @@ static noinline __kprobes int vmalloc_fault(unsigned long address)

return 0;
}
+NOKPROBE_SYMBOL(vmalloc_fault);

/*
* Did it hit the DOS screen memory VA from vm86 mode?
@@ -358,7 +359,7 @@ void vmalloc_sync_all(void)
*
* This assumes no large pages in there.
*/
-static noinline __kprobes int vmalloc_fault(unsigned long address)
+static noinline int vmalloc_fault(unsigned long address)
{
pgd_t *pgd, *pgd_ref;
pud_t *pud, *pud_ref;
@@ -425,6 +426,7 @@ static noinline __kprobes int vmalloc_fault(unsigned long address)

return 0;
}
+NOKPROBE_SYMBOL(vmalloc_fault);

#ifdef CONFIG_CPU_SUP_AMD
static const char errata93_warning[] =
@@ -927,7 +929,7 @@ static int spurious_fault_check(unsigned long error_code, pte_t *pte)
* There are no security implications to leaving a stale TLB when
* increasing the permissions on a page.
*/
-static noinline __kprobes int
+static noinline int
spurious_fault(unsigned long error_code, unsigned long address)
{
pgd_t *pgd;
@@ -975,6 +977,7 @@ spurious_fault(unsigned long error_code, unsigned long address)

return ret;
}
+NOKPROBE_SYMBOL(spurious_fault);

int show_unhandled_signals = 1;

@@ -1030,7 +1033,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
* {,trace_}do_page_fault() have notrace on. Having this an actual function
* guarantees there's a function trace entry.
*/
-static void __kprobes noinline
+static noinline void
__do_page_fault(struct pt_regs *regs, unsigned long error_code,
unsigned long address)
{
@@ -1253,8 +1256,9 @@ good_area:

up_read(&mm->mmap_sem);
}
+NOKPROBE_SYMBOL(__do_page_fault);

-dotraplinkage void __kprobes notrace
+dotraplinkage void notrace
do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
unsigned long address = read_cr2(); /* Get the faulting address */
@@ -1272,10 +1276,12 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_page_fault);

#ifdef CONFIG_TRACING
-static void trace_page_fault_entries(unsigned long address, struct pt_regs *regs,
- unsigned long error_code)
+static nokprobe_inline void
+trace_page_fault_entries(unsigned long address, struct pt_regs *regs,
+ unsigned long error_code)
{
if (user_mode(regs))
trace_page_fault_user(address, regs, error_code);
@@ -1283,7 +1289,7 @@ static void trace_page_fault_entries(unsigned long address, struct pt_regs *regs
trace_page_fault_kernel(address, regs, error_code);
}

-dotraplinkage void __kprobes notrace
+dotraplinkage void notrace
trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
/*
@@ -1300,4 +1306,5 @@ trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(trace_do_page_fault);
#endif /* CONFIG_TRACING */

Subject: [PATCH -tip v9 16/26] ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace

Use the NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of the __kprobes annotation in ftrace.
This applies the nokprobe_inline annotation in some cases,
because NOKPROBE_SYMBOL() inhibits inlining by referring
to the symbol address.
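
As a rough illustration of the conversion pattern (the function
names below are made up, they are not taken from this patch):
----
/* Before: __kprobes moves the function into .kprobes.text */
static int __kprobes my_handler_core(struct pt_regs *regs)
{
        return 0;
}

/* After: the function stays in .text, only its address is listed */
static int my_handler_core(struct pt_regs *regs)
{
        return 0;
}
NOKPROBE_SYMBOL(my_handler_core);

/*
 * Small helpers called from the handler are marked nokprobe_inline
 * instead, so they are inlined into their (already non-probeable)
 * callers; NOKPROBE_SYMBOL() on them would take the symbol address
 * and inhibit that inlining.
 */
static nokprobe_inline bool my_helper(struct pt_regs *regs)
{
        return user_mode(regs);
}
----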

Changes from v8:
- Fix a line-break style issue.

Changes from previous:
- Use nokprobe_inline for call_fetch (Thanks to Steven Rostedt)
- Use nokprobe_inline instead of __always_inline.
- Apply NOKPROBE_SYMBOL to __get_data_size and store_trace_args
(Thanks to Steven Rostedt)

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_event_perf.c | 5 ++-
kernel/trace/trace_kprobe.c | 66 +++++++++++++++++++++++----------------
kernel/trace/trace_probe.c | 61 ++++++++++++++++++++----------------
kernel/trace/trace_probe.h | 15 ++++-----
4 files changed, 82 insertions(+), 65 deletions(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index c894614..5d12bb4 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -248,8 +248,8 @@ void perf_trace_del(struct perf_event *p_event, int flags)
tp_event->class->reg(tp_event, TRACE_REG_PERF_DEL, p_event);
}

-__kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
- struct pt_regs *regs, int *rctxp)
+void *perf_trace_buf_prepare(int size, unsigned short type,
+ struct pt_regs *regs, int *rctxp)
{
struct trace_entry *entry;
unsigned long flags;
@@ -281,6 +281,7 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
return raw_data;
}
EXPORT_SYMBOL_GPL(perf_trace_buf_prepare);
+NOKPROBE_SYMBOL(perf_trace_buf_prepare);

#ifdef CONFIG_FUNCTION_TRACER
static void
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index aa5f0bf..242e4ec 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -40,27 +40,27 @@ struct trace_kprobe {
(sizeof(struct probe_arg) * (n)))


-static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
{
return tk->rp.handler != NULL;
}

-static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
+static nokprobe_inline const char *trace_kprobe_symbol(struct trace_kprobe *tk)
{
return tk->symbol ? tk->symbol : "unknown";
}

-static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
+static nokprobe_inline unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
{
return tk->rp.kp.offset;
}

-static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
{
return !!(kprobe_gone(&tk->rp.kp));
}

-static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
struct module *mod)
{
int len = strlen(mod->name);
@@ -68,7 +68,7 @@ static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
}

-static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
{
return !!strchr(trace_kprobe_symbol(tk), ':');
}
@@ -132,19 +132,21 @@ struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
* Kprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
(unsigned int)((unsigned long)offset)); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(stack, type));
+
DEFINE_BASIC_FETCH_FUNCS(stack)
/* No string on the stack entry */
#define fetch_stack_string NULL
#define fetch_stack_string_size NULL

#define DEFINE_FETCH_memory(type) \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
void *addr, void *dest) \
{ \
type retval; \
@@ -152,14 +154,16 @@ static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
*(type *)dest = 0; \
else \
*(type *)dest = retval; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, type));
+
DEFINE_BASIC_FETCH_FUNCS(memory)
/*
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
{
long ret;
int maxlen = get_rloc_len(*(u32 *)dest);
@@ -193,10 +197,11 @@ static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
get_rloc_offs(*(u32 *)dest));
}
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, string));

/* Return the length of string -- including null terminal byte */
-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
{
mm_segment_t old_fs;
int ret, len = 0;
@@ -219,17 +224,19 @@ static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
else
*(u32 *)dest = len;
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, string_size));

#define DEFINE_FETCH_symbol(type) \
-__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, void *data, void *dest)\
{ \
struct symbol_cache *sc = data; \
if (sc->addr) \
fetch_memory_##type(regs, (void *)sc->addr, dest); \
else \
*(type *)dest = 0; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(symbol, type));
+
DEFINE_BASIC_FETCH_FUNCS(symbol)
DEFINE_FETCH_symbol(string)
DEFINE_FETCH_symbol(string_size)
@@ -907,7 +914,7 @@ static const struct file_operations kprobe_profile_ops = {
};

/* Kprobe handler */
-static __kprobes void
+static nokprobe_inline void
__kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
{
@@ -943,7 +950,7 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
entry, irq_flags, pc, regs);
}

-static __kprobes void
+static void
kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
struct event_file_link *link;
@@ -951,9 +958,10 @@ kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs)
list_for_each_entry_rcu(link, &tk->tp.files, list)
__kprobe_trace_func(tk, regs, link->file);
}
+NOKPROBE_SYMBOL(kprobe_trace_func);

/* Kretprobe handler */
-static __kprobes void
+static nokprobe_inline void
__kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
@@ -991,7 +999,7 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
entry, irq_flags, pc, regs);
}

-static __kprobes void
+static void
kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
@@ -1000,6 +1008,7 @@ kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
list_for_each_entry_rcu(link, &tk->tp.files, list)
__kretprobe_trace_func(tk, ri, regs, link->file);
}
+NOKPROBE_SYMBOL(kretprobe_trace_func);

/* Event entry printers */
static enum print_line_t
@@ -1131,7 +1140,7 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
#ifdef CONFIG_PERF_EVENTS

/* Kprobe profile handler */
-static __kprobes void
+static void
kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
struct ftrace_event_call *call = &tk->tp.call;
@@ -1158,9 +1167,10 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
store_trace_args(sizeof(*entry), &tk->tp, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
+NOKPROBE_SYMBOL(kprobe_perf_func);

/* Kretprobe profile handler */
-static __kprobes void
+static void
kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
@@ -1188,6 +1198,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
store_trace_args(sizeof(*entry), &tk->tp, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
+NOKPROBE_SYMBOL(kretprobe_perf_func);
#endif /* CONFIG_PERF_EVENTS */

/*
@@ -1223,8 +1234,7 @@ static int kprobe_register(struct ftrace_event_call *event,
return 0;
}

-static __kprobes
-int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
+static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
{
struct trace_kprobe *tk = container_of(kp, struct trace_kprobe, rp.kp);

@@ -1238,9 +1248,10 @@ int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
#endif
return 0; /* We don't tweek kernel, so just return 0 */
}
+NOKPROBE_SYMBOL(kprobe_dispatcher);

-static __kprobes
-int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+static int
+kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
struct trace_kprobe *tk = container_of(ri->rp, struct trace_kprobe, rp);

@@ -1254,6 +1265,7 @@ int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
#endif
return 0; /* We don't tweek kernel, so just return 0 */
}
+NOKPROBE_SYMBOL(kretprobe_dispatcher);

static struct trace_event_functions kretprobe_funcs = {
.trace = print_kretprobe_event
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index d3a91e4..d4b9fc2 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -37,13 +37,13 @@ const char *reserved_field_names[] = {

/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt) \
-__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
- const char *name, \
- void *data, void *ent) \
+int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, const char *name, \
+ void *data, void *ent) \
{ \
return trace_seq_printf(s, " %s=" fmt, name, *(type *)data); \
} \
-const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
+const char PRINT_TYPE_FMT_NAME(type)[] = fmt; \
+NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(type));

DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
@@ -55,9 +55,8 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")

/* Print type function for string type */
-__kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
- const char *name,
- void *data, void *ent)
+int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, const char *name,
+ void *data, void *ent)
{
int len = *(u32 *)data >> 16;

@@ -67,6 +66,7 @@ __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
return trace_seq_printf(s, " %s=\"%s\"", name,
(const char *)get_loc_data(data, ent));
}
+NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(string));

const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";

@@ -81,23 +81,24 @@ const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";

/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
-__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
- void *offset, void *dest) \
+void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
(unsigned int)((unsigned long)offset)); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(reg, type));
DEFINE_BASIC_FETCH_FUNCS(reg)
/* No string on the register */
#define fetch_reg_string NULL
#define fetch_reg_string_size NULL

#define DEFINE_FETCH_retval(type) \
-__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
- void *dummy, void *dest) \
+void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
+ void *dummy, void *dest) \
{ \
*(type *)dest = (type)regs_return_value(regs); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(retval, type));
DEFINE_BASIC_FETCH_FUNCS(retval)
/* No string on the retval */
#define fetch_retval_string NULL
@@ -112,8 +113,8 @@ struct deref_fetch_param {
};

#define DEFINE_FETCH_deref(type) \
-__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
+ void *data, void *dest) \
{ \
struct deref_fetch_param *dprm = data; \
unsigned long addr; \
@@ -123,12 +124,13 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
dprm->fetch(regs, (void *)addr, dest); \
} else \
*(type *)dest = 0; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(deref, type));
DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)

-__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
- void *data, void *dest)
+void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
+ void *data, void *dest)
{
struct deref_fetch_param *dprm = data;
unsigned long addr;
@@ -140,16 +142,18 @@ __kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
} else
*(string_size *)dest = 0;
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(deref, string_size));

-static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
+static void update_deref_fetch_param(struct deref_fetch_param *data)
{
if (CHECK_FETCH_FUNCS(deref, data->orig.fn))
update_deref_fetch_param(data->orig.data);
else if (CHECK_FETCH_FUNCS(symbol, data->orig.fn))
update_symbol_cache(data->orig.data);
}
+NOKPROBE_SYMBOL(update_deref_fetch_param);

-static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data)
+static void free_deref_fetch_param(struct deref_fetch_param *data)
{
if (CHECK_FETCH_FUNCS(deref, data->orig.fn))
free_deref_fetch_param(data->orig.data);
@@ -157,6 +161,7 @@ static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data)
free_symbol_cache(data->orig.data);
kfree(data);
}
+NOKPROBE_SYMBOL(free_deref_fetch_param);

/* Bitfield fetch function */
struct bitfield_fetch_param {
@@ -166,8 +171,8 @@ struct bitfield_fetch_param {
};

#define DEFINE_FETCH_bitfield(type) \
-__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
+ void *data, void *dest) \
{ \
struct bitfield_fetch_param *bprm = data; \
type buf = 0; \
@@ -177,8 +182,8 @@ __kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
buf >>= bprm->low_shift; \
} \
*(type *)dest = buf; \
-}
-
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(bitfield, type));
DEFINE_BASIC_FETCH_FUNCS(bitfield)
#define fetch_bitfield_string NULL
#define fetch_bitfield_string_size NULL
@@ -255,17 +260,17 @@ fail:
}

/* Special function : only accept unsigned long */
-static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
- void *dummy, void *dest)
+static void fetch_kernel_stack_address(struct pt_regs *regs, void *dummy, void *dest)
{
*(unsigned long *)dest = kernel_stack_pointer(regs);
}
+NOKPROBE_SYMBOL(fetch_kernel_stack_address);

-static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
- void *dummy, void *dest)
+static void fetch_user_stack_address(struct pt_regs *regs, void *dummy, void *dest)
{
*(unsigned long *)dest = user_stack_pointer(regs);
}
+NOKPROBE_SYMBOL(fetch_user_stack_address);

static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn,
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index fb1ab5d..4f815fb 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -81,13 +81,13 @@
*/
#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))

-static inline void *get_rloc_data(u32 *dl)
+static nokprobe_inline void *get_rloc_data(u32 *dl)
{
return (u8 *)dl + get_rloc_offs(*dl);
}

/* For data_loc conversion */
-static inline void *get_loc_data(u32 *dl, void *ent)
+static nokprobe_inline void *get_loc_data(u32 *dl, void *ent)
{
return (u8 *)ent + get_rloc_offs(*dl);
}
@@ -136,9 +136,8 @@ typedef u32 string_size;

/* Printing in basic type function template */
#define DECLARE_BASIC_PRINT_TYPE_FUNC(type) \
-__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
- const char *name, \
- void *data, void *ent); \
+int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, const char *name, \
+ void *data, void *ent); \
extern const char PRINT_TYPE_FMT_NAME(type)[]

DECLARE_BASIC_PRINT_TYPE_FUNC(u8);
@@ -303,7 +302,7 @@ static inline bool trace_probe_is_registered(struct trace_probe *tp)
return !!(tp->flags & TP_FLAG_REGISTERED);
}

-static inline __kprobes void call_fetch(struct fetch_param *fprm,
+static nokprobe_inline void call_fetch(struct fetch_param *fprm,
struct pt_regs *regs, void *dest)
{
return fprm->fn(regs, fprm->data, dest);
@@ -351,7 +350,7 @@ extern ssize_t traceprobe_probes_write(struct file *file,
extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));

/* Sum up total data length for dynamic arraies (strings) */
-static inline __kprobes int
+static nokprobe_inline int
__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
{
int i, ret = 0;
@@ -367,7 +366,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
}

/* Store the value of each argument */
-static inline __kprobes void
+static nokprobe_inline void
store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
u8 *data, int maxlen)
{

Subject: [PATCH -tip v9 18/26] sched: Use NOKPROBE_SYMBOL macro in sched

Use the NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of the __kprobes annotation in sched/core.c.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
kernel/sched/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ce9b306..86e5778 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2477,7 +2477,7 @@ notrace unsigned long get_parent_ip(unsigned long addr)
#if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
defined(CONFIG_PREEMPT_TRACER))

-void __kprobes preempt_count_add(int val)
+void preempt_count_add(int val)
{
#ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2503,8 +2503,9 @@ void __kprobes preempt_count_add(int val)
}
}
EXPORT_SYMBOL(preempt_count_add);
+NOKPROBE_SYMBOL(preempt_count_add);

-void __kprobes preempt_count_sub(int val)
+void preempt_count_sub(int val)
{
#ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2525,6 +2526,7 @@ void __kprobes preempt_count_sub(int val)
__preempt_count_sub(val);
}
EXPORT_SYMBOL(preempt_count_sub);
+NOKPROBE_SYMBOL(preempt_count_sub);

#endif


Subject: [PATCH -tip v9 19/26] kprobes: Show blacklist entries via debugfs

Show blacklist entries (function names with their address
ranges) via /sys/kernel/debug/kprobes/blacklist.

Note that at this point the blacklist only covers symbols
in vmlinux, not in modules, so the list is fixed and is not
updated at runtime.
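
For reference, each line of the new file is expected to show the
address range and the symbol of one blacklisted function (the
addresses below are made-up examples, not real output):
----
# cat /sys/kernel/debug/kprobes/blacklist
0xffffffff815f0000-0xffffffff815f0120	do_int3
0xffffffff815f0120-0xffffffff815f0350	do_debug
...
----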

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
---
kernel/kprobes.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a21b4e6..3214289 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2249,6 +2249,46 @@ static const struct file_operations debugfs_kprobes_operations = {
.release = seq_release,
};

+/* kprobes/blacklist -- shows which functions can not be probed */
+static void *kprobe_blacklist_seq_start(struct seq_file *m, loff_t *pos)
+{
+ return seq_list_start(&kprobe_blacklist, *pos);
+}
+
+static void *kprobe_blacklist_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ return seq_list_next(v, &kprobe_blacklist, pos);
+}
+
+static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
+{
+ struct kprobe_blacklist_entry *ent =
+ list_entry(v, struct kprobe_blacklist_entry, list);
+
+ seq_printf(m, "0x%p-0x%p\t%ps\n", (void *)ent->start_addr,
+ (void *)ent->end_addr, (void *)ent->start_addr);
+ return 0;
+}
+
+static const struct seq_operations kprobe_blacklist_seq_ops = {
+ .start = kprobe_blacklist_seq_start,
+ .next = kprobe_blacklist_seq_next,
+ .stop = kprobe_seq_stop, /* Reuse void function */
+ .show = kprobe_blacklist_seq_show,
+};
+
+static int kprobe_blacklist_open(struct inode *inode, struct file *filp)
+{
+ return seq_open(filp, &kprobe_blacklist_seq_ops);
+}
+
+static const struct file_operations debugfs_kprobe_blacklist_ops = {
+ .open = kprobe_blacklist_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
static void arm_all_kprobes(void)
{
struct hlist_head *head;
@@ -2372,19 +2412,24 @@ static int __init debugfs_kprobe_init(void)

file = debugfs_create_file("list", 0444, dir, NULL,
&debugfs_kprobes_operations);
- if (!file) {
- debugfs_remove(dir);
- return -ENOMEM;
- }
+ if (!file)
+ goto error;

file = debugfs_create_file("enabled", 0600, dir,
&value, &fops_kp);
- if (!file) {
- debugfs_remove(dir);
- return -ENOMEM;
- }
+ if (!file)
+ goto error;
+
+ file = debugfs_create_file("blacklist", 0444, dir, NULL,
+ &debugfs_kprobe_blacklist_ops);
+ if (!file)
+ goto error;

return 0;
+
+error:
+ debugfs_remove(dir);
+ return -ENOMEM;
}

late_initcall(debugfs_kprobe_init);

Subject: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module

To blacklist functions in a module (e.g. a user-defined
kprobe handler and the functions invoked from it), expand
the blacklist support to modules.
With this change, users can use the NOKPROBE_SYMBOL() macro
in their own modules, as sketched below.
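
A minimal sketch of what this enables for an out-of-tree module
(the probed symbol and all names here are only for illustration):
----
#include <linux/module.h>
#include <linux/kprobes.h>

static int my_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
        /* must not be probed itself, hence the NOKPROBE_SYMBOL below */
        return 0;
}
NOKPROBE_SYMBOL(my_pre_handler);

static struct kprobe my_kp = {
        .symbol_name = "do_fork",
        .pre_handler = my_pre_handler,
};

static int __init my_probe_init(void)
{
        return register_kprobe(&my_kp);
}

static void __exit my_probe_exit(void)
{
        unregister_kprobe(&my_kp);
}

module_init(my_probe_init);
module_exit(my_probe_exit);
MODULE_LICENSE("GPL");
----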

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Rob Landley <[email protected]>
Cc: Rusty Russell <[email protected]>
---
Documentation/kprobes.txt | 8 ++++++
include/linux/module.h | 5 ++++
kernel/kprobes.c | 63 ++++++++++++++++++++++++++++++++++++++-------
kernel/module.c | 6 ++++
4 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 4bbeca8..2845956 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -512,6 +512,14 @@ int enable_jprobe(struct jprobe *jp);
Enables *probe which has been disabled by disable_*probe(). You must specify
the probe which has been registered.

+4.9 NOKPROBE_SYMBOL()
+
+#include <linux/kprobes.h>
+NOKPROBE_SYMBOL(FUNCTION);
+
+Protects given FUNCTION from other kprobes. This is useful for handler
+functions and functions called from the handlers.
+
5. Kprobes Features and Limitations

Kprobes allows multiple probes at the same address. Currently,
diff --git a/include/linux/module.h b/include/linux/module.h
index f520a76..2fdb673 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -16,6 +16,7 @@
#include <linux/kobject.h>
#include <linux/moduleparam.h>
#include <linux/jump_label.h>
+#include <linux/kprobes.h>
#include <linux/export.h>

#include <linux/percpu.h>
@@ -357,6 +358,10 @@ struct module {
unsigned int num_ftrace_callsites;
unsigned long *ftrace_callsites;
#endif
+#ifdef CONFIG_KPROBES
+ unsigned int num_kprobe_blacklist;
+ unsigned long *kprobe_blacklist;
+#endif

#ifdef CONFIG_MODULE_UNLOAD
/* What modules depend on me? */
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 3214289..8319048 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -88,6 +88,7 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)

/* Blacklist -- list of struct kprobe_blacklist_entry */
static LIST_HEAD(kprobe_blacklist);
+static DEFINE_MUTEX(kprobe_blacklist_mutex);

#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
/*
@@ -1331,22 +1332,27 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
addr < (unsigned long)__kprobes_text_end;
}

-static bool within_kprobe_blacklist(unsigned long addr)
+static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)
{
struct kprobe_blacklist_entry *ent;

+ list_for_each_entry(ent, &kprobe_blacklist, list) {
+ if (addr >= ent->start_addr && addr < ent->end_addr)
+ return ent;
+ }
+
+ return NULL;
+}
+
+static bool within_kprobe_blacklist(unsigned long addr)
+{
if (arch_within_kprobe_blacklist(addr))
return true;
/*
* If there exists a kprobe_blacklist, verify and
* fail any probe registration in the prohibited area
*/
- list_for_each_entry(ent, &kprobe_blacklist, list) {
- if (addr >= ent->start_addr && addr < ent->end_addr)
- return true;
- }
-
- return false;
+ return !!find_blacklist_entry(addr);
}

/*
@@ -1432,6 +1438,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
#endif
}

+ mutex_lock(&kprobe_blacklist_mutex);
jump_label_lock();
preempt_disable();

@@ -1469,6 +1476,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
out:
preempt_enable();
jump_label_unlock();
+ mutex_unlock(&kprobe_blacklist_mutex);

return ret;
}
@@ -2032,13 +2040,13 @@ NOKPROBE_SYMBOL(dump_kprobe);
* since a kprobe need not necessarily be at the beginning
* of a function.
*/
-static int __init populate_kprobe_blacklist(unsigned long *start,
- unsigned long *end)
+static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
{
unsigned long *iter;
struct kprobe_blacklist_entry *ent;
unsigned long offset = 0, size = 0;

+ mutex_lock(&kprobe_blacklist_mutex);
for (iter = start; iter < end; iter++) {
if (!kallsyms_lookup_size_offset(*iter, &size, &offset)) {
pr_err("Failed to find blacklist %p\n", (void *)*iter);
@@ -2053,9 +2061,28 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
INIT_LIST_HEAD(&ent->list);
list_add_tail(&ent->list, &kprobe_blacklist);
}
+ mutex_unlock(&kprobe_blacklist_mutex);
+
return 0;
}

+/* Shrink the blacklist */
+static void shrink_kprobe_blacklist(unsigned long *start, unsigned long *end)
+{
+ struct kprobe_blacklist_entry *ent;
+ unsigned long *iter;
+
+ mutex_lock(&kprobe_blacklist_mutex);
+ for (iter = start; iter < end; iter++) {
+ ent = find_blacklist_entry(*iter);
+ if (!ent)
+ continue;
+ list_del(&ent->list);
+ kfree(ent);
+ }
+ mutex_unlock(&kprobe_blacklist_mutex);
+}
+
/* Module notifier call back, checking kprobes on the module */
static int kprobes_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
@@ -2066,6 +2093,16 @@ static int kprobes_module_callback(struct notifier_block *nb,
unsigned int i;
int checkcore = (val == MODULE_STATE_GOING);

+ /* Add/remove module blacklist */
+ if (val == MODULE_STATE_COMING)
+ populate_kprobe_blacklist(mod->kprobe_blacklist,
+ mod->kprobe_blacklist +
+ mod->num_kprobe_blacklist);
+ else if (val == MODULE_STATE_GOING)
+ shrink_kprobe_blacklist(mod->kprobe_blacklist,
+ mod->kprobe_blacklist +
+ mod->num_kprobe_blacklist);
+
if (val != MODULE_STATE_GOING && val != MODULE_STATE_LIVE)
return NOTIFY_DONE;

@@ -2252,6 +2289,7 @@ static const struct file_operations debugfs_kprobes_operations = {
/* kprobes/blacklist -- shows which functions can not be probed */
static void *kprobe_blacklist_seq_start(struct seq_file *m, loff_t *pos)
{
+ mutex_lock(&kprobe_blacklist_mutex);
return seq_list_start(&kprobe_blacklist, *pos);
}

@@ -2260,6 +2298,11 @@ static void *kprobe_blacklist_seq_next(struct seq_file *m, void *v, loff_t *pos)
return seq_list_next(v, &kprobe_blacklist, pos);
}

+static void kprobe_blacklist_seq_stop(struct seq_file *m, void *v)
+{
+ mutex_unlock(&kprobe_blacklist_mutex);
+}
+
static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
{
struct kprobe_blacklist_entry *ent =
@@ -2273,7 +2316,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
static const struct seq_operations kprobe_blacklist_seq_ops = {
.start = kprobe_blacklist_seq_start,
.next = kprobe_blacklist_seq_next,
- .stop = kprobe_seq_stop, /* Reuse void function */
+ .stop = kprobe_blacklist_seq_stop,
.show = kprobe_blacklist_seq_show,
};

diff --git a/kernel/module.c b/kernel/module.c
index 1186940..895c635 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -58,6 +58,7 @@
#include <linux/percpu.h>
#include <linux/kmemleak.h>
#include <linux/jump_label.h>
+#include <linux/kprobes.h>
#include <linux/pfn.h>
#include <linux/bsearch.h>
#include <linux/fips.h>
@@ -2772,6 +2773,11 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(*mod->ftrace_callsites),
&mod->num_ftrace_callsites);
#endif
+#ifdef CONFIG_KPROBES
+ mod->kprobe_blacklist = section_objs(info, "_kprobe_blacklist",
+ sizeof(*mod->kprobe_blacklist),
+ &mod->num_kprobe_blacklist);
+#endif

mod->extable = section_objs(info, "__ex_table",
sizeof(*mod->extable), &mod->num_exentries);

Subject: [PATCH -tip v9 22/26] kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text

Use kprobe_blacklist for blacklisting .entry.text and .kprobes.text
instead of arch_within_kprobe_blacklist. This also makes them visible
via (debugfs)/kprobes/blacklist.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 11 +------
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 64 ++++++++++++++++++++++++++++------------
3 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 1a92130..2fb3d3e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1068,17 +1068,10 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
}
NOKPROBE_SYMBOL(longjmp_break_handler);

-bool arch_within_kprobe_blacklist(unsigned long addr)
-{
- return (addr >= (unsigned long)__kprobes_text_start &&
- addr < (unsigned long)__kprobes_text_end) ||
- (addr >= (unsigned long)__entry_text_start &&
- addr < (unsigned long)__entry_text_end);
-}
-
int __init arch_init_kprobes(void)
{
- return 0;
+ return kprobe_blacklist_add_range((unsigned long)__entry_text_start,
+ (unsigned long) __entry_text_end);
}

int arch_trampoline_kprobe(struct kprobe *p)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e059507..e81bced 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -266,6 +266,7 @@ extern int arch_init_kprobes(void);
extern void show_registers(struct pt_regs *regs);
extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
+extern int kprobe_blacklist_add_range(unsigned long start, unsigned long end);

struct kprobe_insn_cache {
struct mutex mutex;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 8319048..abdede5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1325,13 +1325,6 @@ out:
return ret;
}

-bool __weak arch_within_kprobe_blacklist(unsigned long addr)
-{
- /* The __kprobes marked functions and entry code must not be probed */
- return addr >= (unsigned long)__kprobes_text_start &&
- addr < (unsigned long)__kprobes_text_end;
-}
-
static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)
{
struct kprobe_blacklist_entry *ent;
@@ -1346,8 +1339,6 @@ static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)

static bool within_kprobe_blacklist(unsigned long addr)
{
- if (arch_within_kprobe_blacklist(addr))
- return true;
/*
* If there exists a kprobe_blacklist, verify and
* fail any probe registration in the prohibited area
@@ -2032,6 +2023,40 @@ void dump_kprobe(struct kprobe *kp)
}
NOKPROBE_SYMBOL(dump_kprobe);

+static int __kprobe_blacklist_add(unsigned long start, unsigned long end)
+{
+ struct kprobe_blacklist_entry *ent;
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent)
+ return -ENOMEM;
+
+ ent->start_addr = start;
+ ent->end_addr = end;
+ INIT_LIST_HEAD(&ent->list);
+ list_add_tail(&ent->list, &kprobe_blacklist);
+ return 0;
+}
+
+int kprobe_blacklist_add_range(unsigned long start, unsigned long end)
+{
+ unsigned long offset = 0, size = 0;
+ int err = 0;
+
+ mutex_lock(&kprobe_blacklist_mutex);
+ while (!err && start < end) {
+ if (!kallsyms_lookup_size_offset(start, &size, &offset) ||
+ size == 0) {
+ err = -ENOENT;
+ break;
+ }
+ err = __kprobe_blacklist_add(start, start + size);
+ start += size;
+ }
+ mutex_unlock(&kprobe_blacklist_mutex);
+ return err;
+}
+
/*
* Lookup and populate the kprobe_blacklist.
*
@@ -2043,8 +2068,8 @@ NOKPROBE_SYMBOL(dump_kprobe);
static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
{
unsigned long *iter;
- struct kprobe_blacklist_entry *ent;
unsigned long offset = 0, size = 0;
+ int ret;

mutex_lock(&kprobe_blacklist_mutex);
for (iter = start; iter < end; iter++) {
@@ -2052,14 +2077,7 @@ static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
pr_err("Failed to find blacklist %p\n", (void *)*iter);
continue;
}
-
- ent = kmalloc(sizeof(*ent), GFP_KERNEL);
- if (!ent)
- return -ENOMEM;
- ent->start_addr = *iter;
- ent->end_addr = *iter + size;
- INIT_LIST_HEAD(&ent->list);
- list_add_tail(&ent->list, &kprobe_blacklist);
+ ret = __kprobe_blacklist_add(*iter, *iter + size);
}
mutex_unlock(&kprobe_blacklist_mutex);

@@ -2154,7 +2172,15 @@ static int __init init_kprobes(void)

err = populate_kprobe_blacklist(__start_kprobe_blacklist,
__stop_kprobe_blacklist);
- if (err) {
+
+ if (err >= 0 && __kprobes_text_start != __kprobes_text_end) {
+ /* The __kprobes marked functions must not be probed */
+ err = kprobe_blacklist_add_range(
+ (unsigned long)__kprobes_text_start,
+ (unsigned long)__kprobes_text_end);
+ }
+
+ if (err < 0) {
pr_err("kprobes: failed to populate blacklist: %d\n", err);
pr_err("Please take care of using kprobes.\n");
}

Subject: [PATCH -tip v9 24/26] kprobes: Enlarge hash table to 512 entries

Currently, since kprobes is expected to be used with fewer than
100 probe points, its hash table has just 64 entries. This is
too small to handle several thousand probes.
Enlarge kprobe_table to 512 entries, which consumes just 4KB
(on a 64bit arch), for better scalability.
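
(For reference, 2^9 = 512 hlist heads at 8 bytes per struct
hlist_head on a 64bit arch gives the 4KB quoted above.)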

Note that this also decouples the hash sizes of kprobe_table and
kretprobe_inst_table/locks, since they use different hash sources
(kprobes hash the probed address, whereas kretprobes hash the
task structure).

Without this patch, enabling 17787 probes takes more than 2 hours!
(9428 sec, 1 min intervals for each 2000 probes enabled)

Enabling trace events: start at 1392782584
0 1392782585 a2mp_chan_alloc_skb_cb_38556
1 1392782585 a2mp_chan_close_cb_38555
....
17785 1392792008 lookup_vport_34987
17786 1392792010 loop_add_23485
17787 1392792012 loop_attr_do_show_autoclear_23464

I profiled it and saw that more than 90% of the cycles were
consumed in get_kprobe.

Samples: 18K of event 'cycles', Event count (approx.): 37759714934
+ 95.90% [k] get_kprobe
+ 0.76% [k] ftrace_lookup_ip
+ 0.54% [k] kprobe_trace_func

More than 60% of the executed instructions were in get_kprobe
as well.

Samples: 17K of event 'instructions', Event count (approx.): 1321391290
+ 65.48% [k] get_kprobe
+ 4.07% [k] kprobe_trace_func
+ 2.93% [k] optimized_callback


Annotating get_kprobe also shows that the hlist chains are too long
and that walking them takes time. Thus I guess the cost mostly comes
from walking an overly long hlist at the table entry.

| struct hlist_head *head;
| struct kprobe *p;
|
| head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
| hlist_for_each_entry_rcu(p, head, hlist) {
86.33 | mov (%rax),%rax
11.24 | test %rax,%rax
| jne 60
| if (p->addr == addr)
| return p;
| }

To fix this issue, I tried enlarging kprobe_table by varying its
hash bits from 6 (current) to 12, and profiled the cycles% spent
in get_kprobe with 10k probes enabled / 37k probes registered.

Size Cycles%
2^6 95.58%
2^7 85.83%
2^8 68.43%
2^9 48.61%
2^10 46.95%
2^11 48.46%
2^12 56.95%

Here, we can see that hash tables larger than 2^9 (512 entries)
bring little further performance improvement. So I decided to
enlarge the table to 512 entries.

With this fix, enabling 20,000 probes just took
about 45 min (2934 sec, 1 min intervals for
each 2000 probes enabled)

Enabling trace events: start at 1393921862
0 1393921864 a2mp_chan_alloc_skb_cb_38581
1 1393921864 a2mp_chan_close_cb_38580
...
19997 1393924927 nfs4_match_stateid_11750
19998 1393924927 nfs4_negotiate_security_12137
19999 1393924928 nfs4_open_confirm_done_11785

And it reduced the cycles spent in get_kprobe (with 20,000 probes).

Samples: 691 of event 'cycles', Event count (approx.): 1743375714
+ 67.68% [k] get_kprobe
+ 5.98% [k] ftrace_lookup_ip
+ 4.03% [k] kprobe_trace_func

Changes from v7:
- Evaluate the size of hash table and reduce it to 512 entries.
- Decouple the hash size of kprobe_table from others.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
kernel/kprobes.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index abdede5..a29e622 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -54,9 +54,10 @@
#include <asm/errno.h>
#include <asm/uaccess.h>

-#define KPROBE_HASH_BITS 6
+#define KPROBE_HASH_BITS 9
+#define KRETPROBE_HASH_BITS 6
#define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
-
+#define KRETPROBE_TABLE_SIZE (1 << KRETPROBE_HASH_BITS)

/*
* Some oddball architectures like 64bit powerpc have function descriptors
@@ -69,7 +70,7 @@

static int kprobes_initialized;
static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
-static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
+static struct hlist_head kretprobe_inst_table[KRETPROBE_TABLE_SIZE];

/* NOTE: change this value only with kprobe_mutex held */
static bool kprobes_all_disarmed;
@@ -79,7 +80,7 @@ static DEFINE_MUTEX(kprobe_mutex);
static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
static struct {
raw_spinlock_t lock ____cacheline_aligned_in_smp;
-} kretprobe_table_locks[KPROBE_TABLE_SIZE];
+} kretprobe_table_locks[KRETPROBE_TABLE_SIZE];

static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
{
@@ -1096,7 +1097,7 @@ void kretprobe_hash_lock(struct task_struct *tsk,
struct hlist_head **head, unsigned long *flags)
__acquires(hlist_lock)
{
- unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
+ unsigned long hash = hash_ptr(tsk, KRETPROBE_HASH_BITS);
raw_spinlock_t *hlist_lock;

*head = &kretprobe_inst_table[hash];
@@ -1118,7 +1119,7 @@ void kretprobe_hash_unlock(struct task_struct *tsk,
unsigned long *flags)
__releases(hlist_lock)
{
- unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
+ unsigned long hash = hash_ptr(tsk, KRETPROBE_HASH_BITS);
raw_spinlock_t *hlist_lock;

hlist_lock = kretprobe_table_lock_ptr(hash);
@@ -1153,7 +1154,7 @@ void kprobe_flush_task(struct task_struct *tk)
return;

INIT_HLIST_HEAD(&empty_rp);
- hash = hash_ptr(tk, KPROBE_HASH_BITS);
+ hash = hash_ptr(tk, KRETPROBE_HASH_BITS);
head = &kretprobe_inst_table[hash];
kretprobe_table_lock(hash, &flags);
hlist_for_each_entry_safe(ri, tmp, head, hlist) {
@@ -1187,7 +1188,7 @@ static void cleanup_rp_inst(struct kretprobe *rp)
struct hlist_head *head;

/* No race here */
- for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
+ for (hash = 0; hash < KRETPROBE_TABLE_SIZE; hash++) {
kretprobe_table_lock(hash, &flags);
head = &kretprobe_inst_table[hash];
hlist_for_each_entry_safe(ri, next, head, hlist) {
@@ -1778,7 +1779,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
struct kretprobe_instance *ri;

/*TODO: consider to only swap the RA after the last pre_handler fired */
- hash = hash_ptr(current, KPROBE_HASH_BITS);
+ hash = hash_ptr(current, KRETPROBE_HASH_BITS);
raw_spin_lock_irqsave(&rp->lock, flags);
if (!hlist_empty(&rp->free_instances)) {
ri = hlist_entry(rp->free_instances.first,
@@ -2164,8 +2165,10 @@ static int __init init_kprobes(void)

/* FIXME allocate the probe table, currently defined statically */
/* initialize all list heads */
- for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
+ for (i = 0; i < KPROBE_TABLE_SIZE; i++)
INIT_HLIST_HEAD(&kprobe_table[i]);
+
+ for (i = 0; i < KRETPROBE_TABLE_SIZE; i++) {
INIT_HLIST_HEAD(&kretprobe_inst_table[i]);
raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
}

Subject: [PATCH -tip v9 26/26] ftrace: Introduce FTRACE_OPS_FL_SELF_FILTER for ftrace-kprobe

Since kprobes itself owns a hash table for looking up the kprobe
data structure corresponding to a given ip address, there is no
need to test the ftrace hash on the ftrace side.
To achieve better performance for ftrace-based kprobes, add an
FTRACE_OPS_FL_SELF_FILTER flag to ftrace_ops, which means that
ftrace skips testing its own hash table for such an ops.
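
A hypothetical ftrace_ops user outside of kprobes would set the
flag roughly like this (names and the target symbol are made up;
the real kprobes change is in the diff below). Note that such a
callback may now be invoked for ips enabled by other ftrace users,
so it must be able to reject unknown ips by itself:
----
#include <linux/ftrace.h>
#include <linux/slab.h>		/* kfree, used only as an example target */

static void my_tracer(unsigned long ip, unsigned long parent_ip,
                      struct ftrace_ops *op, struct pt_regs *regs)
{
        /* does its own ip -> data lookup; unknown ips are ignored */
}

static struct ftrace_ops my_ops = {
        .func  = my_tracer,
        .flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_SELF_FILTER,
};

static int __init my_tracer_init(void)
{
        int ret;

        /* the filter hash still decides which call sites are patched */
        ret = ftrace_set_filter_ip(&my_ops, (unsigned long)kfree, 0, 0);
        if (ret)
                return ret;
        return register_ftrace_function(&my_ops);
}
----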

Without this patch, ftrace_lookup_ip() is the biggest cycles
consumer when 20,000 kprobes are enabled.
----
Samples: 1K of event 'cycles', Event count (approx.): 340068894
+ 20.77% [k] ftrace_lookup_ip
+ 8.33% [k] kprobe_trace_func
+ 4.83% [k] get_kprobe_cached
----

With this patch, ftrace_lookup_ip() vanishes from the list of
cycle consumers (of course, there are no callers on the hotpath
anymore :))
----
Samples: 1K of event 'cycles', Event count (approx.): 186861492
+ 9.95% [k] kprobe_trace_func
+ 6.00% [k] kprobe_ftrace_handler
+ 5.53% [k] get_kprobe_cached
----

Changes from v7:
- Re-evaluate the performance improvement.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
include/linux/ftrace.h | 3 +++
kernel/kprobes.c | 2 +-
kernel/trace/ftrace.c | 3 ++-
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9212b01..36e4cb5 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -93,6 +93,8 @@ typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
* INITIALIZED - The ftrace_ops has already been initialized (first use time
* register_ftrace_function() is called, it will initialized the ops)
* DELETED - The ops are being deleted, do not let them be registered again.
+ * SELF_FILTER - The ftrace_ops function filters ip by itself. Do not need to
+ * check hash table on each hit.
*/
enum {
FTRACE_OPS_FL_ENABLED = 1 << 0,
@@ -105,6 +107,7 @@ enum {
FTRACE_OPS_FL_STUB = 1 << 7,
FTRACE_OPS_FL_INITIALIZED = 1 << 8,
FTRACE_OPS_FL_DELETED = 1 << 9,
+ FTRACE_OPS_FL_SELF_FILTER = 1 << 10,
};

/*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 465e912..af1ff6a 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1020,7 +1020,7 @@ static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
#ifdef CONFIG_KPROBES_ON_FTRACE
static struct ftrace_ops kprobe_ftrace_ops __read_mostly = {
.func = kprobe_ftrace_handler,
- .flags = FTRACE_OPS_FL_SAVE_REGS,
+ .flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_SELF_FILTER,
};
static int kprobe_ftrace_enabled;

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 1fd4b94..4235851 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4520,7 +4520,8 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
*/
preempt_disable_notrace();
do_for_each_ftrace_op(op, ftrace_ops_list) {
- if (ftrace_ops_test(op, ip, regs))
+ if (op->flags & FTRACE_OPS_FL_SELF_FILTER ||
+ ftrace_ops_test(op, ip, regs))
op->func(ip, parent_ip, op, regs);
} while_for_each_ftrace_op(op);
preempt_enable_notrace();

Subject: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits

Introduce a kprobe cache to reduce cache misses with
massive numbers of kprobes.
For stress testing kprobes, we need to activate as many
kprobes as possible. This causes a storm of cache misses
on the kprobe hash list. The kprobe hash list has already
been enlarged to 4k entries, and this is still small for
40k kprobes.

For example, when registering 40k probes on the hlist and
enabling 20k of them, perf still shows a lot of cache-misses
in get_kprobe.
----
Samples: 633 of event 'cache-misses', Event count (approx.): 3414776
+ 68.13% [k] get_kprobe
+ 4.38% [k] ftrace_lookup_ip
+ 2.54% [k] kprobe_ftrace_handler
----

Also, I found that most of the kprobes are never hit.
In that case, to reduce cache misses, we can reduce the
random memory accesses by introducing a per-cpu cache that
caches the addresses of frequently used kprobe data
structures along with their probe addresses.

With kpcache enabled, get_kprobe_cached goes down to
around 4-5% of the cache-misses with 20k probes.
----
Samples: 729 of event 'cache-misses', Event count (approx.): 690125
+ 14.49% [k] ftrace_lookup_ip
+ 5.61% [k] kprobe_trace_func
+ 5.17% [k] kprobe_ftrace_handler
+ 4.62% [k] get_kprobe_cached
----

Of course this reduces the enabling time too.

Without this fix (just enlarging the hash table):
(2934 sec, 1 min intervals for each 2000 probes enabled)

----
Enabling trace events: start at 1393921862
0 1393921864 a2mp_chan_alloc_skb_cb_38581
...
19999 1393924928 nfs4_open_confirm_done_11785
----

With this fix:
(2025 sec, 1 min intervals for each 2000 probes enabled)
----
Enabling trace events: start at 1393912623
0 1393912625 a2mp_chan_alloc_skb_cb_38800
....
19999 1393914648 nfs2_xdr_dec_readlinkres_11628
----

This patch implements a simple per-cpu 4-way/512-entry cache
for the kprobes hlist. Every get_kprobe on the hot path uses
the cache, and on a cache miss it searches the kprobes hlist
and inserts the found kprobe into the cache entry.
When removing kprobes, it clears the cache entries by using
an IPI, because the cache is per-cpu.

Note that this consumes some memory (34KB/cpu) just for
kprobes, and it is only useful for users who enable thousands
of probes at a time, e.g. for kprobe stress testing. Thus I've
added the CONFIG_KPROBE_CACHE option for this feature. If you
aren't interested in stress testing, you should set
CONFIG_KPROBE_CACHE=n.
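
(For reference, the 34KB/cpu figure matches the structures below:
512 sets x 4 ways x 16 bytes per kprobe_cache_entry = 32KB for the
table, plus a 512-entry int index array = 2KB, on a 64bit arch.)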

Changes from v7:
- Re-evaluate the performance improvements with 512 entry size.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
arch/Kconfig | 10 +++
arch/x86/kernel/kprobes/core.c | 2 -
arch/x86/kernel/kprobes/ftrace.c | 2 -
include/linux/kprobes.h | 1
kernel/kprobes.c | 125 +++++++++++++++++++++++++++++++++++---
5 files changed, 128 insertions(+), 12 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 97ff872..080ddb3 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -46,6 +46,16 @@ config KPROBES
for kernel debugging, non-intrusive instrumentation and testing.
If in doubt, say "N".

+config KPROBE_CACHE
+ bool "Kprobe per-cpu cache for massive multiple probes"
+ depends on KPROBES
+ help
+ For handling massive multiple kprobes with better performance,
+ kprobe per-cpu cache is enabled by this option. This cache is
+ only for users who would like to use more than 10,000 probes
+ at a time, which is usually stress testing, debugging etc.
+ If in doubt, say "N".
+
config JUMP_LABEL
bool "Optimize very unlikely/likely branches"
depends on HAVE_ARCH_JUMP_LABEL
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 3a922b7..569ee2b 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -576,7 +576,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));

kcb = get_kprobe_ctlblk();
- p = get_kprobe(addr);
+ p = get_kprobe_cached(addr);

if (p) {
if (kprobe_running()) {
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 717b02a..8178dd4 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -63,7 +63,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
/* Disable irq for emulating a breakpoint and avoiding preempt */
local_irq_save(flags);

- p = get_kprobe((kprobe_opcode_t *)ip);
+ p = get_kprobe_cached((kprobe_opcode_t *)ip);
if (unlikely(!p) || kprobe_disabled(p))
goto end;

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e81bced..70c3314 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -339,6 +339,7 @@ extern int arch_prepare_kprobe_ftrace(struct kprobe *p);

/* Get the kprobe at this addr (if any) - called with preemption disabled */
struct kprobe *get_kprobe(void *addr);
+struct kprobe *get_kprobe_cached(void *addr);
void kretprobe_hash_lock(struct task_struct *tsk,
struct hlist_head **head, unsigned long *flags);
void kretprobe_hash_unlock(struct task_struct *tsk, unsigned long *flags);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a29e622..465e912 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -91,6 +91,84 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
static LIST_HEAD(kprobe_blacklist);
static DEFINE_MUTEX(kprobe_blacklist_mutex);

+#ifdef CONFIG_KPROBE_CACHE
+/* Kprobe cache */
+#define KPCACHE_BITS 2
+#define KPCACHE_SIZE (1 << KPCACHE_BITS)
+#define KPCACHE_INDEX(i) ((i) & (KPCACHE_SIZE - 1))
+
+struct kprobe_cache_entry {
+ unsigned long addr;
+ struct kprobe *kp;
+};
+
+struct kprobe_cache {
+ struct kprobe_cache_entry table[KPROBE_TABLE_SIZE][KPCACHE_SIZE];
+ int index[KPROBE_TABLE_SIZE];
+};
+
+static DEFINE_PER_CPU(struct kprobe_cache, kpcache);
+
+static inline
+struct kprobe *kpcache_get(unsigned long hash, unsigned long addr)
+{
+ struct kprobe_cache *cache = this_cpu_ptr(&kpcache);
+ struct kprobe_cache_entry *ent = &cache->table[hash][0];
+ struct kprobe *ret;
+ int idx = ACCESS_ONCE(cache->index[hash]);
+ int i;
+
+ for (i = 0; i < KPCACHE_SIZE; i++)
+ if (ent[i].addr == addr) {
+ ret = ent[i].kp;
+ /* Check the cache is updated */
+ if (unlikely(idx != cache->index[hash]))
+ break;
+ return ret;
+ }
+ return NULL;
+}
+
+static inline void kpcache_set(unsigned long hash, unsigned long addr,
+ struct kprobe *kp)
+{
+ struct kprobe_cache *cache = this_cpu_ptr(&kpcache);
+ struct kprobe_cache_entry *ent = &cache->table[hash][0];
+ int i = KPCACHE_INDEX(cache->index[hash]++);
+
+ /*
+ * Setting must be done in this order for avoiding interruption;
+ * (1)invalidate entry, (2)set the value, and (3)enable entry.
+ */
+ ent[i].addr = 0;
+ barrier();
+ ent[i].kp = kp;
+ barrier();
+ ent[i].addr = addr;
+}
+
+static void kpcache_invalidate_this_cpu(void *addr)
+{
+ unsigned long hash = hash_ptr(addr, KPROBE_HASH_BITS);
+ struct kprobe_cache *cache = this_cpu_ptr(&kpcache);
+ struct kprobe_cache_entry *ent = &cache->table[hash][0];
+ int i;
+
+ for (i = 0; i < KPCACHE_SIZE; i++)
+ if (ent[i].addr == (unsigned long)addr)
+ ent[i].addr = 0;
+}
+
+/* This must be called after ensuring the kprobe is removed from hlist */
+static void kpcache_invalidate(unsigned long addr)
+{
+ on_each_cpu(kpcache_invalidate_this_cpu, (void *)addr, 1);
+}
+#else
+#define kpcache_get(hash, addr) (NULL)
+#define kpcache_set(hash, addr, kp) do {} while (0)
+#define kpcache_invalidate(addr) do {} while (0)
+#endif
#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
/*
* kprobe->ainsn.insn points to the copy of the instruction to be
@@ -297,18 +375,13 @@ static inline void reset_kprobe_instance(void)
__this_cpu_write(kprobe_instance, NULL);
}

-/*
- * This routine is called either:
- * - under the kprobe_mutex - during kprobe_[un]register()
- * OR
- * - with preemption disabled - from arch/xxx/kernel/kprobes.c
- */
-struct kprobe *get_kprobe(void *addr)
+static nokprobe_inline
+struct kprobe *__get_kprobe(void *addr, unsigned long hash)
{
struct hlist_head *head;
struct kprobe *p;

- head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
+ head = &kprobe_table[hash];
hlist_for_each_entry_rcu(p, head, hlist) {
if (p->addr == addr)
return p;
@@ -316,8 +389,37 @@ struct kprobe *get_kprobe(void *addr)

return NULL;
}
+
+/*
+ * This routine is called either:
+ * - under the kprobe_mutex - during kprobe_[un]register()
+ * OR
+ * - with preemption disabled - from arch/xxx/kernel/kprobes.c
+ */
+struct kprobe *get_kprobe(void *addr)
+{
+ return __get_kprobe(addr, hash_ptr(addr, KPROBE_HASH_BITS));
+}
NOKPROBE_SYMBOL(get_kprobe);

+/* This is called with preemption disabled from arch-dependent functions */
+struct kprobe *get_kprobe_cached(void *addr)
+{
+ unsigned long hash = hash_ptr(addr, KPROBE_HASH_BITS);
+ struct kprobe *p;
+
+ p = kpcache_get(hash, (unsigned long)addr);
+ if (likely(p))
+ return p;
+
+ p = __get_kprobe(addr, hash);
+ if (likely(p))
+ kpcache_set(hash, (unsigned long)addr, p);
+ return p;
+}
+NOKPROBE_SYMBOL(get_kprobe_cached);
+
+
static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);

/* Return true if the kprobe is an aggregator */
@@ -518,6 +620,7 @@ static void do_free_cleaned_kprobes(void)
list_for_each_entry_safe(op, tmp, &freeing_list, list) {
BUG_ON(!kprobe_unused(&op->kp));
list_del_init(&op->list);
+ kpcache_invalidate((unsigned long)op->kp.addr);
free_aggr_kprobe(&op->kp);
}
}
@@ -1639,13 +1742,15 @@ static void __unregister_kprobe_bottom(struct kprobe *p)
{
struct kprobe *ap;

- if (list_empty(&p->list))
+ if (list_empty(&p->list)) {
/* This is an independent kprobe */
+ kpcache_invalidate((unsigned long)p->addr);
arch_remove_kprobe(p);
- else if (list_is_singular(&p->list)) {
+ } else if (list_is_singular(&p->list)) {
/* This is the last child of an aggrprobe */
ap = list_entry(p->list.next, struct kprobe, list);
list_del(&p->list);
+ kpcache_invalidate((unsigned long)p->addr);
free_aggr_kprobe(ap);
}
/* Otherwise, do nothing. */
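
For readers who want to experiment with the replacement scheme outside
the kernel, below is a minimal, single-threaded userspace sketch of the
bucket-local logic in kpcache_set()/kpcache_get() above. It models only
one hash bucket, with a string payload standing in for struct kprobe *;
the per-cpu data, barrier()/ACCESS_ONCE() ordering and the invalidation
IPIs of the real code are intentionally left out.
----
/* Toy model of the KPCACHE replacement logic (userspace, one bucket). */
#include <stdio.h>

#define KPCACHE_BITS		2
#define KPCACHE_SIZE		(1 << KPCACHE_BITS)
#define KPCACHE_INDEX(i)	((i) & (KPCACHE_SIZE - 1))

struct entry {
	unsigned long addr;
	const char *payload;		/* stands in for struct kprobe * */
};

static struct entry bucket[KPCACHE_SIZE];
static int next_slot;

static void cache_set(unsigned long addr, const char *payload)
{
	int i = KPCACHE_INDEX(next_slot++);	/* round-robin within the bucket */

	bucket[i].addr = 0;			/* (1) invalidate the entry */
	bucket[i].payload = payload;		/* (2) set the value */
	bucket[i].addr = addr;			/* (3) re-enable the entry */
}

static const char *cache_get(unsigned long addr)
{
	int i;

	for (i = 0; i < KPCACHE_SIZE; i++)
		if (bucket[i].addr == addr)
			return bucket[i].payload;
	return NULL;		/* miss: caller falls back to the hash table */
}

int main(void)
{
	const char *hit;

	cache_set(0x1000, "probe A");
	cache_set(0x2000, "probe B");

	hit = cache_get(0x1000);
	printf("0x1000 -> %s\n", hit ? hit : "miss");
	hit = cache_get(0x3000);
	printf("0x3000 -> %s\n", hit ? hit : "miss");
	return 0;
}
----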

Subject: [PATCH -tip v9 23/26] kprobes/x86: Remove unneeded preempt_disable/enable in interrupt handlers

Since int3 itself disables local irqs and kprobes keeps
them disabled until single stepping is done, kernel
preemption can never happen while a kprobe is being
processed. This means we don't need to disable/enable
preemption around the handlers.
This also changes kprobe_int3_handler to use a goto-out style.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 24 +++++++-----------------
1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 2fb3d3e..3a922b7 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -506,7 +506,6 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
* stepping.
*/
regs->ip = (unsigned long)p->ainsn.insn;
- preempt_enable_no_resched();
return;
}
#endif
@@ -575,13 +574,6 @@ int kprobe_int3_handler(struct pt_regs *regs)
struct kprobe_ctlblk *kcb;

addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
- /*
- * We don't want to be preempted for the entire
- * duration of kprobe processing. We conditionally
- * re-enable preemption at the end of this function,
- * and also in reenter_kprobe() and setup_singlestep().
- */
- preempt_disable();

kcb = get_kprobe_ctlblk();
p = get_kprobe(addr);
@@ -589,7 +581,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
if (p) {
if (kprobe_running()) {
if (reenter_kprobe(p, regs, kcb))
- return 1;
+ goto handled;
} else {
set_current_kprobe(p, regs, kcb);
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
@@ -604,7 +596,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
*/
if (!p->pre_handler || !p->pre_handler(p, regs))
setup_singlestep(p, regs, kcb, 0);
- return 1;
+ goto handled;
}
} else if (*addr != BREAKPOINT_INSTRUCTION) {
/*
@@ -617,19 +609,20 @@ int kprobe_int3_handler(struct pt_regs *regs)
* the original instruction.
*/
regs->ip = (unsigned long)addr;
- preempt_enable_no_resched();
- return 1;
+ goto handled;
} else if (kprobe_running()) {
p = __this_cpu_read(current_kprobe);
if (p->break_handler && p->break_handler(p, regs)) {
if (!skip_singlestep(p, regs, kcb))
setup_singlestep(p, regs, kcb, 0);
- return 1;
+ goto handled;
}
} /* else: not a kprobe fault; let the kernel handle it */

- preempt_enable_no_resched();
return 0;
+
+handled:
+ return 1;
}
NOKPROBE_SYMBOL(kprobe_int3_handler);

@@ -894,7 +887,6 @@ int kprobe_debug_handler(struct pt_regs *regs)
}
reset_current_kprobe();
out:
- preempt_enable_no_resched();

/*
* if somebody else is singlestepping across a probe point, flags
@@ -930,7 +922,6 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
restore_previous_kprobe(kcb);
else
reset_current_kprobe();
- preempt_enable_no_resched();
} else if (kcb->kprobe_status == KPROBE_HIT_ACTIVE ||
kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/*
@@ -1061,7 +1052,6 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
kcb->jprobes_stack,
MIN_STACK_SIZE(kcb->jprobe_saved_sp));
- preempt_enable_no_resched();
return 1;
}
return 0;

Subject: [PATCH -tip v9 21/26] kprobes: Use NOKPROBE_SYMBOL() in sample modules

Use NOKPROBE_SYMBOL() to protect the handlers in the
sample modules from kprobes.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
---
samples/kprobes/jprobe_example.c | 1 +
samples/kprobes/kprobe_example.c | 3 +++
samples/kprobes/kretprobe_example.c | 2 ++
3 files changed, 6 insertions(+)

diff --git a/samples/kprobes/jprobe_example.c b/samples/kprobes/jprobe_example.c
index b754135..40114ac 100644
--- a/samples/kprobes/jprobe_example.c
+++ b/samples/kprobes/jprobe_example.c
@@ -35,6 +35,7 @@ static long jdo_fork(unsigned long clone_flags, unsigned long stack_start,
jprobe_return();
return 0;
}
+NOKPROBE_SYMBOL(jdo_fork);

static struct jprobe my_jprobe = {
.entry = jdo_fork,
diff --git a/samples/kprobes/kprobe_example.c b/samples/kprobes/kprobe_example.c
index 366db1a..462d90f 100644
--- a/samples/kprobes/kprobe_example.c
+++ b/samples/kprobes/kprobe_example.c
@@ -46,6 +46,7 @@ static int handler_pre(struct kprobe *p, struct pt_regs *regs)
/* A dump_stack() here will give a stack backtrace */
return 0;
}
+NOKPROBE_SYMBOL(handler_pre);

/* kprobe post_handler: called after the probed instruction is executed */
static void handler_post(struct kprobe *p, struct pt_regs *regs,
@@ -68,6 +69,7 @@ static void handler_post(struct kprobe *p, struct pt_regs *regs,
p->addr, regs->ex1);
#endif
}
+NOKPROBE_SYMBOL(handler_post);

/*
* fault_handler: this is called if an exception is generated for any
@@ -81,6 +83,7 @@ static int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr)
/* Return 0 because we don't handle the fault. */
return 0;
}
+NOKPROBE_SYMBOL(handler_fault);

static int __init kprobe_init(void)
{
diff --git a/samples/kprobes/kretprobe_example.c b/samples/kprobes/kretprobe_example.c
index 1041b67..d932c52 100644
--- a/samples/kprobes/kretprobe_example.c
+++ b/samples/kprobes/kretprobe_example.c
@@ -47,6 +47,7 @@ static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
data->entry_stamp = ktime_get();
return 0;
}
+NOKPROBE_SYMBOL(entry_handler);

/*
* Return-probe handler: Log the return value and duration. Duration may turn
@@ -66,6 +67,7 @@ static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
func_name, retval, (long long)delta);
return 0;
}
+NOKPROBE_SYMBOL(ret_handler);

static struct kretprobe my_kretprobe = {
.handler = ret_handler,
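
Beyond the in-tree samples touched above, a minimal out-of-tree module
would use the macro the same way. The sketch below is illustrative
only: the probed symbol ("do_fork") and all names are placeholders, not
part of this patch.
----
#include <linux/module.h>
#include <linux/kprobes.h>

static int my_pre(struct kprobe *p, struct pt_regs *regs)
{
	pr_info("kprobe hit at %p\n", p->addr);
	return 0;
}
/* Keep kprobes off our own handler so a nested probe cannot recurse */
NOKPROBE_SYMBOL(my_pre);

static struct kprobe my_kp = {
	.symbol_name	= "do_fork",	/* placeholder target */
	.pre_handler	= my_pre,
};

static int __init my_kp_init(void)
{
	return register_kprobe(&my_kp);
}

static void __exit my_kp_exit(void)
{
	unregister_kprobe(&my_kp);
}

module_init(my_kp_init);
module_exit(my_kp_exit);
MODULE_LICENSE("GPL");
----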

Subject: [PATCH -tip v9 17/26] notifier: Use NOKPROBE_SYMBOL macro in notifier

Use the NOKPROBE_SYMBOL macro to protect functions in the
notifier code from kprobes instead of the __kprobes annotation.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
---
kernel/notifier.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index db4c8b0..4803da6 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -71,9 +71,9 @@ static int notifier_chain_unregister(struct notifier_block **nl,
* @returns: notifier_call_chain returns the value returned by the
* last notifier function called.
*/
-static int __kprobes notifier_call_chain(struct notifier_block **nl,
- unsigned long val, void *v,
- int nr_to_call, int *nr_calls)
+static int notifier_call_chain(struct notifier_block **nl,
+ unsigned long val, void *v,
+ int nr_to_call, int *nr_calls)
{
int ret = NOTIFY_DONE;
struct notifier_block *nb, *next_nb;
@@ -102,6 +102,7 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
}
return ret;
}
+NOKPROBE_SYMBOL(notifier_call_chain);

/*
* Atomic notifier chain routines. Registration and unregistration
@@ -172,9 +173,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_unregister);
* Otherwise the return value is the return value
* of the last notifier function called.
*/
-int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
- unsigned long val, void *v,
- int nr_to_call, int *nr_calls)
+int __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+ unsigned long val, void *v,
+ int nr_to_call, int *nr_calls)
{
int ret;

@@ -184,13 +185,15 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
return ret;
}
EXPORT_SYMBOL_GPL(__atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(__atomic_notifier_call_chain);

-int __kprobes atomic_notifier_call_chain(struct atomic_notifier_head *nh,
- unsigned long val, void *v)
+int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+ unsigned long val, void *v)
{
return __atomic_notifier_call_chain(nh, val, v, -1, NULL);
}
EXPORT_SYMBOL_GPL(atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(atomic_notifier_call_chain);

/*
* Blocking notifier chain routines. All access to the chain is
@@ -527,7 +530,7 @@ EXPORT_SYMBOL_GPL(srcu_init_notifier_head);

static ATOMIC_NOTIFIER_HEAD(die_chain);

-int notrace __kprobes notify_die(enum die_val val, const char *str,
+int notrace notify_die(enum die_val val, const char *str,
struct pt_regs *regs, long err, int trap, int sig)
{
struct die_args args = {
@@ -540,6 +543,7 @@ int notrace __kprobes notify_die(enum die_val val, const char *str,
};
return atomic_notifier_call_chain(&die_chain, val, &args);
}
+NOKPROBE_SYMBOL(notify_die);

int register_die_notifier(struct notifier_block *nb)
{

Subject: [PATCH -tip v9 15/26] kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes

Use the NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of the __kprobes annotation.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
---
kernel/kprobes.c | 67 +++++++++++++++++++++++++++++++++---------------------
1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4db2cc6..a21b4e6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -301,7 +301,7 @@ static inline void reset_kprobe_instance(void)
* OR
* - with preemption disabled - from arch/xxx/kernel/kprobes.c
*/
-struct kprobe __kprobes *get_kprobe(void *addr)
+struct kprobe *get_kprobe(void *addr)
{
struct hlist_head *head;
struct kprobe *p;
@@ -314,8 +314,9 @@ struct kprobe __kprobes *get_kprobe(void *addr)

return NULL;
}
+NOKPROBE_SYMBOL(get_kprobe);

-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);

/* Return true if the kprobe is an aggregator */
static inline int kprobe_aggrprobe(struct kprobe *p)
@@ -347,7 +348,7 @@ static bool kprobes_allow_optimization;
* Call all pre_handler on the list, but ignores its return value.
* This must be called from arch-dep optimized caller.
*/
-void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
+void opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *kp;

@@ -359,6 +360,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
}
}
+NOKPROBE_SYMBOL(opt_pre_handler);

/* Free optimized instructions and optimized_kprobe */
static void free_aggr_kprobe(struct kprobe *p)
@@ -995,7 +997,7 @@ static void disarm_kprobe(struct kprobe *kp, bool reopt)
* Aggregate handlers for multiple kprobes support - these handlers
* take care of invoking the individual kprobe handlers on p->list
*/
-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *kp;

@@ -1009,9 +1011,10 @@ static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
}
+NOKPROBE_SYMBOL(aggr_pre_handler);

-static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
- unsigned long flags)
+static void aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
{
struct kprobe *kp;

@@ -1023,9 +1026,10 @@ static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
}
}
}
+NOKPROBE_SYMBOL(aggr_post_handler);

-static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
- int trapnr)
+static int aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
+ int trapnr)
{
struct kprobe *cur = __this_cpu_read(kprobe_instance);

@@ -1039,8 +1043,9 @@ static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
}
return 0;
}
+NOKPROBE_SYMBOL(aggr_fault_handler);

-static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *cur = __this_cpu_read(kprobe_instance);
int ret = 0;
@@ -1052,9 +1057,10 @@ static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
return ret;
}
+NOKPROBE_SYMBOL(aggr_break_handler);

/* Walks the list and increments nmissed count for multiprobe case */
-void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
+void kprobes_inc_nmissed_count(struct kprobe *p)
{
struct kprobe *kp;
if (!kprobe_aggrprobe(p)) {
@@ -1065,9 +1071,10 @@ void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
}
return;
}
+NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);

-void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
- struct hlist_head *head)
+void recycle_rp_inst(struct kretprobe_instance *ri,
+ struct hlist_head *head)
{
struct kretprobe *rp = ri->rp;

@@ -1082,8 +1089,9 @@ void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
/* Unregistering */
hlist_add_head(&ri->hlist, head);
}
+NOKPROBE_SYMBOL(recycle_rp_inst);

-void __kprobes kretprobe_hash_lock(struct task_struct *tsk,
+void kretprobe_hash_lock(struct task_struct *tsk,
struct hlist_head **head, unsigned long *flags)
__acquires(hlist_lock)
{
@@ -1094,17 +1102,19 @@ __acquires(hlist_lock)
hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_lock_irqsave(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_hash_lock);

-static void __kprobes kretprobe_table_lock(unsigned long hash,
- unsigned long *flags)
+static void kretprobe_table_lock(unsigned long hash,
+ unsigned long *flags)
__acquires(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_lock_irqsave(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_table_lock);

-void __kprobes kretprobe_hash_unlock(struct task_struct *tsk,
- unsigned long *flags)
+void kretprobe_hash_unlock(struct task_struct *tsk,
+ unsigned long *flags)
__releases(hlist_lock)
{
unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
@@ -1113,14 +1123,16 @@ __releases(hlist_lock)
hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_unlock_irqrestore(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_hash_unlock);

-static void __kprobes kretprobe_table_unlock(unsigned long hash,
- unsigned long *flags)
+static void kretprobe_table_unlock(unsigned long hash,
+ unsigned long *flags)
__releases(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_unlock_irqrestore(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_table_unlock);

/*
* This function is called from finish_task_switch when task tk becomes dead,
@@ -1128,7 +1140,7 @@ __releases(hlist_lock)
* with this task. These left over instances represent probed functions
* that have been called but will never return.
*/
-void __kprobes kprobe_flush_task(struct task_struct *tk)
+void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
struct hlist_head *head, empty_rp;
@@ -1153,6 +1165,7 @@ void __kprobes kprobe_flush_task(struct task_struct *tk)
kfree(ri);
}
}
+NOKPROBE_SYMBOL(kprobe_flush_task);

static inline void free_rp_inst(struct kretprobe *rp)
{
@@ -1165,7 +1178,7 @@ static inline void free_rp_inst(struct kretprobe *rp)
}
}

-static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
+static void cleanup_rp_inst(struct kretprobe *rp)
{
unsigned long flags, hash;
struct kretprobe_instance *ri;
@@ -1184,6 +1197,7 @@ static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
}
free_rp_inst(rp);
}
+NOKPROBE_SYMBOL(cleanup_rp_inst);

/*
* Add the new probe to ap->list. Fail if this is the
@@ -1758,8 +1772,7 @@ EXPORT_SYMBOL_GPL(unregister_jprobes);
* This kprobe pre_handler is registered with every kretprobe. When probe
* hits it will set up the return probe.
*/
-static int __kprobes pre_handler_kretprobe(struct kprobe *p,
- struct pt_regs *regs)
+static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe *rp = container_of(p, struct kretprobe, kp);
unsigned long hash, flags = 0;
@@ -1797,6 +1810,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
}
return 0;
}
+NOKPROBE_SYMBOL(pre_handler_kretprobe);

int register_kretprobe(struct kretprobe *rp)
{
@@ -1920,11 +1934,11 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
}
EXPORT_SYMBOL_GPL(unregister_kretprobes);

-static int __kprobes pre_handler_kretprobe(struct kprobe *p,
- struct pt_regs *regs)
+static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
return 0;
}
+NOKPROBE_SYMBOL(pre_handler_kretprobe);

#endif /* CONFIG_KRETPROBES */

@@ -2002,12 +2016,13 @@ out:
}
EXPORT_SYMBOL_GPL(enable_kprobe);

-void __kprobes dump_kprobe(struct kprobe *kp)
+void dump_kprobe(struct kprobe *kp)
{
printk(KERN_WARNING "Dumping kprobe:\n");
printk(KERN_WARNING "Name: %s\nAddress: %p\nOffset: %x\n",
kp->symbol_name, kp->addr, kp->offset);
}
+NOKPROBE_SYMBOL(dump_kprobe);

/*
* Lookup and populate the kprobe_blacklist.

Subject: [PATCH -tip v9 11/26] kprobes: Allow probe on some kprobe functions

There is no need to prohibit probing on the functions
used for preparation, registration, optimization,
control, etc. They can be probed safely because they are
not invoked from the breakpoint/fault/debug handlers, so
there is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
add_new_kprobe
aggr_kprobe_disabled
alloc_aggr_kprobe
alloc_aggr_kprobe
arm_all_kprobes
__arm_kprobe
arm_kprobe
arm_kprobe_ftrace
check_kprobe_address_safe
collect_garbage_slots
collect_garbage_slots
collect_one_slot
debugfs_kprobe_init
__disable_kprobe
disable_kprobe
disarm_all_kprobes
__disarm_kprobe
disarm_kprobe
disarm_kprobe_ftrace
do_free_cleaned_kprobes
do_optimize_kprobes
do_unoptimize_kprobes
enable_kprobe
force_unoptimize_kprobe
free_aggr_kprobe
free_aggr_kprobe
__free_insn_slot
__get_insn_slot
get_optimized_kprobe
__get_valid_kprobe
init_aggr_kprobe
init_aggr_kprobe
in_nokprobe_functions
kick_kprobe_optimizer
kill_kprobe
kill_optimized_kprobe
kprobe_addr
kprobe_optimizer
kprobe_queued
kprobe_seq_next
kprobe_seq_start
kprobe_seq_stop
kprobes_module_callback
kprobes_open
optimize_all_kprobes
optimize_kprobe
prepare_kprobe
prepare_optimized_kprobe
register_aggr_kprobe
register_jprobe
register_jprobes
register_kprobe
register_kprobes
register_kretprobe
register_kretprobe
register_kretprobes
register_kretprobes
report_probe
show_kprobe_addr
try_to_optimize_kprobe
unoptimize_all_kprobes
unoptimize_kprobe
unregister_jprobe
unregister_jprobes
unregister_kprobe
__unregister_kprobe_bottom
unregister_kprobes
__unregister_kprobe_top
unregister_kretprobe
unregister_kretprobe
unregister_kretprobes
unregister_kretprobes
wait_for_kprobe_optimizer

I tested those functions by putting kprobes on every
instruction in them, using the bash script I sent to
LKML; see below:

https://lkml.org/lkml/2014/3/27/33

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
---
kernel/kprobes.c | 153 +++++++++++++++++++++++++++---------------------------
1 file changed, 76 insertions(+), 77 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5ffc687..4db2cc6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -138,13 +138,13 @@ struct kprobe_insn_cache kprobe_insn_slots = {
.insn_size = MAX_INSN_SIZE,
.nr_garbage = 0,
};
-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c);
+static int collect_garbage_slots(struct kprobe_insn_cache *c);

/**
* __get_insn_slot() - Find a slot on an executable page for an instruction.
* We allocate an executable page if there's no room on existing ones.
*/
-kprobe_opcode_t __kprobes *__get_insn_slot(struct kprobe_insn_cache *c)
+kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
{
struct kprobe_insn_page *kip;
kprobe_opcode_t *slot = NULL;
@@ -201,7 +201,7 @@ out:
}

/* Return 1 if all garbages are collected, otherwise 0. */
-static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
+static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
{
kip->slot_used[idx] = SLOT_CLEAN;
kip->nused--;
@@ -222,7 +222,7 @@ static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
return 0;
}

-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
+static int collect_garbage_slots(struct kprobe_insn_cache *c)
{
struct kprobe_insn_page *kip, *next;

@@ -244,8 +244,8 @@ static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
return 0;
}

-void __kprobes __free_insn_slot(struct kprobe_insn_cache *c,
- kprobe_opcode_t *slot, int dirty)
+void __free_insn_slot(struct kprobe_insn_cache *c,
+ kprobe_opcode_t *slot, int dirty)
{
struct kprobe_insn_page *kip;

@@ -361,7 +361,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
}

/* Free optimized instructions and optimized_kprobe */
-static __kprobes void free_aggr_kprobe(struct kprobe *p)
+static void free_aggr_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -399,7 +399,7 @@ static inline int kprobe_disarmed(struct kprobe *p)
}

/* Return true(!0) if the probe is queued on (un)optimizing lists */
-static int __kprobes kprobe_queued(struct kprobe *p)
+static int kprobe_queued(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -415,7 +415,7 @@ static int __kprobes kprobe_queued(struct kprobe *p)
* Return an optimized kprobe whose optimizing code replaces
* instructions including addr (exclude breakpoint).
*/
-static struct kprobe *__kprobes get_optimized_kprobe(unsigned long addr)
+static struct kprobe *get_optimized_kprobe(unsigned long addr)
{
int i;
struct kprobe *p = NULL;
@@ -447,7 +447,7 @@ static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
* Optimize (replace a breakpoint with a jump) kprobes listed on
* optimizing_list.
*/
-static __kprobes void do_optimize_kprobes(void)
+static void do_optimize_kprobes(void)
{
/* Optimization never be done when disarmed */
if (kprobes_all_disarmed || !kprobes_allow_optimization ||
@@ -475,7 +475,7 @@ static __kprobes void do_optimize_kprobes(void)
* Unoptimize (replace a jump with a breakpoint and remove the breakpoint
* if need) kprobes listed on unoptimizing_list.
*/
-static __kprobes void do_unoptimize_kprobes(void)
+static void do_unoptimize_kprobes(void)
{
struct optimized_kprobe *op, *tmp;

@@ -507,7 +507,7 @@ static __kprobes void do_unoptimize_kprobes(void)
}

/* Reclaim all kprobes on the free_list */
-static __kprobes void do_free_cleaned_kprobes(void)
+static void do_free_cleaned_kprobes(void)
{
struct optimized_kprobe *op, *tmp;

@@ -519,13 +519,13 @@ static __kprobes void do_free_cleaned_kprobes(void)
}

/* Start optimizer after OPTIMIZE_DELAY passed */
-static __kprobes void kick_kprobe_optimizer(void)
+static void kick_kprobe_optimizer(void)
{
schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
}

/* Kprobe jump optimizer */
-static __kprobes void kprobe_optimizer(struct work_struct *work)
+static void kprobe_optimizer(struct work_struct *work)
{
mutex_lock(&kprobe_mutex);
/* Lock modules while optimizing kprobes */
@@ -561,7 +561,7 @@ static __kprobes void kprobe_optimizer(struct work_struct *work)
}

/* Wait for completing optimization and unoptimization */
-static __kprobes void wait_for_kprobe_optimizer(void)
+static void wait_for_kprobe_optimizer(void)
{
mutex_lock(&kprobe_mutex);

@@ -580,7 +580,7 @@ static __kprobes void wait_for_kprobe_optimizer(void)
}

/* Optimize kprobe if p is ready to be optimized */
-static __kprobes void optimize_kprobe(struct kprobe *p)
+static void optimize_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -614,7 +614,7 @@ static __kprobes void optimize_kprobe(struct kprobe *p)
}

/* Short cut to direct unoptimizing */
-static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
+static void force_unoptimize_kprobe(struct optimized_kprobe *op)
{
get_online_cpus();
arch_unoptimize_kprobe(op);
@@ -624,7 +624,7 @@ static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
}

/* Unoptimize a kprobe if p is optimized */
-static __kprobes void unoptimize_kprobe(struct kprobe *p, bool force)
+static void unoptimize_kprobe(struct kprobe *p, bool force)
{
struct optimized_kprobe *op;

@@ -684,7 +684,7 @@ static void reuse_unused_kprobe(struct kprobe *ap)
}

/* Remove optimized instructions */
-static void __kprobes kill_optimized_kprobe(struct kprobe *p)
+static void kill_optimized_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -710,7 +710,7 @@ static void __kprobes kill_optimized_kprobe(struct kprobe *p)
}

/* Try to prepare optimized instructions */
-static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
+static void prepare_optimized_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -719,7 +719,7 @@ static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
}

/* Allocate new optimized_kprobe and try to prepare optimized instructions */
-static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
+static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -734,13 +734,13 @@ static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
return &op->kp;
}

-static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p);
+static void init_aggr_kprobe(struct kprobe *ap, struct kprobe *p);

/*
* Prepare an optimized_kprobe and optimize it
* NOTE: p must be a normal registered kprobe
*/
-static __kprobes void try_to_optimize_kprobe(struct kprobe *p)
+static void try_to_optimize_kprobe(struct kprobe *p)
{
struct kprobe *ap;
struct optimized_kprobe *op;
@@ -774,7 +774,7 @@ out:
}

#ifdef CONFIG_SYSCTL
-static void __kprobes optimize_all_kprobes(void)
+static void optimize_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -797,7 +797,7 @@ out:
mutex_unlock(&kprobe_mutex);
}

-static void __kprobes unoptimize_all_kprobes(void)
+static void unoptimize_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -848,7 +848,7 @@ int proc_kprobes_optimization_handler(struct ctl_table *table, int write,
#endif /* CONFIG_SYSCTL */

/* Put a breakpoint for a probe. Must be called with text_mutex locked */
-static void __kprobes __arm_kprobe(struct kprobe *p)
+static void __arm_kprobe(struct kprobe *p)
{
struct kprobe *_p;

@@ -863,7 +863,7 @@ static void __kprobes __arm_kprobe(struct kprobe *p)
}

/* Remove the breakpoint of a probe. Must be called with text_mutex locked */
-static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
+static void __disarm_kprobe(struct kprobe *p, bool reopt)
{
struct kprobe *_p;

@@ -898,13 +898,13 @@ static void reuse_unused_kprobe(struct kprobe *ap)
BUG_ON(kprobe_unused(ap));
}

-static __kprobes void free_aggr_kprobe(struct kprobe *p)
+static void free_aggr_kprobe(struct kprobe *p)
{
arch_remove_kprobe(p);
kfree(p);
}

-static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
+static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
{
return kzalloc(sizeof(struct kprobe), GFP_KERNEL);
}
@@ -918,7 +918,7 @@ static struct ftrace_ops kprobe_ftrace_ops __read_mostly = {
static int kprobe_ftrace_enabled;

/* Must ensure p->addr is really on ftrace */
-static int __kprobes prepare_kprobe(struct kprobe *p)
+static int prepare_kprobe(struct kprobe *p)
{
if (!kprobe_ftrace(p))
return arch_prepare_kprobe(p);
@@ -927,7 +927,7 @@ static int __kprobes prepare_kprobe(struct kprobe *p)
}

/* Caller must lock kprobe_mutex */
-static void __kprobes arm_kprobe_ftrace(struct kprobe *p)
+static void arm_kprobe_ftrace(struct kprobe *p)
{
int ret;

@@ -942,7 +942,7 @@ static void __kprobes arm_kprobe_ftrace(struct kprobe *p)
}

/* Caller must lock kprobe_mutex */
-static void __kprobes disarm_kprobe_ftrace(struct kprobe *p)
+static void disarm_kprobe_ftrace(struct kprobe *p)
{
int ret;

@@ -962,7 +962,7 @@ static void __kprobes disarm_kprobe_ftrace(struct kprobe *p)
#endif

/* Arm a kprobe with text_mutex */
-static void __kprobes arm_kprobe(struct kprobe *kp)
+static void arm_kprobe(struct kprobe *kp)
{
if (unlikely(kprobe_ftrace(kp))) {
arm_kprobe_ftrace(kp);
@@ -979,7 +979,7 @@ static void __kprobes arm_kprobe(struct kprobe *kp)
}

/* Disarm a kprobe with text_mutex */
-static void __kprobes disarm_kprobe(struct kprobe *kp, bool reopt)
+static void disarm_kprobe(struct kprobe *kp, bool reopt)
{
if (unlikely(kprobe_ftrace(kp))) {
disarm_kprobe_ftrace(kp);
@@ -1189,7 +1189,7 @@ static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
* Add the new probe to ap->list. Fail if this is the
* second jprobe at the address - two jprobes can't coexist
*/
-static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
+static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
{
BUG_ON(kprobe_gone(ap) || kprobe_gone(p));

@@ -1213,7 +1213,7 @@ static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
* Fill in the required fields of the "manager kprobe". Replace the
* earlier kprobe in the hlist with the manager kprobe
*/
-static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
+static void init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
{
/* Copy p's insn slot to ap */
copy_kprobe(p, ap);
@@ -1239,8 +1239,7 @@ static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
* This is the second or subsequent kprobe at the address - handle
* the intricacies
*/
-static int __kprobes register_aggr_kprobe(struct kprobe *orig_p,
- struct kprobe *p)
+static int register_aggr_kprobe(struct kprobe *orig_p, struct kprobe *p)
{
int ret = 0;
struct kprobe *ap = orig_p;
@@ -1318,7 +1317,7 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
addr < (unsigned long)__kprobes_text_end;
}

-static bool __kprobes within_kprobe_blacklist(unsigned long addr)
+static bool within_kprobe_blacklist(unsigned long addr)
{
struct kprobe_blacklist_entry *ent;

@@ -1342,7 +1341,7 @@ static bool __kprobes within_kprobe_blacklist(unsigned long addr)
* This returns encoded errors if it fails to look up symbol or invalid
* combination of parameters.
*/
-static kprobe_opcode_t __kprobes *kprobe_addr(struct kprobe *p)
+static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
{
kprobe_opcode_t *addr = p->addr;

@@ -1365,7 +1364,7 @@ invalid:
}

/* Check passed kprobe is valid and return kprobe in kprobe_table. */
-static struct kprobe * __kprobes __get_valid_kprobe(struct kprobe *p)
+static struct kprobe *__get_valid_kprobe(struct kprobe *p)
{
struct kprobe *ap, *list_p;

@@ -1397,8 +1396,8 @@ static inline int check_kprobe_rereg(struct kprobe *p)
return ret;
}

-static __kprobes int check_kprobe_address_safe(struct kprobe *p,
- struct module **probed_mod)
+static int check_kprobe_address_safe(struct kprobe *p,
+ struct module **probed_mod)
{
int ret = 0;
unsigned long ftrace_addr;
@@ -1460,7 +1459,7 @@ out:
return ret;
}

-int __kprobes register_kprobe(struct kprobe *p)
+int register_kprobe(struct kprobe *p)
{
int ret;
struct kprobe *old_p;
@@ -1522,7 +1521,7 @@ out:
EXPORT_SYMBOL_GPL(register_kprobe);

/* Check if all probes on the aggrprobe are disabled */
-static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
+static int aggr_kprobe_disabled(struct kprobe *ap)
{
struct kprobe *kp;

@@ -1538,7 +1537,7 @@ static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
}

/* Disable one kprobe: Make sure called under kprobe_mutex is locked */
-static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
+static struct kprobe *__disable_kprobe(struct kprobe *p)
{
struct kprobe *orig_p;

@@ -1565,7 +1564,7 @@ static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
/*
* Unregister a kprobe without a scheduler synchronization.
*/
-static int __kprobes __unregister_kprobe_top(struct kprobe *p)
+static int __unregister_kprobe_top(struct kprobe *p)
{
struct kprobe *ap, *list_p;

@@ -1622,7 +1621,7 @@ disarmed:
return 0;
}

-static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
+static void __unregister_kprobe_bottom(struct kprobe *p)
{
struct kprobe *ap;

@@ -1638,7 +1637,7 @@ static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
/* Otherwise, do nothing. */
}

-int __kprobes register_kprobes(struct kprobe **kps, int num)
+int register_kprobes(struct kprobe **kps, int num)
{
int i, ret = 0;

@@ -1656,13 +1655,13 @@ int __kprobes register_kprobes(struct kprobe **kps, int num)
}
EXPORT_SYMBOL_GPL(register_kprobes);

-void __kprobes unregister_kprobe(struct kprobe *p)
+void unregister_kprobe(struct kprobe *p)
{
unregister_kprobes(&p, 1);
}
EXPORT_SYMBOL_GPL(unregister_kprobe);

-void __kprobes unregister_kprobes(struct kprobe **kps, int num)
+void unregister_kprobes(struct kprobe **kps, int num)
{
int i;

@@ -1691,7 +1690,7 @@ unsigned long __weak arch_deref_entry_point(void *entry)
return (unsigned long)entry;
}

-int __kprobes register_jprobes(struct jprobe **jps, int num)
+int register_jprobes(struct jprobe **jps, int num)
{
struct jprobe *jp;
int ret = 0, i;
@@ -1722,19 +1721,19 @@ int __kprobes register_jprobes(struct jprobe **jps, int num)
}
EXPORT_SYMBOL_GPL(register_jprobes);

-int __kprobes register_jprobe(struct jprobe *jp)
+int register_jprobe(struct jprobe *jp)
{
return register_jprobes(&jp, 1);
}
EXPORT_SYMBOL_GPL(register_jprobe);

-void __kprobes unregister_jprobe(struct jprobe *jp)
+void unregister_jprobe(struct jprobe *jp)
{
unregister_jprobes(&jp, 1);
}
EXPORT_SYMBOL_GPL(unregister_jprobe);

-void __kprobes unregister_jprobes(struct jprobe **jps, int num)
+void unregister_jprobes(struct jprobe **jps, int num)
{
int i;

@@ -1799,7 +1798,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
return 0;
}

-int __kprobes register_kretprobe(struct kretprobe *rp)
+int register_kretprobe(struct kretprobe *rp)
{
int ret = 0;
struct kretprobe_instance *inst;
@@ -1852,7 +1851,7 @@ int __kprobes register_kretprobe(struct kretprobe *rp)
}
EXPORT_SYMBOL_GPL(register_kretprobe);

-int __kprobes register_kretprobes(struct kretprobe **rps, int num)
+int register_kretprobes(struct kretprobe **rps, int num)
{
int ret = 0, i;

@@ -1870,13 +1869,13 @@ int __kprobes register_kretprobes(struct kretprobe **rps, int num)
}
EXPORT_SYMBOL_GPL(register_kretprobes);

-void __kprobes unregister_kretprobe(struct kretprobe *rp)
+void unregister_kretprobe(struct kretprobe *rp)
{
unregister_kretprobes(&rp, 1);
}
EXPORT_SYMBOL_GPL(unregister_kretprobe);

-void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
+void unregister_kretprobes(struct kretprobe **rps, int num)
{
int i;

@@ -1899,24 +1898,24 @@ void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
EXPORT_SYMBOL_GPL(unregister_kretprobes);

#else /* CONFIG_KRETPROBES */
-int __kprobes register_kretprobe(struct kretprobe *rp)
+int register_kretprobe(struct kretprobe *rp)
{
return -ENOSYS;
}
EXPORT_SYMBOL_GPL(register_kretprobe);

-int __kprobes register_kretprobes(struct kretprobe **rps, int num)
+int register_kretprobes(struct kretprobe **rps, int num)
{
return -ENOSYS;
}
EXPORT_SYMBOL_GPL(register_kretprobes);

-void __kprobes unregister_kretprobe(struct kretprobe *rp)
+void unregister_kretprobe(struct kretprobe *rp)
{
}
EXPORT_SYMBOL_GPL(unregister_kretprobe);

-void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
+void unregister_kretprobes(struct kretprobe **rps, int num)
{
}
EXPORT_SYMBOL_GPL(unregister_kretprobes);
@@ -1930,7 +1929,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
#endif /* CONFIG_KRETPROBES */

/* Set the kprobe gone and remove its instruction buffer. */
-static void __kprobes kill_kprobe(struct kprobe *p)
+static void kill_kprobe(struct kprobe *p)
{
struct kprobe *kp;

@@ -1954,7 +1953,7 @@ static void __kprobes kill_kprobe(struct kprobe *p)
}

/* Disable one kprobe */
-int __kprobes disable_kprobe(struct kprobe *kp)
+int disable_kprobe(struct kprobe *kp)
{
int ret = 0;

@@ -1970,7 +1969,7 @@ int __kprobes disable_kprobe(struct kprobe *kp)
EXPORT_SYMBOL_GPL(disable_kprobe);

/* Enable one kprobe */
-int __kprobes enable_kprobe(struct kprobe *kp)
+int enable_kprobe(struct kprobe *kp)
{
int ret = 0;
struct kprobe *p;
@@ -2043,8 +2042,8 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
}

/* Module notifier call back, checking kprobes on the module */
-static int __kprobes kprobes_module_callback(struct notifier_block *nb,
- unsigned long val, void *data)
+static int kprobes_module_callback(struct notifier_block *nb,
+ unsigned long val, void *data)
{
struct module *mod = data;
struct hlist_head *head;
@@ -2145,7 +2144,7 @@ static int __init init_kprobes(void)
}

#ifdef CONFIG_DEBUG_FS
-static void __kprobes report_probe(struct seq_file *pi, struct kprobe *p,
+static void report_probe(struct seq_file *pi, struct kprobe *p,
const char *sym, int offset, char *modname, struct kprobe *pp)
{
char *kprobe_type;
@@ -2174,12 +2173,12 @@ static void __kprobes report_probe(struct seq_file *pi, struct kprobe *p,
(kprobe_ftrace(pp) ? "[FTRACE]" : ""));
}

-static void __kprobes *kprobe_seq_start(struct seq_file *f, loff_t *pos)
+static void *kprobe_seq_start(struct seq_file *f, loff_t *pos)
{
return (*pos < KPROBE_TABLE_SIZE) ? pos : NULL;
}

-static void __kprobes *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
+static void *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
{
(*pos)++;
if (*pos >= KPROBE_TABLE_SIZE)
@@ -2187,12 +2186,12 @@ static void __kprobes *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
return pos;
}

-static void __kprobes kprobe_seq_stop(struct seq_file *f, void *v)
+static void kprobe_seq_stop(struct seq_file *f, void *v)
{
/* Nothing to do */
}

-static int __kprobes show_kprobe_addr(struct seq_file *pi, void *v)
+static int show_kprobe_addr(struct seq_file *pi, void *v)
{
struct hlist_head *head;
struct kprobe *p, *kp;
@@ -2223,7 +2222,7 @@ static const struct seq_operations kprobes_seq_ops = {
.show = show_kprobe_addr
};

-static int __kprobes kprobes_open(struct inode *inode, struct file *filp)
+static int kprobes_open(struct inode *inode, struct file *filp)
{
return seq_open(filp, &kprobes_seq_ops);
}
@@ -2235,7 +2234,7 @@ static const struct file_operations debugfs_kprobes_operations = {
.release = seq_release,
};

-static void __kprobes arm_all_kprobes(void)
+static void arm_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -2263,7 +2262,7 @@ already_enabled:
return;
}

-static void __kprobes disarm_all_kprobes(void)
+static void disarm_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -2347,7 +2346,7 @@ static const struct file_operations fops_kp = {
.llseek = default_llseek,
};

-static int __kprobes debugfs_kprobe_init(void)
+static int __init debugfs_kprobe_init(void)
{
struct dentry *dir, *file;
unsigned int value = 1;

Subject: [PATCH -tip v9 06/26] [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt

Prohibit probing on native_set_debugreg and native_load_idt.
Since kprobes uses do_debug for single stepping, functions
called from do_debug before notify_die must not be probed.
Also, native_load_idt is called from paranoid_exit when
returning from int3, so it must not be probed either.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Chris Wright <[email protected]>
Cc: Alok Kataria <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/paravirt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index e136869..548d25f 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -390,8 +390,10 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
};

-/* At this point, native_get_debugreg has a real function entry */
+/* At this point, native_get/set_debugreg has real function entries */
NOKPROBE_SYMBOL(native_get_debugreg);
+NOKPROBE_SYMBOL(native_set_debugreg);
+NOKPROBE_SYMBOL(native_load_idt);

struct pv_apic_ops pv_apic_ops = {
#ifdef CONFIG_X86_LOCAL_APIC

Subject: [PATCH -tip v9 09/26] x86: Call exception_enter after kprobes handled

Move the exception_enter() call to after the kprobes
handler is done. Since exception_enter() involves many
other functions (like printk), it can cause a recursive
int3/breakpoint loop when kprobes probe such functions.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/traps.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e5d4a70..ba9abe9 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -327,7 +327,6 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (poke_int3_handler(regs))
return;

- prev_state = exception_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -338,6 +337,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (kprobe_int3_handler(regs))
return;
#endif
+ prev_state = exception_enter();

if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -415,8 +415,6 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;

- prev_state = exception_enter();
-
get_debugreg(dr6, 6);

/* Filter out all the reserved bits which are preset to 1 */
@@ -449,6 +447,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
if (kprobe_debug_handler(regs))
goto exit;
#endif
+ prev_state = exception_enter();

if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)

Subject: [PATCH -tip v9 04/26] kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist

Introduce the NOKPROBE_SYMBOL() macro, which builds a kprobe
blacklist at build time. Its usage is similar to
EXPORT_SYMBOL(): put NOKPROBE_SYMBOL(function); just after
the function definition (see the sketch below).
Since applying this macro to static/inline functions would
inhibit their inlining, this patch also introduces the
nokprobe_inline macro for them. In that case, NOKPROBE_SYMBOL()
must be used on the callers of the inline function.
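
A minimal usage sketch (illustrative types and function names, not
taken from this patch):
----
#include <linux/kprobes.h>

struct foo {
	int count;
};

/* Helper stays inlined; its callers carry the annotation instead */
static nokprobe_inline void update_state(struct foo *f)
{
	f->count++;
}

void handle_exception(struct foo *f)
{
	update_state(f);
}
NOKPROBE_SYMBOL(handle_exception);	/* placed right after the definition */
----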

When CONFIG_KPROBES=y, the macro stores the given function
address in the "_kprobe_blacklist" section.

Since the data structures are not fully initialized by the
macro (because there is no "size" information), those
are re-initialized at boot time by using kallsyms.

Changes from v8:
- Fix the documentation and a comment according to
Steven's suggestion. (Thanks!)

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Rob Landley <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Chris Wright <[email protected]>
Cc: Alok Kataria <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
Documentation/kprobes.txt | 16 ++++++
arch/x86/include/asm/asm.h | 7 +++
arch/x86/kernel/paravirt.c | 4 +
include/asm-generic/vmlinux.lds.h | 9 +++
include/linux/compiler.h | 2 +
include/linux/kprobes.h | 20 ++++++-
kernel/kprobes.c | 100 +++++++++++++++++++------------------
kernel/sched/core.c | 1
8 files changed, 107 insertions(+), 52 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 0cfb00f..4bbeca8 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -22,8 +22,9 @@ Appendix B: The kprobes sysctl interface

Kprobes enables you to dynamically break into any kernel routine and
collect debugging and performance information non-disruptively. You
-can trap at almost any kernel code address, specifying a handler
+can trap at almost any kernel code address(*), specifying a handler
routine to be invoked when the breakpoint is hit.
+(*: some parts of the kernel code can not be trapped, see 1.5 Blacklist)

There are currently three types of probes: kprobes, jprobes, and
kretprobes (also called return probes). A kprobe can be inserted
@@ -273,6 +274,19 @@ using one of the following techniques:
or
- Execute 'sysctl -w debug.kprobes_optimization=n'

+1.5 Blacklist
+
+Kprobes can probe most of the kernel except itself. This means
+that there are some functions where kprobes cannot probe. Probing
+(trapping) such functions can cause a recursive trap (e.g. double
+fault) or the nested probe handler may never be called.
+Kprobes manages such functions as a blacklist.
+If you want to add a function into the blacklist, you just need
+to (1) include linux/kprobes.h and (2) use NOKPROBE_SYMBOL() macro
+to specify a blacklisted function.
+Kprobes checks the given probe address against the blacklist and
+rejects registering it, if the given address is in the blacklist.
+
2. Architectures Supported

Kprobes, jprobes, and return probes are implemented on the following
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 4582e8e..7730c1c 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -57,6 +57,12 @@
.long (from) - . ; \
.long (to) - . + 0x7ffffff0 ; \
.popsection
+
+# define _ASM_NOKPROBE(entry) \
+ .pushsection "_kprobe_blacklist","aw" ; \
+ _ASM_ALIGN ; \
+ _ASM_PTR (entry); \
+ .popsection
#else
# define _ASM_EXTABLE(from,to) \
" .pushsection \"__ex_table\",\"a\"\n" \
@@ -71,6 +77,7 @@
" .long (" #from ") - .\n" \
" .long (" #to ") - . + 0x7ffffff0\n" \
" .popsection\n"
+/* For C file, we already have NOKPROBE_SYMBOL macro */
#endif

#endif /* _ASM_X86_ASM_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b10af8..e136869 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -23,6 +23,7 @@
#include <linux/efi.h>
#include <linux/bcd.h>
#include <linux/highmem.h>
+#include <linux/kprobes.h>

#include <asm/bug.h>
#include <asm/paravirt.h>
@@ -389,6 +390,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
};

+/* At this point, native_get_debugreg has a real function entry */
+NOKPROBE_SYMBOL(native_get_debugreg);
+
struct pv_apic_ops pv_apic_ops = {
#ifdef CONFIG_X86_LOCAL_APIC
.startup_ipi_hook = paravirt_nop,
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 146e4ff..40ceb3c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -109,6 +109,14 @@
#define BRANCH_PROFILE()
#endif

+#ifdef CONFIG_KPROBES
+#define KPROBE_BLACKLIST() VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
+ *(_kprobe_blacklist) \
+ VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
+#else
+#define KPROBE_BLACKLIST()
+#endif
+
#ifdef CONFIG_EVENT_TRACING
#define FTRACE_EVENTS() . = ALIGN(8); \
VMLINUX_SYMBOL(__start_ftrace_events) = .; \
@@ -507,6 +515,7 @@
*(.init.rodata) \
FTRACE_EVENTS() \
TRACE_SYSCALLS() \
+ KPROBE_BLACKLIST() \
MEM_DISCARD(init.rodata) \
CLK_OF_TABLES() \
RESERVEDMEM_OF_TABLES() \
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index ee7239e..0300c0f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -374,7 +374,9 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
/* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */
#ifdef CONFIG_KPROBES
# define __kprobes __attribute__((__section__(".kprobes.text")))
+# define nokprobe_inline __always_inline
#else
# define __kprobes
+# define nokprobe_inline inline
#endif
#endif /* __LINUX_COMPILER_H */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index cdf9251..e059507 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -205,10 +205,10 @@ struct kretprobe_blackpoint {
void *addr;
};

-struct kprobe_blackpoint {
- const char *name;
+struct kprobe_blacklist_entry {
+ struct list_head list;
unsigned long start_addr;
- unsigned long range;
+ unsigned long end_addr;
};

#ifdef CONFIG_KPROBES
@@ -477,4 +477,18 @@ static inline int enable_jprobe(struct jprobe *jp)
return enable_kprobe(&jp->kp);
}

+#ifdef CONFIG_KPROBES
+/*
+ * Blacklist generating macro. Specify functions which are not probed
+ * by using this macro.
+ */
+#define __NOKPROBE_SYMBOL(fname) \
+static unsigned long __used \
+ __attribute__((section("_kprobe_blacklist"))) \
+ _kbl_addr_##fname = (unsigned long)fname;
+#define NOKPROBE_SYMBOL(fname) __NOKPROBE_SYMBOL(fname)
+#else
+#define NOKPROBE_SYMBOL(fname)
+#endif
+
#endif /* _LINUX_KPROBES_H */
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5b5ac76..5ffc687 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -86,18 +86,8 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
return &(kretprobe_table_locks[hash].lock);
}

-/*
- * Normally, functions that we'd want to prohibit kprobes in, are marked
- * __kprobes. But, there are cases where such functions already belong to
- * a different section (__sched for preempt_schedule)
- *
- * For such cases, we now have a blacklist
- */
-static struct kprobe_blackpoint kprobe_blacklist[] = {
- {"preempt_schedule",},
- {"native_get_debugreg",},
- {NULL} /* Terminator */
-};
+/* Blacklist -- list of struct kprobe_blacklist_entry */
+static LIST_HEAD(kprobe_blacklist);

#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
/*
@@ -1328,24 +1318,22 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
addr < (unsigned long)__kprobes_text_end;
}

-static int __kprobes in_kprobes_functions(unsigned long addr)
+static bool __kprobes within_kprobe_blacklist(unsigned long addr)
{
- struct kprobe_blackpoint *kb;
+ struct kprobe_blacklist_entry *ent;

if (arch_within_kprobe_blacklist(addr))
- return -EINVAL;
+ return true;
/*
* If there exists a kprobe_blacklist, verify and
* fail any probe registration in the prohibited area
*/
- for (kb = kprobe_blacklist; kb->name != NULL; kb++) {
- if (kb->start_addr) {
- if (addr >= kb->start_addr &&
- addr < (kb->start_addr + kb->range))
- return -EINVAL;
- }
+ list_for_each_entry(ent, &kprobe_blacklist, list) {
+ if (addr >= ent->start_addr && addr < ent->end_addr)
+ return true;
}
- return 0;
+
+ return false;
}

/*
@@ -1436,7 +1424,7 @@ static __kprobes int check_kprobe_address_safe(struct kprobe *p,

/* Ensure it is not in reserved area nor out of text */
if (!kernel_text_address((unsigned long) p->addr) ||
- in_kprobes_functions((unsigned long) p->addr) ||
+ within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr)) {
ret = -EINVAL;
goto out;
@@ -2022,6 +2010,38 @@ void __kprobes dump_kprobe(struct kprobe *kp)
kp->symbol_name, kp->addr, kp->offset);
}

+/*
+ * Lookup and populate the kprobe_blacklist.
+ *
+ * Unlike the kretprobe blacklist, we'll need to determine
+ * the range of addresses that belong to the said functions,
+ * since a kprobe need not necessarily be at the beginning
+ * of a function.
+ */
+static int __init populate_kprobe_blacklist(unsigned long *start,
+ unsigned long *end)
+{
+ unsigned long *iter;
+ struct kprobe_blacklist_entry *ent;
+ unsigned long offset = 0, size = 0;
+
+ for (iter = start; iter < end; iter++) {
+ if (!kallsyms_lookup_size_offset(*iter, &size, &offset)) {
+ pr_err("Failed to find blacklist %p\n", (void *)*iter);
+ continue;
+ }
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent)
+ return -ENOMEM;
+ ent->start_addr = *iter;
+ ent->end_addr = *iter + size;
+ INIT_LIST_HEAD(&ent->list);
+ list_add_tail(&ent->list, &kprobe_blacklist);
+ }
+ return 0;
+}
+
/* Module notifier call back, checking kprobes on the module */
static int __kprobes kprobes_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
@@ -2065,14 +2085,13 @@ static struct notifier_block kprobe_module_nb = {
.priority = 0
};

+/* Markers of _kprobe_blacklist section */
+extern unsigned long __start_kprobe_blacklist[];
+extern unsigned long __stop_kprobe_blacklist[];
+
static int __init init_kprobes(void)
{
int i, err = 0;
- unsigned long offset = 0, size = 0;
- char *modname, namebuf[KSYM_NAME_LEN];
- const char *symbol_name;
- void *addr;
- struct kprobe_blackpoint *kb;

/* FIXME allocate the probe table, currently defined statically */
/* initialize all list heads */
@@ -2082,26 +2101,11 @@ static int __init init_kprobes(void)
raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
}

- /*
- * Lookup and populate the kprobe_blacklist.
- *
- * Unlike the kretprobe blacklist, we'll need to determine
- * the range of addresses that belong to the said functions,
- * since a kprobe need not necessarily be at the beginning
- * of a function.
- */
- for (kb = kprobe_blacklist; kb->name != NULL; kb++) {
- kprobe_lookup_name(kb->name, addr);
- if (!addr)
- continue;
-
- kb->start_addr = (unsigned long)addr;
- symbol_name = kallsyms_lookup(kb->start_addr,
- &size, &offset, &modname, namebuf);
- if (!symbol_name)
- kb->range = 0;
- else
- kb->range = size;
+ err = populate_kprobe_blacklist(__start_kprobe_blacklist,
+ __stop_kprobe_blacklist);
+ if (err) {
+ pr_err("kprobes: failed to populate blacklist: %d\n", err);
+ pr_err("Please take care of using kprobes.\n");
}

if (kretprobe_blacklist_size) {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 00ac248..ce9b306 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2801,6 +2801,7 @@ asmlinkage void __sched notrace preempt_schedule(void)
barrier();
} while (need_resched());
}
+NOKPROBE_SYMBOL(preempt_schedule);
EXPORT_SYMBOL(preempt_schedule);
#endif /* CONFIG_PREEMPT */
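
For reference, the NOKPROBE_SYMBOL() machinery that the above hunks rely
on (introduced by patch 04/26 in this series) looks roughly like the
following sketch; the exact definitions live in include/linux/kprobes.h
and may differ in detail:
----
/* Sketch only -- approximate definitions, see patch 04/26 for the real ones */

/* One blacklisted function, as built by populate_kprobe_blacklist() */
struct kprobe_blacklist_entry {
	struct list_head list;
	unsigned long start_addr;
	unsigned long end_addr;
};

/*
 * NOKPROBE_SYMBOL(fname) stores the address of fname into the
 * _kprobe_blacklist section.  The __start_kprobe_blacklist and
 * __stop_kprobe_blacklist markers delimit that section, and
 * populate_kprobe_blacklist() walks it at boot time to build the
 * kprobe_blacklist list checked by within_kprobe_blacklist().
 */
#define __NOKPROBE_SYMBOL(fname)				\
static unsigned long __used					\
	__attribute__((section("_kprobe_blacklist")))		\
	_kbl_addr_##fname = (unsigned long)fname;
#define NOKPROBE_SYMBOL(fname)	__NOKPROBE_SYMBOL(fname)
----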


Subject: [PATCH -tip v9 02/26] kprobes/x86: Allow to handle reentered kprobe on singlestepping

Since NMI handlers (e.g. perf) can interrupt single stepping
(or its preparation, do_debug etc.), we have to consider that
a kprobe may be hit inside an NMI handler. Even in that case,
the kprobe is allowed to be reentered, just like kprobes that
hit inside kprobe handlers (KPROBE_HIT_ACTIVE or
KPROBE_HIT_SSDONE).
The real issue arises when a kprobe hits while another
reentered kprobe is being processed (KPROBE_REENTER), because
the save area for the previous kprobe has already been consumed.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 137e873..fb6d421 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -531,10 +531,11 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
switch (kcb->kprobe_status) {
case KPROBE_HIT_SSDONE:
case KPROBE_HIT_ACTIVE:
+ case KPROBE_HIT_SS:
kprobes_inc_nmissed_count(p);
setup_singlestep(p, regs, kcb, 1);
break;
- case KPROBE_HIT_SS:
+ case KPROBE_REENTER:
/* A probe has been hit in the codepath leading up to, or just
* after, single-stepping of a probed instruction. This entire
* codepath should strictly reside in .kprobes.text section.

Subject: [PATCH -tip v9 08/26] kprobes/x86: Call exception handlers directly from do_int3/do_debug

To avoid a kernel crash when probing lockdep code, call
kprobe_int3_handler() and kprobe_debug_handler() (formerly
called post_kprobe_handler()) directly from
do_int3 and do_debug.

Currently kprobes uses notify_die() to hook the int3/debug
exceptions. Since notify_die() contains locking code, the
lockdep code can be invoked. And because lockdep involves
printk() and related functions, we would theoretically need to
prohibit probing on such code too, which would mean a much
longer blacklist. Instead, hooking int3/debug for kprobes
before notify_die() avoids this problem.

In any case, most of the int3 handlers in the kernel are
already called directly from do_int3, e.g. ftrace_int3_handler,
poke_int3_handler and kgdb_ll_trap. Only
kprobe_exceptions_notify is actually on the notifier call chain.

Changes from v8:
- Update the patch description.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
arch/x86/include/asm/kprobes.h | 2 ++
arch/x86/kernel/kprobes/core.c | 24 +++---------------------
arch/x86/kernel/traps.c | 10 ++++++++++
3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 9454c16..53cdfb2 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -116,4 +116,6 @@ struct kprobe_ctlblk {
extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
+extern int kprobe_int3_handler(struct pt_regs *regs);
+extern int kprobe_debug_handler(struct pt_regs *regs);
#endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 181bdde..4157d648 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -559,7 +559,7 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
* Interrupts are disabled on entry as trap3 is an interrupt gate and they
* remain disabled throughout this function.
*/
-static int __kprobes kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_int3_handler(struct pt_regs *regs)
{
kprobe_opcode_t *addr;
struct kprobe *p;
@@ -857,7 +857,7 @@ no_change:
* Interrupts are disabled on entry as trap1 is an interrupt gate and they
* remain disabled throughout this function.
*/
-static int __kprobes post_kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_debug_handler(struct pt_regs *regs)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -963,22 +963,7 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (args->regs && user_mode_vm(args->regs))
return ret;

- switch (val) {
- case DIE_INT3:
- if (kprobe_handler(args->regs))
- ret = NOTIFY_STOP;
- break;
- case DIE_DEBUG:
- if (post_kprobe_handler(args->regs)) {
- /*
- * Reset the BS bit in dr6 (pointed by args->err) to
- * denote completion of processing
- */
- (*(unsigned long *)ERR_PTR(args->err)) &= ~DR_STEP;
- ret = NOTIFY_STOP;
- }
- break;
- case DIE_GPF:
+ if (val == DIE_GPF) {
/*
* To be potentially processing a kprobe fault and to
* trust the result from kprobe_running(), we have
@@ -987,9 +972,6 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (!preemptible() && kprobe_running() &&
kprobe_fault_handler(args->regs, args->trapnr))
ret = NOTIFY_STOP;
- break;
- default:
- break;
}
return ret;
}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..e5d4a70 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -334,6 +334,11 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
goto exit;
#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */

+#ifdef CONFIG_KPROBES
+ if (kprobe_int3_handler(regs))
+ return;
+#endif
+
if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
goto exit;
@@ -440,6 +445,11 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
/* Store the virtualized DR6 value */
tsk->thread.debugreg6 = dr6;

+#ifdef CONFIG_KPROBES
+ if (kprobe_debug_handler(regs))
+ goto exit;
+#endif
+
if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)
goto exit;

Subject: [PATCH -tip v9 03/26] kprobes: Prohibit probing on .entry.text code

.entry.text is a code area used for interrupt/syscall
entries and contains a lot of sensitive code.
Thus, it is better to prohibit probing on all of that code
rather than just a part of it.
Since some of its symbols are already registered on the kprobe
blacklist, this change also removes them from the blacklist.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
arch/x86/kernel/entry_32.S | 33 ---------------------------------
arch/x86/kernel/entry_64.S | 20 --------------------
arch/x86/kernel/kprobes/core.c | 8 ++++++++
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 ++++++++-----
5 files changed, 17 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index a2a4f46..0ca5bf1 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -315,10 +315,6 @@ ENTRY(ret_from_kernel_thread)
ENDPROC(ret_from_kernel_thread)

/*
- * Interrupt exit functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-/*
* Return to user mode is not as complex as all this looks,
* but we want the default path for a system call return to
* go as quickly as possible which is why some of this is
@@ -372,10 +368,6 @@ need_resched:
END(resume_kernel)
#endif
CFI_ENDPROC
-/*
- * End of kprobes section
- */
- .popsection

/* SYSENTER_RETURN points to after the "sysenter" instruction in
the vsyscall page. See vsyscall-sysentry.S, which defines the symbol. */
@@ -495,10 +487,6 @@ sysexit_audit:
PTGS_TO_GS_EX
ENDPROC(ia32_sysenter_target)

-/*
- * syscall stub including irq exit should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
# system call handler stub
ENTRY(system_call)
RING0_INT_FRAME # can't unwind into user space anyway
@@ -691,10 +679,6 @@ syscall_badsys:
jmp resume_userspace
END(syscall_badsys)
CFI_ENDPROC
-/*
- * End of kprobes section
- */
- .popsection

.macro FIXUP_ESPFIX_STACK
/*
@@ -781,10 +765,6 @@ common_interrupt:
ENDPROC(common_interrupt)
CFI_ENDPROC

-/*
- * Irq entries should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
#define BUILD_INTERRUPT3(name, nr, fn) \
ENTRY(name) \
RING0_INT_FRAME; \
@@ -961,10 +941,6 @@ ENTRY(spurious_interrupt_bug)
jmp error_code
CFI_ENDPROC
END(spurious_interrupt_bug)
-/*
- * End of kprobes section
- */
- .popsection

#ifdef CONFIG_XEN
/* Xen doesn't set %esp to be precisely what the normal sysenter
@@ -1239,11 +1215,6 @@ return_to_handler:
jmp *%ecx
#endif

-/*
- * Some functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-
#ifdef CONFIG_TRACING
ENTRY(trace_page_fault)
RING0_EC_FRAME
@@ -1453,7 +1424,3 @@ ENTRY(async_page_fault)
END(async_page_fault)
#endif

-/*
- * End of kprobes section
- */
- .popsection
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1e96c36..43bb389 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -487,8 +487,6 @@ ENDPROC(native_usergs_sysret64)
TRACE_IRQS_OFF
.endm

-/* save complete stack frame */
- .pushsection .kprobes.text, "ax"
ENTRY(save_paranoid)
XCPT_FRAME 1 RDI+8
cld
@@ -517,7 +515,6 @@ ENTRY(save_paranoid)
1: ret
CFI_ENDPROC
END(save_paranoid)
- .popsection

/*
* A newly forked process directly context switches into this address.
@@ -975,10 +972,6 @@ END(interrupt)
call \func
.endm

-/*
- * Interrupt entry/exit should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
/*
* The interrupt stubs push (~vector+0x80) onto the stack and
* then jump to common_interrupt.
@@ -1113,10 +1106,6 @@ ENTRY(retint_kernel)

CFI_ENDPROC
END(common_interrupt)
-/*
- * End of kprobes section
- */
- .popsection

/*
* APIC interrupts.
@@ -1477,11 +1466,6 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
hyperv_callback_vector hyperv_vector_handler
#endif /* CONFIG_HYPERV */

-/*
- * Some functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-
paranoidzeroentry_ist debug do_debug DEBUG_STACK
paranoidzeroentry_ist int3 do_int3 DEBUG_STACK
paranoiderrorentry stack_segment do_stack_segment
@@ -1898,7 +1882,3 @@ ENTRY(ignore_sysret)
CFI_ENDPROC
END(ignore_sysret)

-/*
- * End of kprobes section
- */
- .popsection
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index fb6d421..181bdde 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1065,6 +1065,14 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
return 0;
}

+bool arch_within_kprobe_blacklist(unsigned long addr)
+{
+ return (addr >= (unsigned long)__kprobes_text_start &&
+ addr < (unsigned long)__kprobes_text_end) ||
+ (addr >= (unsigned long)__entry_text_start &&
+ addr < (unsigned long)__entry_text_end);
+}
+
int __init arch_init_kprobes(void)
{
return 0;
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 925eaf2..cdf9251 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -265,6 +265,7 @@ extern void arch_disarm_kprobe(struct kprobe *p);
extern int arch_init_kprobes(void);
extern void show_registers(struct pt_regs *regs);
extern void kprobes_inc_nmissed_count(struct kprobe *p);
+extern bool arch_within_kprobe_blacklist(unsigned long addr);

struct kprobe_insn_cache {
struct mutex mutex;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index ceeadfc..5b5ac76 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -96,9 +96,6 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
static struct kprobe_blackpoint kprobe_blacklist[] = {
{"preempt_schedule",},
{"native_get_debugreg",},
- {"irq_entries_start",},
- {"common_interrupt",},
- {"mcount",}, /* mcount can be called from everywhere */
{NULL} /* Terminator */
};

@@ -1324,12 +1321,18 @@ out:
return ret;
}

+bool __weak arch_within_kprobe_blacklist(unsigned long addr)
+{
+ /* The __kprobes marked functions and entry code must not be probed */
+ return addr >= (unsigned long)__kprobes_text_start &&
+ addr < (unsigned long)__kprobes_text_end;
+}
+
static int __kprobes in_kprobes_functions(unsigned long addr)
{
struct kprobe_blackpoint *kb;

- if (addr >= (unsigned long)__kprobes_text_start &&
- addr < (unsigned long)__kprobes_text_end)
+ if (arch_within_kprobe_blacklist(addr))
return -EINVAL;
/*
* If there exists a kprobe_blacklist, verify and

2014-04-17 08:37:25

by Ingo Molnar

Subject: Re: [PATCH -tip v9 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts


* Masami Hiramatsu <[email protected]> wrote:

> Masami Hiramatsu (26):
> [BUGFIX]kprobes/x86: Fix page-fault handling logic
> kprobes/x86: Allow to handle reentered kprobe on singlestepping
> kprobes: Prohibit probing on .entry.text code
> kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
> [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*
> [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt
> [BUGFIX] x86: Prohibit probing on thunk functions and restore

Ok, the whole series is looking pretty good now.

Can I apply the 4 bugfixes first, or do the last 3 bugfixes depend on
the 3 non-bugfix patches before them in any way?

Thanks,

Ingo

Subject: Re: [PATCH -tip v9 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts

(2014/04/17 17:37), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>> Masami Hiramatsu (26):
>> [BUGFIX]kprobes/x86: Fix page-fault handling logic
>> kprobes/x86: Allow to handle reentered kprobe on singlestepping
>> kprobes: Prohibit probing on .entry.text code
>> kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
>> [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*
>> [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt
>> [BUGFIX] x86: Prohibit probing on thunk functions and restore
>
> Ok, the whole series is looking pretty good now.
>
> Can I apply the 4 bugfixes first, or do the last 3 bugfixes depend on
> the 3 non-bugfix patches before them in any way?

Right, the last 3 bugfixes depend on NOKPROBE_SYMBOL(), so you can
apply only the first patch...

Thank you,

>
> Thanks,
>
> Ingo
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: [tip:perf/urgent] kprobes/x86: Fix page-fault handling logic

Commit-ID: 6381c24cd6d5d6373620426ab0a96c80ed953e20
Gitweb: http://git.kernel.org/tip/6381c24cd6d5d6373620426ab0a96c80ed953e20
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:16:44 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 17 Apr 2014 10:57:02 +0200

kprobes/x86: Fix page-fault handling logic

The current kprobes in-kernel page fault handler doesn't
expect that its single-stepping can be interrupted by
an NMI handler which may itself cause a page fault (e.g. perf
with callback tracing).

In that case, the page fault is handled by kprobes, which
wrongly assumes the fault was caused by the single-stepping
code and tries to recover the IP address back to the probed
address.

But in truth the page fault was caused by the NMI handler,
and do_page_fault() then fails to handle the real page fault
because the IP address has been modified, which causes
kernel BUGs like below.

----
[ 2264.726905] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 2264.727190] IP: [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40

To handle this correctly, I fixed the kprobes fault
handler to check whether the faulting IP address is within
its own single-step buffer, instead of relying on the current
kprobe state.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Sandeepa Prabhu <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 79a3f96..61b17dc 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -897,9 +897,10 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

- switch (kcb->kprobe_status) {
- case KPROBE_HIT_SS:
- case KPROBE_REENTER:
+ if (unlikely(regs->ip == (unsigned long)cur->ainsn.insn)) {
+ /* This must happen on single-stepping */
+ WARN_ON(kcb->kprobe_status != KPROBE_HIT_SS &&
+ kcb->kprobe_status != KPROBE_REENTER);
/*
* We are here because the instruction being single
* stepped caused a page fault. We reset the current
@@ -914,9 +915,8 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
else
reset_current_kprobe();
preempt_enable_no_resched();
- break;
- case KPROBE_HIT_ACTIVE:
- case KPROBE_HIT_SSDONE:
+ } else if (kcb->kprobe_status == KPROBE_HIT_ACTIVE ||
+ kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/*
* We increment the nmissed count for accounting,
* we can also use npre/npostfault count for accounting
@@ -945,10 +945,8 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
* fixup routine could not handle it,
* Let do_page_fault() fix it.
*/
- break;
- default:
- break;
}
+
return 0;
}

2014-04-17 14:40:48

by Josh Triplett

Subject: Re: [PATCH -tip v9 17/26] notifier: Use NOKPROBE_SYMBOL macro in notifier

On Thu, Apr 17, 2014 at 05:18:35PM +0900, Masami Hiramatsu wrote:
> Use NOKPROBE_SYMBOL macro to protect functions from
> kprobes instead of __kprobes annotation in notifier.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reviewed-by: Steven Rostedt <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: "Paul E. McKenney" <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/notifier.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/notifier.c b/kernel/notifier.c
> index db4c8b0..4803da6 100644
> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -71,9 +71,9 @@ static int notifier_chain_unregister(struct notifier_block **nl,
> * @returns: notifier_call_chain returns the value returned by the
> * last notifier function called.
> */
> -static int __kprobes notifier_call_chain(struct notifier_block **nl,
> - unsigned long val, void *v,
> - int nr_to_call, int *nr_calls)
> +static int notifier_call_chain(struct notifier_block **nl,
> + unsigned long val, void *v,
> + int nr_to_call, int *nr_calls)
> {
> int ret = NOTIFY_DONE;
> struct notifier_block *nb, *next_nb;
> @@ -102,6 +102,7 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
> }
> return ret;
> }
> +NOKPROBE_SYMBOL(notifier_call_chain);
>
> /*
> * Atomic notifier chain routines. Registration and unregistration
> @@ -172,9 +173,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_unregister);
> * Otherwise the return value is the return value
> * of the last notifier function called.
> */
> -int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
> - unsigned long val, void *v,
> - int nr_to_call, int *nr_calls)
> +int __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
> + unsigned long val, void *v,
> + int nr_to_call, int *nr_calls)
> {
> int ret;
>
> @@ -184,13 +185,15 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
> return ret;
> }
> EXPORT_SYMBOL_GPL(__atomic_notifier_call_chain);
> +NOKPROBE_SYMBOL(__atomic_notifier_call_chain);
>
> -int __kprobes atomic_notifier_call_chain(struct atomic_notifier_head *nh,
> - unsigned long val, void *v)
> +int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
> + unsigned long val, void *v)
> {
> return __atomic_notifier_call_chain(nh, val, v, -1, NULL);
> }
> EXPORT_SYMBOL_GPL(atomic_notifier_call_chain);
> +NOKPROBE_SYMBOL(atomic_notifier_call_chain);
>
> /*
> * Blocking notifier chain routines. All access to the chain is
> @@ -527,7 +530,7 @@ EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
>
> static ATOMIC_NOTIFIER_HEAD(die_chain);
>
> -int notrace __kprobes notify_die(enum die_val val, const char *str,
> +int notrace notify_die(enum die_val val, const char *str,
> struct pt_regs *regs, long err, int trap, int sig)
> {
> struct die_args args = {
> @@ -540,6 +543,7 @@ int notrace __kprobes notify_die(enum die_val val, const char *str,
> };
> return atomic_notifier_call_chain(&die_chain, val, &args);
> }
> +NOKPROBE_SYMBOL(notify_die);
>
> int register_die_notifier(struct notifier_block *nb)
> {
>
>

2014-04-24 08:58:58

by Ingo Molnar

Subject: Re: [PATCH -tip v9 22/26] kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text


* Masami Hiramatsu <[email protected]> wrote:

> Use kprobe_blackpoint for blacklisting .entry.text and .kprobes.text
> instead of arch_within_kprobe_blacklist. This also makes them visible
> via (debugfs)/kprobes/blacklist.

This description references kprobe_blackpoint, which name I don't
think exists anymore.

Thanks,

Ingo

2014-04-24 09:01:45

by Ingo Molnar

Subject: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits


* Masami Hiramatsu <[email protected]> wrote:

> Introduce a kprobe cache to reduce cache misses for
> massive numbers of kprobes.
> For stress testing kprobes, we need to activate as many
> kprobes as possible. This causes a storm of cache misses
> on the kprobe hash list. The kprobe hash list has already
> been enlarged to 4k entries, and this is still small for 40k
> kprobes.
>
> For example, when registering 40k probes on the hlist and
> enabling 20k probes, perf tools still show that a lot of
> cache misses are in get_kprobe.
> ----
> Samples: 633 of event 'cache-misses', Event count (approx.): 3414776
> + 68.13% [k] get_kprobe
> + 4.38% [k] ftrace_lookup_ip
> + 2.54% [k] kprobe_ftrace_handler
> ----
>
> Also, I found that most of the kprobes are not hit.
> In that case, to reduce cache misses, we can reduce the
> random memory accesses by introducing a per-cpu cache which
> caches the address of a frequently used kprobe data structure
> together with its probe address.
>
> With kpcache enabled, get_kprobe_cached goes down to
> around 4-5% of cache misses with 20k probes.
> ----
> Samples: 729 of event 'cache-misses', Event count (approx.): 690125
> + 14.49% [k] ftrace_lookup_ip
> + 5.61% [k] kprobe_trace_func
> + 5.17% [k] kprobe_ftrace_handler
> + 4.62% [k] get_kprobe_cached
> ----
>
> Of course this reduces the enabling time too.
>
> Without this fix (just enlarge hash table):
> (2934 sec, 1 min intervals for each 2000 probes enabled)
>
> ----
> Enabling trace events: start at 1393921862
> 0 1393921864 a2mp_chan_alloc_skb_cb_38581
> ...
> 19999 1393924928 nfs4_open_confirm_done_11785
> ----
>
> With this fix:
> (2025 sec, 1 min intervals for each 2000 probes enabled)

That's a nice speedup.

So I don't think this should be a Kconfig entry, just enable it
unconditionally. That will further simplify the code.

Thanks,

Ingo
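
For readers unfamiliar with the approach being discussed, a rough sketch
of such a per-cpu kprobe cache might look like the code below. The names,
table size and details are assumptions for illustration, not the code of
the actual 25/26 patch:
----
#include <linux/percpu.h>
#include <linux/hash.h>
#include <linux/kprobes.h>

/* Illustrative sketch only, not the code from patch 25/26 */
#define KPCACHE_BITS	6
#define KPCACHE_SIZE	(1 << KPCACHE_BITS)

struct kpcache_entry {
	unsigned long	addr;	/* probed address */
	struct kprobe	*kp;	/* kprobe last found for that address */
};

struct kpcache {
	struct kpcache_entry ent[KPCACHE_SIZE];
};

static DEFINE_PER_CPU(struct kpcache, kpcache);

/* Called from the breakpoint handler with preemption disabled */
static struct kprobe *get_kprobe_cached(unsigned long addr)
{
	struct kpcache_entry *e =
		&this_cpu_ptr(&kpcache)->ent[hash_long(addr, KPCACHE_BITS)];
	struct kprobe *p;

	if (likely(e->addr == addr))
		return e->kp;		/* fast path: per-cpu cache hit */

	p = get_kprobe((void *)addr);	/* slow path: global hash list */
	if (p) {
		e->kp = p;		/* refill the cache for the next hit */
		e->addr = addr;
	}
	return p;
}
----
A real implementation would also have to invalidate cached entries when a
kprobe is unregistered or disarmed; the sketch above ignores that.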

2014-04-24 09:12:36

by Ingo Molnar

Subject: Re: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module


* Masami Hiramatsu <[email protected]> wrote:

> To blacklist functions in a module (e.g. a user-defined
> kprobe handler and the functions invoked from it), expand
> the blacklist support to modules.
> With this change, users can use the NOKPROBE_SYMBOL() macro in
> their own modules.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Cc: Ananth N Mavinakayanahalli <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Rob Landley <[email protected]>
> Cc: Rusty Russell <[email protected]>
> ---
> Documentation/kprobes.txt | 8 ++++++
> include/linux/module.h | 5 ++++
> kernel/kprobes.c | 63 ++++++++++++++++++++++++++++++++++++++-------
> kernel/module.c | 6 ++++
> 4 files changed, 72 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> index 4bbeca8..2845956 100644
> --- a/Documentation/kprobes.txt
> +++ b/Documentation/kprobes.txt
> @@ -512,6 +512,14 @@ int enable_jprobe(struct jprobe *jp);
> Enables *probe which has been disabled by disable_*probe(). You must specify
> the probe which has been registered.
>
> +4.9 NOKPROBE_SYMBOL()
> +
> +#include <linux/kprobes.h>
> +NOKPROBE_SYMBOL(FUNCTION);
> +
> +Protects given FUNCTION from other kprobes. This is useful for handler
> +functions and functions called from the handlers.
> +
> 5. Kprobes Features and Limitations
>
> Kprobes allows multiple probes at the same address. Currently,
> diff --git a/include/linux/module.h b/include/linux/module.h
> index f520a76..2fdb673 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -16,6 +16,7 @@
> #include <linux/kobject.h>
> #include <linux/moduleparam.h>
> #include <linux/jump_label.h>
> +#include <linux/kprobes.h>

This include breaks the x86 build:

CC arch/x86/kernel/jump_label.o
In file included from arch/x86/kernel/jump_label.c:14:0:
/fast/mingo/tip/arch/x86/include/asm/kprobes.h:35:12: error: conflicting types for ‘kprobe_opcode_t' typedef u8 kprobe_opcode_t;
[...]

But the #include kprobes.h is unnecessary to begin with, as no kprobe
specific types are used.

> #include <linux/export.h>
>
> #include <linux/percpu.h>
> @@ -357,6 +358,10 @@ struct module {
> unsigned int num_ftrace_callsites;
> unsigned long *ftrace_callsites;
> #endif
> +#ifdef CONFIG_KPROBES
> + unsigned int num_kprobe_blacklist;
> + unsigned long *kprobe_blacklist;
> +#endif

There's a small coding style problem here.

More importantly, I think more should be done to make sure that module
symbols are marked properly: since the module is going to register the
kprobes handler, that would be a perfect place to emit a warning,
right?

In fact, why don't kprobe handlers get added to the exclusion list
explicitly, when the handler gets registered? With such an approach
handlers are automatically nokprobe and don't need any annotation -
which is a far more robust usage model.

So I'm skipping this patch and the next one that makes use of it.

Thanks,

Ingo
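
A minimal sketch of the registration-time exclusion Ingo suggests above
might look like the following. This is hypothetical, not code from the
posted series; it reuses kallsyms_lookup_size_offset() and the
kprobe_blacklist list introduced earlier in the series:
----
/*
 * Hypothetical helper (not in the posted series): when a kprobe is
 * registered, blacklist the whole function containing its pre_handler,
 * so handler functions would not need explicit NOKPROBE_SYMBOL()
 * annotations.  Would be called from register_kprobe() under
 * kprobe_mutex.
 */
static int blacklist_handler_of(struct kprobe *p)
{
	struct kprobe_blacklist_entry *ent;
	unsigned long addr = (unsigned long)p->pre_handler;
	unsigned long size = 0, offset = 0;

	if (!addr || !kallsyms_lookup_size_offset(addr, &size, &offset))
		return 0;	/* no handler, or symbol not found */

	ent = kmalloc(sizeof(*ent), GFP_KERNEL);
	if (!ent)
		return -ENOMEM;
	ent->start_addr = addr - offset;	/* start of the handler function */
	ent->end_addr = addr - offset + size;
	INIT_LIST_HEAD(&ent->list);
	list_add_tail(&ent->list, &kprobe_blacklist);

	return 0;
}
----
Even with such a helper, functions called indirectly from the handler
would still need explicit annotation, which is presumably why the series
extends NOKPROBE_SYMBOL() to modules instead.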

Subject: [tip:perf/kprobes] kprobes/x86: Allow to handle reentered kprobe on single-stepping

Commit-ID: 6a5022a56ac37da7bffece043331a101ed0040b1
Gitweb: http://git.kernel.org/tip/6a5022a56ac37da7bffece043331a101ed0040b1
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:16:51 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:55 +0200

kprobes/x86: Allow to handle reentered kprobe on single-stepping

Since NMI handlers (e.g. perf) can interrupt single stepping
(or its preparation, do_debug etc.), we have to consider that
a kprobe may be hit inside an NMI handler. Even in that case,
the kprobe is allowed to be reentered, just like kprobes that
hit inside kprobe handlers (KPROBE_HIT_ACTIVE or
KPROBE_HIT_SSDONE).

The real issue arises when a kprobe hits while another
reentered kprobe is being processed (KPROBE_REENTER), because
the save area for the previous kprobe has already been consumed.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Jonathan Lebon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 61b17dc..da7bdaa 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -531,10 +531,11 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
switch (kcb->kprobe_status) {
case KPROBE_HIT_SSDONE:
case KPROBE_HIT_ACTIVE:
+ case KPROBE_HIT_SS:
kprobes_inc_nmissed_count(p);
setup_singlestep(p, regs, kcb, 1);
break;
- case KPROBE_HIT_SS:
+ case KPROBE_REENTER:
/* A probe has been hit in the codepath leading up to, or just
* after, single-stepping of a probed instruction. This entire
* codepath should strictly reside in .kprobes.text section.

Subject: [tip:perf/kprobes] kprobes: Prohibit probing on .entry.text code

Commit-ID: be8f274323c26ddc7e6fd6c44254b7abcdbe6389
Gitweb: http://git.kernel.org/tip/be8f274323c26ddc7e6fd6c44254b7abcdbe6389
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:16:58 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:56 +0200

kprobes: Prohibit probing on .entry.text code

.entry.text is a code area used for interrupt/syscall
entries and contains a lot of sensitive code.
Thus, it is better to prohibit probing on all of that code
rather than just a part of it.
Since some of its symbols are already registered on the kprobe
blacklist, this change also removes them from the blacklist.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jan Kiszka <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Jonathan Lebon <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/entry_32.S | 33 ---------------------------------
arch/x86/kernel/entry_64.S | 20 --------------------
arch/x86/kernel/kprobes/core.c | 8 ++++++++
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 ++++++++-----
5 files changed, 17 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index a2a4f46..0ca5bf1 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -315,10 +315,6 @@ ENTRY(ret_from_kernel_thread)
ENDPROC(ret_from_kernel_thread)

/*
- * Interrupt exit functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-/*
* Return to user mode is not as complex as all this looks,
* but we want the default path for a system call return to
* go as quickly as possible which is why some of this is
@@ -372,10 +368,6 @@ need_resched:
END(resume_kernel)
#endif
CFI_ENDPROC
-/*
- * End of kprobes section
- */
- .popsection

/* SYSENTER_RETURN points to after the "sysenter" instruction in
the vsyscall page. See vsyscall-sysentry.S, which defines the symbol. */
@@ -495,10 +487,6 @@ sysexit_audit:
PTGS_TO_GS_EX
ENDPROC(ia32_sysenter_target)

-/*
- * syscall stub including irq exit should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
# system call handler stub
ENTRY(system_call)
RING0_INT_FRAME # can't unwind into user space anyway
@@ -691,10 +679,6 @@ syscall_badsys:
jmp resume_userspace
END(syscall_badsys)
CFI_ENDPROC
-/*
- * End of kprobes section
- */
- .popsection

.macro FIXUP_ESPFIX_STACK
/*
@@ -781,10 +765,6 @@ common_interrupt:
ENDPROC(common_interrupt)
CFI_ENDPROC

-/*
- * Irq entries should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
#define BUILD_INTERRUPT3(name, nr, fn) \
ENTRY(name) \
RING0_INT_FRAME; \
@@ -961,10 +941,6 @@ ENTRY(spurious_interrupt_bug)
jmp error_code
CFI_ENDPROC
END(spurious_interrupt_bug)
-/*
- * End of kprobes section
- */
- .popsection

#ifdef CONFIG_XEN
/* Xen doesn't set %esp to be precisely what the normal sysenter
@@ -1239,11 +1215,6 @@ return_to_handler:
jmp *%ecx
#endif

-/*
- * Some functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-
#ifdef CONFIG_TRACING
ENTRY(trace_page_fault)
RING0_EC_FRAME
@@ -1453,7 +1424,3 @@ ENTRY(async_page_fault)
END(async_page_fault)
#endif

-/*
- * End of kprobes section
- */
- .popsection
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1e96c36..43bb389 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -487,8 +487,6 @@ ENDPROC(native_usergs_sysret64)
TRACE_IRQS_OFF
.endm

-/* save complete stack frame */
- .pushsection .kprobes.text, "ax"
ENTRY(save_paranoid)
XCPT_FRAME 1 RDI+8
cld
@@ -517,7 +515,6 @@ ENTRY(save_paranoid)
1: ret
CFI_ENDPROC
END(save_paranoid)
- .popsection

/*
* A newly forked process directly context switches into this address.
@@ -975,10 +972,6 @@ END(interrupt)
call \func
.endm

-/*
- * Interrupt entry/exit should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
/*
* The interrupt stubs push (~vector+0x80) onto the stack and
* then jump to common_interrupt.
@@ -1113,10 +1106,6 @@ ENTRY(retint_kernel)

CFI_ENDPROC
END(common_interrupt)
-/*
- * End of kprobes section
- */
- .popsection

/*
* APIC interrupts.
@@ -1477,11 +1466,6 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
hyperv_callback_vector hyperv_vector_handler
#endif /* CONFIG_HYPERV */

-/*
- * Some functions should be protected against kprobes
- */
- .pushsection .kprobes.text, "ax"
-
paranoidzeroentry_ist debug do_debug DEBUG_STACK
paranoidzeroentry_ist int3 do_int3 DEBUG_STACK
paranoiderrorentry stack_segment do_stack_segment
@@ -1898,7 +1882,3 @@ ENTRY(ignore_sysret)
CFI_ENDPROC
END(ignore_sysret)

-/*
- * End of kprobes section
- */
- .popsection
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index da7bdaa..7751b3d 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1065,6 +1065,14 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
return 0;
}

+bool arch_within_kprobe_blacklist(unsigned long addr)
+{
+ return (addr >= (unsigned long)__kprobes_text_start &&
+ addr < (unsigned long)__kprobes_text_end) ||
+ (addr >= (unsigned long)__entry_text_start &&
+ addr < (unsigned long)__entry_text_end);
+}
+
int __init arch_init_kprobes(void)
{
return 0;
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 925eaf2..cdf9251 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -265,6 +265,7 @@ extern void arch_disarm_kprobe(struct kprobe *p);
extern int arch_init_kprobes(void);
extern void show_registers(struct pt_regs *regs);
extern void kprobes_inc_nmissed_count(struct kprobe *p);
+extern bool arch_within_kprobe_blacklist(unsigned long addr);

struct kprobe_insn_cache {
struct mutex mutex;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index ceeadfc..5b5ac76 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -96,9 +96,6 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
static struct kprobe_blackpoint kprobe_blacklist[] = {
{"preempt_schedule",},
{"native_get_debugreg",},
- {"irq_entries_start",},
- {"common_interrupt",},
- {"mcount",}, /* mcount can be called from everywhere */
{NULL} /* Terminator */
};

@@ -1324,12 +1321,18 @@ out:
return ret;
}

+bool __weak arch_within_kprobe_blacklist(unsigned long addr)
+{
+ /* The __kprobes marked functions and entry code must not be probed */
+ return addr >= (unsigned long)__kprobes_text_start &&
+ addr < (unsigned long)__kprobes_text_end;
+}
+
static int __kprobes in_kprobes_functions(unsigned long addr)
{
struct kprobe_blackpoint *kb;

- if (addr >= (unsigned long)__kprobes_text_start &&
- addr < (unsigned long)__kprobes_text_end)
+ if (arch_within_kprobe_blacklist(addr))
return -EINVAL;
/*
* If there exists a kprobe_blacklist, verify and

Subject: [tip:perf/kprobes] kprobes, x86: Prohibit probing on native_set_debugreg()/load_idt()

Commit-ID: 8027197220e02d5cebbbfdff36c2827661fbc692
Gitweb: http://git.kernel.org/tip/8027197220e02d5cebbbfdff36c2827661fbc692
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:19 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:58 +0200

kprobes, x86: Prohibit probing on native_set_debugreg()/load_idt()

Since kprobes uses do_debug for single stepping,
functions called from do_debug() before notify_die() must not
be probed.

native_load_idt() is also called from paranoid_exit when
returning from int3, so it must not be probed either.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Alok Kataria <[email protected]>
Cc: Chris Wright <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/paravirt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index e136869..548d25f 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -390,8 +390,10 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
};

-/* At this point, native_get_debugreg has a real function entry */
+/* At this point, native_get/set_debugreg has real function entries */
NOKPROBE_SYMBOL(native_get_debugreg);
+NOKPROBE_SYMBOL(native_set_debugreg);
+NOKPROBE_SYMBOL(native_load_idt);

struct pv_apic_ops pv_apic_ops = {
#ifdef CONFIG_X86_LOCAL_APIC

Subject: [tip:perf/kprobes] kprobes, x86: Call exception_enter after kprobes handled

Commit-ID: ecd50f714c421c759354632dd00f70c718c95b10
Gitweb: http://git.kernel.org/tip/ecd50f714c421c759354632dd00f70c718c95b10
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:40 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:03:00 +0200

kprobes, x86: Call exception_enter after kprobes handled

Move the exception_enter() call to after the kprobes handler
is done. Since exception_enter() involves many other
functions (like printk), it can cause a recursive
int3/breakpoint loop when kprobes probe such functions.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/traps.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e5d4a70..ba9abe9 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -327,7 +327,6 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (poke_int3_handler(regs))
return;

- prev_state = exception_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -338,6 +337,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (kprobe_int3_handler(regs))
return;
#endif
+ prev_state = exception_enter();

if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -415,8 +415,6 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;

- prev_state = exception_enter();
-
get_debugreg(dr6, 6);

/* Filter out all the reserved bits which are preset to 1 */
@@ -449,6 +447,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
if (kprobe_debug_handler(regs))
goto exit;
#endif
+ prev_state = exception_enter();

if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)

Subject: [tip:perf/kprobes] kprobes/x86: Allow probe on some kprobe preparation functions

Commit-ID: 7ec8a97a990da8e3ba87175a757731e17f74072e
Gitweb: http://git.kernel.org/tip/7ec8a97a990da8e3ba87175a757731e17f74072e
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:47 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:03:01 +0200

kprobes/x86: Allow probe on some kprobe preparation functions

There is no need to prohibit probing on the functions
used in the preparation phase. They can safely be probed
because they are not invoked from the breakpoint/fault/debug
handlers, so there is no chance of causing recursive exceptions.

Following functions are now removed from the kprobes blacklist:

can_boost
can_probe
can_optimize
is_IF_modifier
__copy_instruction
copy_optimized_instructions
arch_copy_kprobe
arch_prepare_kprobe
arch_arm_kprobe
arch_disarm_kprobe
arch_remove_kprobe
arch_trampoline_kprobe
arch_prepare_kprobe_ftrace
arch_prepare_optimized_kprobe
arch_check_optimized_kprobe
arch_within_optimized_kprobe
__arch_remove_optimized_kprobe
arch_remove_optimized_kprobe
arch_optimize_kprobes
arch_unoptimize_kprobe

I tested those functions by putting kprobes on all
instructions in the functions with the bash script
I sent to LKML. See:

https://lkml.org/lkml/2014/3/27/33

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Jonathan Lebon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 20 ++++++++++----------
arch/x86/kernel/kprobes/ftrace.c | 2 +-
arch/x86/kernel/kprobes/opt.c | 24 ++++++++++++------------
3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 9b80aec..bd71713 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -159,7 +159,7 @@ static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
* Returns non-zero if opcode is boostable.
* RIP relative instructions are adjusted at copying time in 64 bits mode
*/
-int __kprobes can_boost(kprobe_opcode_t *opcodes)
+int can_boost(kprobe_opcode_t *opcodes)
{
kprobe_opcode_t opcode;
kprobe_opcode_t *orig_opcodes = opcodes;
@@ -260,7 +260,7 @@ unsigned long recover_probed_instruction(kprobe_opcode_t *buf, unsigned long add
}

/* Check if paddr is at an instruction boundary */
-static int __kprobes can_probe(unsigned long paddr)
+static int can_probe(unsigned long paddr)
{
unsigned long addr, __addr, offset = 0;
struct insn insn;
@@ -299,7 +299,7 @@ static int __kprobes can_probe(unsigned long paddr)
/*
* Returns non-zero if opcode modifies the interrupt flag.
*/
-static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
+static int is_IF_modifier(kprobe_opcode_t *insn)
{
/* Skip prefixes */
insn = skip_prefixes(insn);
@@ -322,7 +322,7 @@ static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
* If not, return null.
* Only applicable to 64-bit x86.
*/
-int __kprobes __copy_instruction(u8 *dest, u8 *src)
+int __copy_instruction(u8 *dest, u8 *src)
{
struct insn insn;
kprobe_opcode_t buf[MAX_INSN_SIZE];
@@ -365,7 +365,7 @@ int __kprobes __copy_instruction(u8 *dest, u8 *src)
return insn.length;
}

-static int __kprobes arch_copy_kprobe(struct kprobe *p)
+static int arch_copy_kprobe(struct kprobe *p)
{
int ret;

@@ -392,7 +392,7 @@ static int __kprobes arch_copy_kprobe(struct kprobe *p)
return 0;
}

-int __kprobes arch_prepare_kprobe(struct kprobe *p)
+int arch_prepare_kprobe(struct kprobe *p)
{
if (alternatives_text_reserved(p->addr, p->addr))
return -EINVAL;
@@ -407,17 +407,17 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return arch_copy_kprobe(p);
}

-void __kprobes arch_arm_kprobe(struct kprobe *p)
+void arch_arm_kprobe(struct kprobe *p)
{
text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
}

-void __kprobes arch_disarm_kprobe(struct kprobe *p)
+void arch_disarm_kprobe(struct kprobe *p)
{
text_poke(p->addr, &p->opcode, 1);
}

-void __kprobes arch_remove_kprobe(struct kprobe *p)
+void arch_remove_kprobe(struct kprobe *p)
{
if (p->ainsn.insn) {
free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
@@ -1060,7 +1060,7 @@ int __init arch_init_kprobes(void)
return 0;
}

-int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+int arch_trampoline_kprobe(struct kprobe *p)
{
return 0;
}
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 23ef5c5..dcaa131 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -85,7 +85,7 @@ end:
local_irq_restore(flags);
}

-int __kprobes arch_prepare_kprobe_ftrace(struct kprobe *p)
+int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
p->ainsn.insn = NULL;
p->ainsn.boostable = -1;
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 898160b..fba7fb0 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -77,7 +77,7 @@ found:
}

/* Insert a move instruction which sets a pointer to eax/rdi (1st arg). */
-static void __kprobes synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
+static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
{
#ifdef CONFIG_X86_64
*addr++ = 0x48;
@@ -169,7 +169,7 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_
local_irq_restore(flags);
}

-static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
+static int copy_optimized_instructions(u8 *dest, u8 *src)
{
int len = 0, ret;

@@ -189,7 +189,7 @@ static int __kprobes copy_optimized_instructions(u8 *dest, u8 *src)
}

/* Check whether insn is indirect jump */
-static int __kprobes insn_is_indirect_jump(struct insn *insn)
+static int insn_is_indirect_jump(struct insn *insn)
{
return ((insn->opcode.bytes[0] == 0xff &&
(X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
@@ -224,7 +224,7 @@ static int insn_jump_into_range(struct insn *insn, unsigned long start, int len)
}

/* Decode whole function to ensure any instructions don't jump into target */
-static int __kprobes can_optimize(unsigned long paddr)
+static int can_optimize(unsigned long paddr)
{
unsigned long addr, size = 0, offset = 0;
struct insn insn;
@@ -275,7 +275,7 @@ static int __kprobes can_optimize(unsigned long paddr)
}

/* Check optimized_kprobe can actually be optimized. */
-int __kprobes arch_check_optimized_kprobe(struct optimized_kprobe *op)
+int arch_check_optimized_kprobe(struct optimized_kprobe *op)
{
int i;
struct kprobe *p;
@@ -290,15 +290,15 @@ int __kprobes arch_check_optimized_kprobe(struct optimized_kprobe *op)
}

/* Check the addr is within the optimized instructions. */
-int __kprobes
-arch_within_optimized_kprobe(struct optimized_kprobe *op, unsigned long addr)
+int arch_within_optimized_kprobe(struct optimized_kprobe *op,
+ unsigned long addr)
{
return ((unsigned long)op->kp.addr <= addr &&
(unsigned long)op->kp.addr + op->optinsn.size > addr);
}

/* Free optimized instruction slot */
-static __kprobes
+static
void __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
{
if (op->optinsn.insn) {
@@ -308,7 +308,7 @@ void __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
}
}

-void __kprobes arch_remove_optimized_kprobe(struct optimized_kprobe *op)
+void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
{
__arch_remove_optimized_kprobe(op, 1);
}
@@ -318,7 +318,7 @@ void __kprobes arch_remove_optimized_kprobe(struct optimized_kprobe *op)
* Target instructions MUST be relocatable (checked inside)
* This is called when new aggr(opt)probe is allocated or reused.
*/
-int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
+int arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
{
u8 *buf;
int ret;
@@ -372,7 +372,7 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
* Replace breakpoints (int3) with relative jumps.
* Caller must call with locking kprobe_mutex and text_mutex.
*/
-void __kprobes arch_optimize_kprobes(struct list_head *oplist)
+void arch_optimize_kprobes(struct list_head *oplist)
{
struct optimized_kprobe *op, *tmp;
u8 insn_buf[RELATIVEJUMP_SIZE];
@@ -398,7 +398,7 @@ void __kprobes arch_optimize_kprobes(struct list_head *oplist)
}

/* Replace a relative jump with a breakpoint (int3). */
-void __kprobes arch_unoptimize_kprobe(struct optimized_kprobe *op)
+void arch_unoptimize_kprobe(struct optimized_kprobe *op)
{
u8 insn_buf[RELATIVEJUMP_SIZE];

Subject: [tip:perf/kprobes] kprobes: Allow probe on some kprobe functions

Commit-ID: 55479f64756fc508182a05e35e52f01395a50d4d
Gitweb: http://git.kernel.org/tip/55479f64756fc508182a05e35e52f01395a50d4d
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:54 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:03:01 +0200

kprobes: Allow probe on some kprobe functions

There is no need to prohibit probing on the functions
used for preparation, registration, optimization,
control etc. They can safely be probed because they are
not invoked from the breakpoint/fault/debug handlers,
so there is no chance of causing recursive exceptions.

Following functions are now removed from the kprobes blacklist:

add_new_kprobe
aggr_kprobe_disabled
alloc_aggr_kprobe
alloc_aggr_kprobe
arm_all_kprobes
__arm_kprobe
arm_kprobe
arm_kprobe_ftrace
check_kprobe_address_safe
collect_garbage_slots
collect_garbage_slots
collect_one_slot
debugfs_kprobe_init
__disable_kprobe
disable_kprobe
disarm_all_kprobes
__disarm_kprobe
disarm_kprobe
disarm_kprobe_ftrace
do_free_cleaned_kprobes
do_optimize_kprobes
do_unoptimize_kprobes
enable_kprobe
force_unoptimize_kprobe
free_aggr_kprobe
free_aggr_kprobe
__free_insn_slot
__get_insn_slot
get_optimized_kprobe
__get_valid_kprobe
init_aggr_kprobe
init_aggr_kprobe
in_nokprobe_functions
kick_kprobe_optimizer
kill_kprobe
kill_optimized_kprobe
kprobe_addr
kprobe_optimizer
kprobe_queued
kprobe_seq_next
kprobe_seq_start
kprobe_seq_stop
kprobes_module_callback
kprobes_open
optimize_all_kprobes
optimize_kprobe
prepare_kprobe
prepare_optimized_kprobe
register_aggr_kprobe
register_jprobe
register_jprobes
register_kprobe
register_kprobes
register_kretprobe
register_kretprobe
register_kretprobes
register_kretprobes
report_probe
show_kprobe_addr
try_to_optimize_kprobe
unoptimize_all_kprobes
unoptimize_kprobe
unregister_jprobe
unregister_jprobes
unregister_kprobe
__unregister_kprobe_bottom
unregister_kprobes
__unregister_kprobe_top
unregister_kretprobe
unregister_kretprobe
unregister_kretprobes
unregister_kretprobes
wait_for_kprobe_optimizer

I tested those functions by putting kprobes on all
instructions in the functions with the bash script
I sent to LKML. See:

https://lkml.org/lkml/2014/3/27/33

Signed-off-by: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/kprobes.c | 153 +++++++++++++++++++++++++++----------------------------
1 file changed, 76 insertions(+), 77 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5ffc687..4db2cc6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -138,13 +138,13 @@ struct kprobe_insn_cache kprobe_insn_slots = {
.insn_size = MAX_INSN_SIZE,
.nr_garbage = 0,
};
-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c);
+static int collect_garbage_slots(struct kprobe_insn_cache *c);

/**
* __get_insn_slot() - Find a slot on an executable page for an instruction.
* We allocate an executable page if there's no room on existing ones.
*/
-kprobe_opcode_t __kprobes *__get_insn_slot(struct kprobe_insn_cache *c)
+kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
{
struct kprobe_insn_page *kip;
kprobe_opcode_t *slot = NULL;
@@ -201,7 +201,7 @@ out:
}

/* Return 1 if all garbages are collected, otherwise 0. */
-static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
+static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
{
kip->slot_used[idx] = SLOT_CLEAN;
kip->nused--;
@@ -222,7 +222,7 @@ static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
return 0;
}

-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
+static int collect_garbage_slots(struct kprobe_insn_cache *c)
{
struct kprobe_insn_page *kip, *next;

@@ -244,8 +244,8 @@ static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
return 0;
}

-void __kprobes __free_insn_slot(struct kprobe_insn_cache *c,
- kprobe_opcode_t *slot, int dirty)
+void __free_insn_slot(struct kprobe_insn_cache *c,
+ kprobe_opcode_t *slot, int dirty)
{
struct kprobe_insn_page *kip;

@@ -361,7 +361,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
}

/* Free optimized instructions and optimized_kprobe */
-static __kprobes void free_aggr_kprobe(struct kprobe *p)
+static void free_aggr_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -399,7 +399,7 @@ static inline int kprobe_disarmed(struct kprobe *p)
}

/* Return true(!0) if the probe is queued on (un)optimizing lists */
-static int __kprobes kprobe_queued(struct kprobe *p)
+static int kprobe_queued(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -415,7 +415,7 @@ static int __kprobes kprobe_queued(struct kprobe *p)
* Return an optimized kprobe whose optimizing code replaces
* instructions including addr (exclude breakpoint).
*/
-static struct kprobe *__kprobes get_optimized_kprobe(unsigned long addr)
+static struct kprobe *get_optimized_kprobe(unsigned long addr)
{
int i;
struct kprobe *p = NULL;
@@ -447,7 +447,7 @@ static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
* Optimize (replace a breakpoint with a jump) kprobes listed on
* optimizing_list.
*/
-static __kprobes void do_optimize_kprobes(void)
+static void do_optimize_kprobes(void)
{
/* Optimization never be done when disarmed */
if (kprobes_all_disarmed || !kprobes_allow_optimization ||
@@ -475,7 +475,7 @@ static __kprobes void do_optimize_kprobes(void)
* Unoptimize (replace a jump with a breakpoint and remove the breakpoint
* if need) kprobes listed on unoptimizing_list.
*/
-static __kprobes void do_unoptimize_kprobes(void)
+static void do_unoptimize_kprobes(void)
{
struct optimized_kprobe *op, *tmp;

@@ -507,7 +507,7 @@ static __kprobes void do_unoptimize_kprobes(void)
}

/* Reclaim all kprobes on the free_list */
-static __kprobes void do_free_cleaned_kprobes(void)
+static void do_free_cleaned_kprobes(void)
{
struct optimized_kprobe *op, *tmp;

@@ -519,13 +519,13 @@ static __kprobes void do_free_cleaned_kprobes(void)
}

/* Start optimizer after OPTIMIZE_DELAY passed */
-static __kprobes void kick_kprobe_optimizer(void)
+static void kick_kprobe_optimizer(void)
{
schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
}

/* Kprobe jump optimizer */
-static __kprobes void kprobe_optimizer(struct work_struct *work)
+static void kprobe_optimizer(struct work_struct *work)
{
mutex_lock(&kprobe_mutex);
/* Lock modules while optimizing kprobes */
@@ -561,7 +561,7 @@ static __kprobes void kprobe_optimizer(struct work_struct *work)
}

/* Wait for completing optimization and unoptimization */
-static __kprobes void wait_for_kprobe_optimizer(void)
+static void wait_for_kprobe_optimizer(void)
{
mutex_lock(&kprobe_mutex);

@@ -580,7 +580,7 @@ static __kprobes void wait_for_kprobe_optimizer(void)
}

/* Optimize kprobe if p is ready to be optimized */
-static __kprobes void optimize_kprobe(struct kprobe *p)
+static void optimize_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -614,7 +614,7 @@ static __kprobes void optimize_kprobe(struct kprobe *p)
}

/* Short cut to direct unoptimizing */
-static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
+static void force_unoptimize_kprobe(struct optimized_kprobe *op)
{
get_online_cpus();
arch_unoptimize_kprobe(op);
@@ -624,7 +624,7 @@ static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
}

/* Unoptimize a kprobe if p is optimized */
-static __kprobes void unoptimize_kprobe(struct kprobe *p, bool force)
+static void unoptimize_kprobe(struct kprobe *p, bool force)
{
struct optimized_kprobe *op;

@@ -684,7 +684,7 @@ static void reuse_unused_kprobe(struct kprobe *ap)
}

/* Remove optimized instructions */
-static void __kprobes kill_optimized_kprobe(struct kprobe *p)
+static void kill_optimized_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -710,7 +710,7 @@ static void __kprobes kill_optimized_kprobe(struct kprobe *p)
}

/* Try to prepare optimized instructions */
-static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
+static void prepare_optimized_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -719,7 +719,7 @@ static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
}

/* Allocate new optimized_kprobe and try to prepare optimized instructions */
-static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
+static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
{
struct optimized_kprobe *op;

@@ -734,13 +734,13 @@ static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
return &op->kp;
}

-static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p);
+static void init_aggr_kprobe(struct kprobe *ap, struct kprobe *p);

/*
* Prepare an optimized_kprobe and optimize it
* NOTE: p must be a normal registered kprobe
*/
-static __kprobes void try_to_optimize_kprobe(struct kprobe *p)
+static void try_to_optimize_kprobe(struct kprobe *p)
{
struct kprobe *ap;
struct optimized_kprobe *op;
@@ -774,7 +774,7 @@ out:
}

#ifdef CONFIG_SYSCTL
-static void __kprobes optimize_all_kprobes(void)
+static void optimize_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -797,7 +797,7 @@ out:
mutex_unlock(&kprobe_mutex);
}

-static void __kprobes unoptimize_all_kprobes(void)
+static void unoptimize_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -848,7 +848,7 @@ int proc_kprobes_optimization_handler(struct ctl_table *table, int write,
#endif /* CONFIG_SYSCTL */

/* Put a breakpoint for a probe. Must be called with text_mutex locked */
-static void __kprobes __arm_kprobe(struct kprobe *p)
+static void __arm_kprobe(struct kprobe *p)
{
struct kprobe *_p;

@@ -863,7 +863,7 @@ static void __kprobes __arm_kprobe(struct kprobe *p)
}

/* Remove the breakpoint of a probe. Must be called with text_mutex locked */
-static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
+static void __disarm_kprobe(struct kprobe *p, bool reopt)
{
struct kprobe *_p;

@@ -898,13 +898,13 @@ static void reuse_unused_kprobe(struct kprobe *ap)
BUG_ON(kprobe_unused(ap));
}

-static __kprobes void free_aggr_kprobe(struct kprobe *p)
+static void free_aggr_kprobe(struct kprobe *p)
{
arch_remove_kprobe(p);
kfree(p);
}

-static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
+static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
{
return kzalloc(sizeof(struct kprobe), GFP_KERNEL);
}
@@ -918,7 +918,7 @@ static struct ftrace_ops kprobe_ftrace_ops __read_mostly = {
static int kprobe_ftrace_enabled;

/* Must ensure p->addr is really on ftrace */
-static int __kprobes prepare_kprobe(struct kprobe *p)
+static int prepare_kprobe(struct kprobe *p)
{
if (!kprobe_ftrace(p))
return arch_prepare_kprobe(p);
@@ -927,7 +927,7 @@ static int __kprobes prepare_kprobe(struct kprobe *p)
}

/* Caller must lock kprobe_mutex */
-static void __kprobes arm_kprobe_ftrace(struct kprobe *p)
+static void arm_kprobe_ftrace(struct kprobe *p)
{
int ret;

@@ -942,7 +942,7 @@ static void __kprobes arm_kprobe_ftrace(struct kprobe *p)
}

/* Caller must lock kprobe_mutex */
-static void __kprobes disarm_kprobe_ftrace(struct kprobe *p)
+static void disarm_kprobe_ftrace(struct kprobe *p)
{
int ret;

@@ -962,7 +962,7 @@ static void __kprobes disarm_kprobe_ftrace(struct kprobe *p)
#endif

/* Arm a kprobe with text_mutex */
-static void __kprobes arm_kprobe(struct kprobe *kp)
+static void arm_kprobe(struct kprobe *kp)
{
if (unlikely(kprobe_ftrace(kp))) {
arm_kprobe_ftrace(kp);
@@ -979,7 +979,7 @@ static void __kprobes arm_kprobe(struct kprobe *kp)
}

/* Disarm a kprobe with text_mutex */
-static void __kprobes disarm_kprobe(struct kprobe *kp, bool reopt)
+static void disarm_kprobe(struct kprobe *kp, bool reopt)
{
if (unlikely(kprobe_ftrace(kp))) {
disarm_kprobe_ftrace(kp);
@@ -1189,7 +1189,7 @@ static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
* Add the new probe to ap->list. Fail if this is the
* second jprobe at the address - two jprobes can't coexist
*/
-static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
+static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
{
BUG_ON(kprobe_gone(ap) || kprobe_gone(p));

@@ -1213,7 +1213,7 @@ static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
* Fill in the required fields of the "manager kprobe". Replace the
* earlier kprobe in the hlist with the manager kprobe
*/
-static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
+static void init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
{
/* Copy p's insn slot to ap */
copy_kprobe(p, ap);
@@ -1239,8 +1239,7 @@ static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
* This is the second or subsequent kprobe at the address - handle
* the intricacies
*/
-static int __kprobes register_aggr_kprobe(struct kprobe *orig_p,
- struct kprobe *p)
+static int register_aggr_kprobe(struct kprobe *orig_p, struct kprobe *p)
{
int ret = 0;
struct kprobe *ap = orig_p;
@@ -1318,7 +1317,7 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
addr < (unsigned long)__kprobes_text_end;
}

-static bool __kprobes within_kprobe_blacklist(unsigned long addr)
+static bool within_kprobe_blacklist(unsigned long addr)
{
struct kprobe_blacklist_entry *ent;

@@ -1342,7 +1341,7 @@ static bool __kprobes within_kprobe_blacklist(unsigned long addr)
* This returns encoded errors if it fails to look up symbol or invalid
* combination of parameters.
*/
-static kprobe_opcode_t __kprobes *kprobe_addr(struct kprobe *p)
+static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
{
kprobe_opcode_t *addr = p->addr;

@@ -1365,7 +1364,7 @@ invalid:
}

/* Check passed kprobe is valid and return kprobe in kprobe_table. */
-static struct kprobe * __kprobes __get_valid_kprobe(struct kprobe *p)
+static struct kprobe *__get_valid_kprobe(struct kprobe *p)
{
struct kprobe *ap, *list_p;

@@ -1397,8 +1396,8 @@ static inline int check_kprobe_rereg(struct kprobe *p)
return ret;
}

-static __kprobes int check_kprobe_address_safe(struct kprobe *p,
- struct module **probed_mod)
+static int check_kprobe_address_safe(struct kprobe *p,
+ struct module **probed_mod)
{
int ret = 0;
unsigned long ftrace_addr;
@@ -1460,7 +1459,7 @@ out:
return ret;
}

-int __kprobes register_kprobe(struct kprobe *p)
+int register_kprobe(struct kprobe *p)
{
int ret;
struct kprobe *old_p;
@@ -1522,7 +1521,7 @@ out:
EXPORT_SYMBOL_GPL(register_kprobe);

/* Check if all probes on the aggrprobe are disabled */
-static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
+static int aggr_kprobe_disabled(struct kprobe *ap)
{
struct kprobe *kp;

@@ -1538,7 +1537,7 @@ static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
}

/* Disable one kprobe: Make sure called under kprobe_mutex is locked */
-static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
+static struct kprobe *__disable_kprobe(struct kprobe *p)
{
struct kprobe *orig_p;

@@ -1565,7 +1564,7 @@ static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
/*
* Unregister a kprobe without a scheduler synchronization.
*/
-static int __kprobes __unregister_kprobe_top(struct kprobe *p)
+static int __unregister_kprobe_top(struct kprobe *p)
{
struct kprobe *ap, *list_p;

@@ -1622,7 +1621,7 @@ disarmed:
return 0;
}

-static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
+static void __unregister_kprobe_bottom(struct kprobe *p)
{
struct kprobe *ap;

@@ -1638,7 +1637,7 @@ static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
/* Otherwise, do nothing. */
}

-int __kprobes register_kprobes(struct kprobe **kps, int num)
+int register_kprobes(struct kprobe **kps, int num)
{
int i, ret = 0;

@@ -1656,13 +1655,13 @@ int __kprobes register_kprobes(struct kprobe **kps, int num)
}
EXPORT_SYMBOL_GPL(register_kprobes);

-void __kprobes unregister_kprobe(struct kprobe *p)
+void unregister_kprobe(struct kprobe *p)
{
unregister_kprobes(&p, 1);
}
EXPORT_SYMBOL_GPL(unregister_kprobe);

-void __kprobes unregister_kprobes(struct kprobe **kps, int num)
+void unregister_kprobes(struct kprobe **kps, int num)
{
int i;

@@ -1691,7 +1690,7 @@ unsigned long __weak arch_deref_entry_point(void *entry)
return (unsigned long)entry;
}

-int __kprobes register_jprobes(struct jprobe **jps, int num)
+int register_jprobes(struct jprobe **jps, int num)
{
struct jprobe *jp;
int ret = 0, i;
@@ -1722,19 +1721,19 @@ int __kprobes register_jprobes(struct jprobe **jps, int num)
}
EXPORT_SYMBOL_GPL(register_jprobes);

-int __kprobes register_jprobe(struct jprobe *jp)
+int register_jprobe(struct jprobe *jp)
{
return register_jprobes(&jp, 1);
}
EXPORT_SYMBOL_GPL(register_jprobe);

-void __kprobes unregister_jprobe(struct jprobe *jp)
+void unregister_jprobe(struct jprobe *jp)
{
unregister_jprobes(&jp, 1);
}
EXPORT_SYMBOL_GPL(unregister_jprobe);

-void __kprobes unregister_jprobes(struct jprobe **jps, int num)
+void unregister_jprobes(struct jprobe **jps, int num)
{
int i;

@@ -1799,7 +1798,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
return 0;
}

-int __kprobes register_kretprobe(struct kretprobe *rp)
+int register_kretprobe(struct kretprobe *rp)
{
int ret = 0;
struct kretprobe_instance *inst;
@@ -1852,7 +1851,7 @@ int __kprobes register_kretprobe(struct kretprobe *rp)
}
EXPORT_SYMBOL_GPL(register_kretprobe);

-int __kprobes register_kretprobes(struct kretprobe **rps, int num)
+int register_kretprobes(struct kretprobe **rps, int num)
{
int ret = 0, i;

@@ -1870,13 +1869,13 @@ int __kprobes register_kretprobes(struct kretprobe **rps, int num)
}
EXPORT_SYMBOL_GPL(register_kretprobes);

-void __kprobes unregister_kretprobe(struct kretprobe *rp)
+void unregister_kretprobe(struct kretprobe *rp)
{
unregister_kretprobes(&rp, 1);
}
EXPORT_SYMBOL_GPL(unregister_kretprobe);

-void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
+void unregister_kretprobes(struct kretprobe **rps, int num)
{
int i;

@@ -1899,24 +1898,24 @@ void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
EXPORT_SYMBOL_GPL(unregister_kretprobes);

#else /* CONFIG_KRETPROBES */
-int __kprobes register_kretprobe(struct kretprobe *rp)
+int register_kretprobe(struct kretprobe *rp)
{
return -ENOSYS;
}
EXPORT_SYMBOL_GPL(register_kretprobe);

-int __kprobes register_kretprobes(struct kretprobe **rps, int num)
+int register_kretprobes(struct kretprobe **rps, int num)
{
return -ENOSYS;
}
EXPORT_SYMBOL_GPL(register_kretprobes);

-void __kprobes unregister_kretprobe(struct kretprobe *rp)
+void unregister_kretprobe(struct kretprobe *rp)
{
}
EXPORT_SYMBOL_GPL(unregister_kretprobe);

-void __kprobes unregister_kretprobes(struct kretprobe **rps, int num)
+void unregister_kretprobes(struct kretprobe **rps, int num)
{
}
EXPORT_SYMBOL_GPL(unregister_kretprobes);
@@ -1930,7 +1929,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
#endif /* CONFIG_KRETPROBES */

/* Set the kprobe gone and remove its instruction buffer. */
-static void __kprobes kill_kprobe(struct kprobe *p)
+static void kill_kprobe(struct kprobe *p)
{
struct kprobe *kp;

@@ -1954,7 +1953,7 @@ static void __kprobes kill_kprobe(struct kprobe *p)
}

/* Disable one kprobe */
-int __kprobes disable_kprobe(struct kprobe *kp)
+int disable_kprobe(struct kprobe *kp)
{
int ret = 0;

@@ -1970,7 +1969,7 @@ int __kprobes disable_kprobe(struct kprobe *kp)
EXPORT_SYMBOL_GPL(disable_kprobe);

/* Enable one kprobe */
-int __kprobes enable_kprobe(struct kprobe *kp)
+int enable_kprobe(struct kprobe *kp)
{
int ret = 0;
struct kprobe *p;
@@ -2043,8 +2042,8 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
}

/* Module notifier call back, checking kprobes on the module */
-static int __kprobes kprobes_module_callback(struct notifier_block *nb,
- unsigned long val, void *data)
+static int kprobes_module_callback(struct notifier_block *nb,
+ unsigned long val, void *data)
{
struct module *mod = data;
struct hlist_head *head;
@@ -2145,7 +2144,7 @@ static int __init init_kprobes(void)
}

#ifdef CONFIG_DEBUG_FS
-static void __kprobes report_probe(struct seq_file *pi, struct kprobe *p,
+static void report_probe(struct seq_file *pi, struct kprobe *p,
const char *sym, int offset, char *modname, struct kprobe *pp)
{
char *kprobe_type;
@@ -2174,12 +2173,12 @@ static void __kprobes report_probe(struct seq_file *pi, struct kprobe *p,
(kprobe_ftrace(pp) ? "[FTRACE]" : ""));
}

-static void __kprobes *kprobe_seq_start(struct seq_file *f, loff_t *pos)
+static void *kprobe_seq_start(struct seq_file *f, loff_t *pos)
{
return (*pos < KPROBE_TABLE_SIZE) ? pos : NULL;
}

-static void __kprobes *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
+static void *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
{
(*pos)++;
if (*pos >= KPROBE_TABLE_SIZE)
@@ -2187,12 +2186,12 @@ static void __kprobes *kprobe_seq_next(struct seq_file *f, void *v, loff_t *pos)
return pos;
}

-static void __kprobes kprobe_seq_stop(struct seq_file *f, void *v)
+static void kprobe_seq_stop(struct seq_file *f, void *v)
{
/* Nothing to do */
}

-static int __kprobes show_kprobe_addr(struct seq_file *pi, void *v)
+static int show_kprobe_addr(struct seq_file *pi, void *v)
{
struct hlist_head *head;
struct kprobe *p, *kp;
@@ -2223,7 +2222,7 @@ static const struct seq_operations kprobes_seq_ops = {
.show = show_kprobe_addr
};

-static int __kprobes kprobes_open(struct inode *inode, struct file *filp)
+static int kprobes_open(struct inode *inode, struct file *filp)
{
return seq_open(filp, &kprobes_seq_ops);
}
@@ -2235,7 +2234,7 @@ static const struct file_operations debugfs_kprobes_operations = {
.release = seq_release,
};

-static void __kprobes arm_all_kprobes(void)
+static void arm_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -2263,7 +2262,7 @@ already_enabled:
return;
}

-static void __kprobes disarm_all_kprobes(void)
+static void disarm_all_kprobes(void)
{
struct hlist_head *head;
struct kprobe *p;
@@ -2347,7 +2346,7 @@ static const struct file_operations fops_kp = {
.llseek = default_llseek,
};

-static int __kprobes debugfs_kprobe_init(void)
+static int __init debugfs_kprobe_init(void)
{
struct dentry *dir, *file;
unsigned int value = 1;

Subject: [tip:perf/kprobes] kprobes, ftrace: Allow probing on some functions

Commit-ID: fbc1963d2c1c4eb4651132a2c5c9d6111ada17d3
Gitweb: http://git.kernel.org/tip/fbc1963d2c1c4eb4651132a2c5c9d6111ada17d3
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:00 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:03:02 +0200

kprobes, ftrace: Allow probing on some functions

There is no need to prohibit probing on the functions used only for
preparation and on the uprobe-only fetch functions. They can be probed
safely because they are never invoked from kprobe's
breakpoint/fault/debug handlers, so there is no chance of causing
recursive exceptions.

The following functions are now removed from the kprobes blacklist:

update_bitfield_fetch_param
free_bitfield_fetch_param
kprobe_register
FETCH_FUNC_NAME(stack, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string_size) in trace_uprobe.c
FETCH_FUNC_NAME(file_offset, type) in trace_uprobe.c

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/trace/trace_kprobe.c | 5 ++---
kernel/trace/trace_probe.c | 4 ++--
kernel/trace/trace_uprobe.c | 20 ++++++++++----------
3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 903ae28..aa5f0bf 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1196,9 +1196,8 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
* kprobe_trace_self_tests_init() does enable_trace_probe/disable_trace_probe
* lockless, but we can't race with this __init function.
*/
-static __kprobes
-int kprobe_register(struct ftrace_event_call *event,
- enum trace_reg type, void *data)
+static int kprobe_register(struct ftrace_event_call *event,
+ enum trace_reg type, void *data)
{
struct trace_kprobe *tk = (struct trace_kprobe *)event->data;
struct ftrace_event_file *file = data;
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 8364a42..d3a91e4 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -183,7 +183,7 @@ DEFINE_BASIC_FETCH_FUNCS(bitfield)
#define fetch_bitfield_string NULL
#define fetch_bitfield_string_size NULL

-static __kprobes void
+static void
update_bitfield_fetch_param(struct bitfield_fetch_param *data)
{
/*
@@ -196,7 +196,7 @@ update_bitfield_fetch_param(struct bitfield_fetch_param *data)
update_symbol_cache(data->orig.data);
}

-static __kprobes void
+static void
free_bitfield_fetch_param(struct bitfield_fetch_param *data)
{
/*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c082a74..991e3b7 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -108,8 +108,8 @@ static unsigned long get_user_stack_nth(struct pt_regs *regs, unsigned int n)
* Uprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
+ void *offset, void *dest) \
{ \
*(type *)dest = (type)get_user_stack_nth(regs, \
((unsigned long)offset)); \
@@ -120,8 +120,8 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
#define fetch_stack_string_size NULL

#define DEFINE_FETCH_memory(type) \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
- void *addr, void *dest) \
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
+ void *addr, void *dest) \
{ \
type retval; \
void __user *vaddr = (void __force __user *) addr; \
@@ -136,8 +136,8 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
{
long ret;
u32 rloc = *(u32 *)dest;
@@ -158,8 +158,8 @@ static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
}
}

-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
{
int len;
void __user *vaddr = (void __force __user *) addr;
@@ -184,8 +184,8 @@ static unsigned long translate_user_vaddr(void *file_offset)
}

#define DEFINE_FETCH_file_offset(type) \
-static __kprobes void FETCH_FUNC_NAME(file_offset, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+static void FETCH_FUNC_NAME(file_offset, type)(struct pt_regs *regs, \
+ void *offset, void *dest)\
{ \
void *vaddr = (void *)translate_user_vaddr(offset); \
\

Subject: [tip:perf/kprobes] kprobes, x86: Allow kprobes on text_poke/hw_breakpoint

Commit-ID: 9c54b6164eeb292a0eac86c6913bd8daaff35e62
Gitweb: http://git.kernel.org/tip/9c54b6164eeb292a0eac86c6913bd8daaff35e62
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:07 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:03:02 +0200

kprobes, x86: Allow kprobes on text_poke/hw_breakpoint

Allow kprobes on text_poke/hw_breakpoint, because they are
not involved in the critical int3/debug recursive path of
kprobes at this moment.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Fengguang Wu <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/alternative.c | 3 +--
arch/x86/kernel/hw_breakpoint.c | 5 ++---
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index df94598..703130f 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -5,7 +5,6 @@
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/stringify.h>
-#include <linux/kprobes.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <linux/memory.h>
@@ -551,7 +550,7 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
*
* Note: Must be called under text_mutex.
*/
-void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+void *text_poke(void *addr, const void *opcode, size_t len)
{
unsigned long flags;
char *vaddr;
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index a67b47c..5f9cf20 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,7 +32,6 @@
#include <linux/irqflags.h>
#include <linux/notifier.h>
#include <linux/kallsyms.h>
-#include <linux/kprobes.h>
#include <linux/percpu.h>
#include <linux/kdebug.h>
#include <linux/kernel.h>
@@ -424,7 +423,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
* NOTIFY_STOP returned for all other cases
*
*/
-static int __kprobes hw_breakpoint_handler(struct die_args *args)
+static int hw_breakpoint_handler(struct die_args *args)
{
int i, cpu, rc = NOTIFY_STOP;
struct perf_event *bp;
@@ -511,7 +510,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
/*
* Handle debug exception notifications.
*/
-int __kprobes hw_breakpoint_exceptions_notify(
+int hw_breakpoint_exceptions_notify(
struct notifier_block *unused, unsigned long val, void *data)
{
if (val != DIE_DEBUG)

Subject: [tip:perf/kprobes] kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes

Commit-ID: 820aede0209a51549e8a014c8030e29229920e4e
Gitweb: http://git.kernel.org/tip/820aede0209a51549e8a014c8030e29229920e4e
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:21 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:38 +0200

kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes

Use the NOKPROBE_SYMBOL() macro to protect functions from
kprobes instead of the __kprobes annotation.
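
For reference, a minimal sketch of the conversion pattern applied
throughout this patch (my_pre_handler is a made-up name, not a
function touched here):
----
#include <linux/kprobes.h>

/* Previously: static int __kprobes my_pre_handler(...) { ... } */
static int my_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
	/* handler body; must never be probed itself */
	return 0;
}
/* Records the address in the _kprobe_blacklist section instead of
 * moving the function into the .kprobes.text section. */
NOKPROBE_SYMBOL(my_pre_handler);
----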

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: David S. Miller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/kprobes.c | 67 ++++++++++++++++++++++++++++++++++----------------------
1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4db2cc6..a21b4e6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -301,7 +301,7 @@ static inline void reset_kprobe_instance(void)
* OR
* - with preemption disabled - from arch/xxx/kernel/kprobes.c
*/
-struct kprobe __kprobes *get_kprobe(void *addr)
+struct kprobe *get_kprobe(void *addr)
{
struct hlist_head *head;
struct kprobe *p;
@@ -314,8 +314,9 @@ struct kprobe __kprobes *get_kprobe(void *addr)

return NULL;
}
+NOKPROBE_SYMBOL(get_kprobe);

-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);

/* Return true if the kprobe is an aggregator */
static inline int kprobe_aggrprobe(struct kprobe *p)
@@ -347,7 +348,7 @@ static bool kprobes_allow_optimization;
* Call all pre_handler on the list, but ignores its return value.
* This must be called from arch-dep optimized caller.
*/
-void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
+void opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *kp;

@@ -359,6 +360,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
}
}
+NOKPROBE_SYMBOL(opt_pre_handler);

/* Free optimized instructions and optimized_kprobe */
static void free_aggr_kprobe(struct kprobe *p)
@@ -995,7 +997,7 @@ static void disarm_kprobe(struct kprobe *kp, bool reopt)
* Aggregate handlers for multiple kprobes support - these handlers
* take care of invoking the individual kprobe handlers on p->list
*/
-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *kp;

@@ -1009,9 +1011,10 @@ static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
}
+NOKPROBE_SYMBOL(aggr_pre_handler);

-static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
- unsigned long flags)
+static void aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
{
struct kprobe *kp;

@@ -1023,9 +1026,10 @@ static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
}
}
}
+NOKPROBE_SYMBOL(aggr_post_handler);

-static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
- int trapnr)
+static int aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
+ int trapnr)
{
struct kprobe *cur = __this_cpu_read(kprobe_instance);

@@ -1039,8 +1043,9 @@ static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
}
return 0;
}
+NOKPROBE_SYMBOL(aggr_fault_handler);

-static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe *cur = __this_cpu_read(kprobe_instance);
int ret = 0;
@@ -1052,9 +1057,10 @@ static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
return ret;
}
+NOKPROBE_SYMBOL(aggr_break_handler);

/* Walks the list and increments nmissed count for multiprobe case */
-void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
+void kprobes_inc_nmissed_count(struct kprobe *p)
{
struct kprobe *kp;
if (!kprobe_aggrprobe(p)) {
@@ -1065,9 +1071,10 @@ void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
}
return;
}
+NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);

-void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
- struct hlist_head *head)
+void recycle_rp_inst(struct kretprobe_instance *ri,
+ struct hlist_head *head)
{
struct kretprobe *rp = ri->rp;

@@ -1082,8 +1089,9 @@ void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
/* Unregistering */
hlist_add_head(&ri->hlist, head);
}
+NOKPROBE_SYMBOL(recycle_rp_inst);

-void __kprobes kretprobe_hash_lock(struct task_struct *tsk,
+void kretprobe_hash_lock(struct task_struct *tsk,
struct hlist_head **head, unsigned long *flags)
__acquires(hlist_lock)
{
@@ -1094,17 +1102,19 @@ __acquires(hlist_lock)
hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_lock_irqsave(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_hash_lock);

-static void __kprobes kretprobe_table_lock(unsigned long hash,
- unsigned long *flags)
+static void kretprobe_table_lock(unsigned long hash,
+ unsigned long *flags)
__acquires(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_lock_irqsave(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_table_lock);

-void __kprobes kretprobe_hash_unlock(struct task_struct *tsk,
- unsigned long *flags)
+void kretprobe_hash_unlock(struct task_struct *tsk,
+ unsigned long *flags)
__releases(hlist_lock)
{
unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
@@ -1113,14 +1123,16 @@ __releases(hlist_lock)
hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_unlock_irqrestore(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_hash_unlock);

-static void __kprobes kretprobe_table_unlock(unsigned long hash,
- unsigned long *flags)
+static void kretprobe_table_unlock(unsigned long hash,
+ unsigned long *flags)
__releases(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_unlock_irqrestore(hlist_lock, *flags);
}
+NOKPROBE_SYMBOL(kretprobe_table_unlock);

/*
* This function is called from finish_task_switch when task tk becomes dead,
@@ -1128,7 +1140,7 @@ __releases(hlist_lock)
* with this task. These left over instances represent probed functions
* that have been called but will never return.
*/
-void __kprobes kprobe_flush_task(struct task_struct *tk)
+void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
struct hlist_head *head, empty_rp;
@@ -1153,6 +1165,7 @@ void __kprobes kprobe_flush_task(struct task_struct *tk)
kfree(ri);
}
}
+NOKPROBE_SYMBOL(kprobe_flush_task);

static inline void free_rp_inst(struct kretprobe *rp)
{
@@ -1165,7 +1178,7 @@ static inline void free_rp_inst(struct kretprobe *rp)
}
}

-static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
+static void cleanup_rp_inst(struct kretprobe *rp)
{
unsigned long flags, hash;
struct kretprobe_instance *ri;
@@ -1184,6 +1197,7 @@ static void __kprobes cleanup_rp_inst(struct kretprobe *rp)
}
free_rp_inst(rp);
}
+NOKPROBE_SYMBOL(cleanup_rp_inst);

/*
* Add the new probe to ap->list. Fail if this is the
@@ -1758,8 +1772,7 @@ EXPORT_SYMBOL_GPL(unregister_jprobes);
* This kprobe pre_handler is registered with every kretprobe. When probe
* hits it will set up the return probe.
*/
-static int __kprobes pre_handler_kretprobe(struct kprobe *p,
- struct pt_regs *regs)
+static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe *rp = container_of(p, struct kretprobe, kp);
unsigned long hash, flags = 0;
@@ -1797,6 +1810,7 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
}
return 0;
}
+NOKPROBE_SYMBOL(pre_handler_kretprobe);

int register_kretprobe(struct kretprobe *rp)
{
@@ -1920,11 +1934,11 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
}
EXPORT_SYMBOL_GPL(unregister_kretprobes);

-static int __kprobes pre_handler_kretprobe(struct kprobe *p,
- struct pt_regs *regs)
+static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
return 0;
}
+NOKPROBE_SYMBOL(pre_handler_kretprobe);

#endif /* CONFIG_KRETPROBES */

@@ -2002,12 +2016,13 @@ out:
}
EXPORT_SYMBOL_GPL(enable_kprobe);

-void __kprobes dump_kprobe(struct kprobe *kp)
+void dump_kprobe(struct kprobe *kp)
{
printk(KERN_WARNING "Dumping kprobe:\n");
printk(KERN_WARNING "Name: %s\nAddress: %p\nOffset: %x\n",
kp->symbol_name, kp->addr, kp->offset);
}
+NOKPROBE_SYMBOL(dump_kprobe);

/*
* Lookup and populate the kprobe_blacklist.

Subject: [tip:perf/kprobes] kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier

Commit-ID: b40a2cb6e0c218c6d0f12c7f7e683e75972973c1
Gitweb: http://git.kernel.org/tip/b40a2cb6e0c218c6d0f12c7f7e683e75972973c1
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:35 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:39 +0200

kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier

Use the NOKPROBE_SYMBOL() macro instead of the __kprobes
annotation to protect notifier functions from kprobes.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/notifier.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index db4c8b0..4803da6 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -71,9 +71,9 @@ static int notifier_chain_unregister(struct notifier_block **nl,
* @returns: notifier_call_chain returns the value returned by the
* last notifier function called.
*/
-static int __kprobes notifier_call_chain(struct notifier_block **nl,
- unsigned long val, void *v,
- int nr_to_call, int *nr_calls)
+static int notifier_call_chain(struct notifier_block **nl,
+ unsigned long val, void *v,
+ int nr_to_call, int *nr_calls)
{
int ret = NOTIFY_DONE;
struct notifier_block *nb, *next_nb;
@@ -102,6 +102,7 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
}
return ret;
}
+NOKPROBE_SYMBOL(notifier_call_chain);

/*
* Atomic notifier chain routines. Registration and unregistration
@@ -172,9 +173,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_unregister);
* Otherwise the return value is the return value
* of the last notifier function called.
*/
-int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
- unsigned long val, void *v,
- int nr_to_call, int *nr_calls)
+int __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+ unsigned long val, void *v,
+ int nr_to_call, int *nr_calls)
{
int ret;

@@ -184,13 +185,15 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
return ret;
}
EXPORT_SYMBOL_GPL(__atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(__atomic_notifier_call_chain);

-int __kprobes atomic_notifier_call_chain(struct atomic_notifier_head *nh,
- unsigned long val, void *v)
+int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+ unsigned long val, void *v)
{
return __atomic_notifier_call_chain(nh, val, v, -1, NULL);
}
EXPORT_SYMBOL_GPL(atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(atomic_notifier_call_chain);

/*
* Blocking notifier chain routines. All access to the chain is
@@ -527,7 +530,7 @@ EXPORT_SYMBOL_GPL(srcu_init_notifier_head);

static ATOMIC_NOTIFIER_HEAD(die_chain);

-int notrace __kprobes notify_die(enum die_val val, const char *str,
+int notrace notify_die(enum die_val val, const char *str,
struct pt_regs *regs, long err, int trap, int sig)
{
struct die_args args = {
@@ -540,6 +543,7 @@ int notrace __kprobes notify_die(enum die_val val, const char *str,
};
return atomic_notifier_call_chain(&die_chain, val, &args);
}
+NOKPROBE_SYMBOL(notify_die);

int register_die_notifier(struct notifier_block *nb)
{

Subject: [tip:perf/kprobes] kprobes: Show blacklist entries via debugfs

Commit-ID: 637247403abff8c963bc7be8002b3f49ea604563
Gitweb: http://git.kernel.org/tip/637247403abff8c963bc7be8002b3f49ea604563
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:49 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:41 +0200

kprobes: Show blacklist entries via debugfs

Show blacklist entries (function names with the address
range) via /sys/kernel/debug/kprobes/blacklist.

Note that at this point the blacklist only covers symbols in
vmlinux, not in modules, so the list is fixed and never
updated.
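
Once this is applied, the entries can be listed with, for example
(assuming debugfs is mounted at /sys/kernel/debug):
----
cat /sys/kernel/debug/kprobes/blacklist
----
Each line shows the blacklisted address range followed by the symbol
name, as printed by kprobe_blacklist_seq_show() below.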

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: David S. Miller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/kprobes.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a21b4e6..3214289 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2249,6 +2249,46 @@ static const struct file_operations debugfs_kprobes_operations = {
.release = seq_release,
};

+/* kprobes/blacklist -- shows which functions can not be probed */
+static void *kprobe_blacklist_seq_start(struct seq_file *m, loff_t *pos)
+{
+ return seq_list_start(&kprobe_blacklist, *pos);
+}
+
+static void *kprobe_blacklist_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ return seq_list_next(v, &kprobe_blacklist, pos);
+}
+
+static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
+{
+ struct kprobe_blacklist_entry *ent =
+ list_entry(v, struct kprobe_blacklist_entry, list);
+
+ seq_printf(m, "0x%p-0x%p\t%ps\n", (void *)ent->start_addr,
+ (void *)ent->end_addr, (void *)ent->start_addr);
+ return 0;
+}
+
+static const struct seq_operations kprobe_blacklist_seq_ops = {
+ .start = kprobe_blacklist_seq_start,
+ .next = kprobe_blacklist_seq_next,
+ .stop = kprobe_seq_stop, /* Reuse void function */
+ .show = kprobe_blacklist_seq_show,
+};
+
+static int kprobe_blacklist_open(struct inode *inode, struct file *filp)
+{
+ return seq_open(filp, &kprobe_blacklist_seq_ops);
+}
+
+static const struct file_operations debugfs_kprobe_blacklist_ops = {
+ .open = kprobe_blacklist_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
static void arm_all_kprobes(void)
{
struct hlist_head *head;
@@ -2372,19 +2412,24 @@ static int __init debugfs_kprobe_init(void)

file = debugfs_create_file("list", 0444, dir, NULL,
&debugfs_kprobes_operations);
- if (!file) {
- debugfs_remove(dir);
- return -ENOMEM;
- }
+ if (!file)
+ goto error;

file = debugfs_create_file("enabled", 0600, dir,
&value, &fops_kp);
- if (!file) {
- debugfs_remove(dir);
- return -ENOMEM;
- }
+ if (!file)
+ goto error;
+
+ file = debugfs_create_file("blacklist", 0444, dir, NULL,
+ &debugfs_kprobe_blacklist_ops);
+ if (!file)
+ goto error;

return 0;
+
+error:
+ debugfs_remove(dir);
+ return -ENOMEM;
}

late_initcall(debugfs_kprobe_init);

Subject: [tip:perf/kprobes] kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation

Commit-ID: 9326638cbee2d36b051ed2a69f3e4e107e5f86bd
Gitweb: http://git.kernel.org/tip/9326638cbee2d36b051ed2a69f3e4e107e5f86bd
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:14 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:38 +0200

kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation

Use the NOKPROBE_SYMBOL() macro to protect functions from
kprobes instead of the __kprobes annotation under
arch/x86.

This applies the nokprobe_inline annotation in some cases,
because NOKPROBE_SYMBOL() inhibits inlining by referencing
the symbol address.
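
As a rough illustration of the two patterns used here (the function
names below are made up, not taken from this patch):
----
#include <linux/kprobes.h>

/* Small static helper: nokprobe_inline forces it to be inlined into
 * its (nokprobe) callers, so no separate blacklist entry is needed. */
static nokprobe_inline void my_helper(struct pt_regs *regs)
{
	/* ... */
}

/* Out-of-line function: keep kprobes off it by recording its
 * address in the _kprobe_blacklist section. */
int my_exception_handler(struct pt_regs *regs)
{
	my_helper(regs);
	return 0;
}
NOKPROBE_SYMBOL(my_exception_handler);
----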

This just folds a bunch of previous NOKPROBE_SYMBOL()
cleanup patches for x86 into one patch.

Signed-off-by: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: Andrew Morton <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fernando Luis Vázquez Cao <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jonathan Lebon <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Raghavendra K T <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Cc: Srivatsa Vaddagiri <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vineet Gupta <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 +-
arch/x86/kernel/cpu/perf_event.c | 3 +-
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 3 +-
arch/x86/kernel/dumpstack.c | 9 ++--
arch/x86/kernel/kprobes/core.c | 77 ++++++++++++++++++++------------
arch/x86/kernel/kprobes/ftrace.c | 15 ++++---
arch/x86/kernel/kprobes/opt.c | 8 ++--
arch/x86/kernel/kvm.c | 4 +-
arch/x86/kernel/nmi.c | 18 +++++---
arch/x86/kernel/traps.c | 20 ++++++---
arch/x86/mm/fault.c | 29 +++++++-----
12 files changed, 122 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 58d66fe..ca32508 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -68,7 +68,7 @@ dotraplinkage void do_segment_not_present(struct pt_regs *, long);
dotraplinkage void do_stack_segment(struct pt_regs *, long);
#ifdef CONFIG_X86_64
dotraplinkage void do_double_fault(struct pt_regs *, long);
-asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *);
+asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
#endif
dotraplinkage void do_general_protection(struct pt_regs *, long);
dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index a698d71..73eb5b3 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -60,7 +60,7 @@ void arch_trigger_all_cpu_backtrace(void)
smp_mb__after_clear_bit();
}

-static int __kprobes
+static int
arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
int cpu;
@@ -80,6 +80,7 @@ arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)

return NMI_DONE;
}
+NOKPROBE_SYMBOL(arch_trigger_all_cpu_backtrace_handler);

static int __init register_trigger_all_cpu_backtrace(void)
{
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ae407f7..5fc8771 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1292,7 +1292,7 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}

-static int __kprobes
+static int
perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{
u64 start_clock;
@@ -1310,6 +1310,7 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)

return ret;
}
+NOKPROBE_SYMBOL(perf_event_nmi_handler);

struct event_constraint emptyconstraint;
struct event_constraint unconstrained;
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 4c36bbe..cbb1be3e 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -593,7 +593,7 @@ out:
return 1;
}

-static int __kprobes
+static int
perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{
int handled = 0;
@@ -606,6 +606,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)

return handled;
}
+NOKPROBE_SYMBOL(perf_ibs_nmi_handler);

static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
{
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index d9c12d3..b74ebc7 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -200,7 +200,7 @@ static arch_spinlock_t die_lock = __ARCH_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;

-unsigned __kprobes long oops_begin(void)
+unsigned long oops_begin(void)
{
int cpu;
unsigned long flags;
@@ -223,8 +223,9 @@ unsigned __kprobes long oops_begin(void)
return flags;
}
EXPORT_SYMBOL_GPL(oops_begin);
+NOKPROBE_SYMBOL(oops_begin);

-void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
+void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
if (regs && kexec_should_crash(current))
crash_kexec(regs);
@@ -247,8 +248,9 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
panic("Fatal exception");
do_exit(signr);
}
+NOKPROBE_SYMBOL(oops_end);

-int __kprobes __die(const char *str, struct pt_regs *regs, long err)
+int __die(const char *str, struct pt_regs *regs, long err)
{
#ifdef CONFIG_X86_32
unsigned short ss;
@@ -291,6 +293,7 @@ int __kprobes __die(const char *str, struct pt_regs *regs, long err)
#endif
return 0;
}
+NOKPROBE_SYMBOL(__die);

/*
* This is gone through when something in the kernel has done something bad
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index bd71713..7596df6 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -112,7 +112,8 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {

const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);

-static void __kprobes __synthesize_relative_insn(void *from, void *to, u8 op)
+static nokprobe_inline void
+__synthesize_relative_insn(void *from, void *to, u8 op)
{
struct __arch_relative_insn {
u8 op;
@@ -125,21 +126,23 @@ static void __kprobes __synthesize_relative_insn(void *from, void *to, u8 op)
}

/* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
-void __kprobes synthesize_reljump(void *from, void *to)
+void synthesize_reljump(void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVEJUMP_OPCODE);
}
+NOKPROBE_SYMBOL(synthesize_reljump);

/* Insert a call instruction at address 'from', which calls address 'to'.*/
-void __kprobes synthesize_relcall(void *from, void *to)
+void synthesize_relcall(void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVECALL_OPCODE);
}
+NOKPROBE_SYMBOL(synthesize_relcall);

/*
* Skip the prefixes of the instruction.
*/
-static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
+static kprobe_opcode_t *skip_prefixes(kprobe_opcode_t *insn)
{
insn_attr_t attr;

@@ -154,6 +157,7 @@ static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
#endif
return insn;
}
+NOKPROBE_SYMBOL(skip_prefixes);

/*
* Returns non-zero if opcode is boostable.
@@ -425,7 +429,8 @@ void arch_remove_kprobe(struct kprobe *p)
}
}

-static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+save_previous_kprobe(struct kprobe_ctlblk *kcb)
{
kcb->prev_kprobe.kp = kprobe_running();
kcb->prev_kprobe.status = kcb->kprobe_status;
@@ -433,7 +438,8 @@ static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
kcb->prev_kprobe.saved_flags = kcb->kprobe_saved_flags;
}

-static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+restore_previous_kprobe(struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
kcb->kprobe_status = kcb->prev_kprobe.status;
@@ -441,8 +447,9 @@ static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
kcb->kprobe_saved_flags = kcb->prev_kprobe.saved_flags;
}

-static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+static nokprobe_inline void
+set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
__this_cpu_write(current_kprobe, p);
kcb->kprobe_saved_flags = kcb->kprobe_old_flags
@@ -451,7 +458,7 @@ static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
kcb->kprobe_saved_flags &= ~X86_EFLAGS_IF;
}

-static void __kprobes clear_btf(void)
+static nokprobe_inline void clear_btf(void)
{
if (test_thread_flag(TIF_BLOCKSTEP)) {
unsigned long debugctl = get_debugctlmsr();
@@ -461,7 +468,7 @@ static void __kprobes clear_btf(void)
}
}

-static void __kprobes restore_btf(void)
+static nokprobe_inline void restore_btf(void)
{
if (test_thread_flag(TIF_BLOCKSTEP)) {
unsigned long debugctl = get_debugctlmsr();
@@ -471,8 +478,7 @@ static void __kprobes restore_btf(void)
}
}

-void __kprobes
-arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
+void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
{
unsigned long *sara = stack_addr(regs);

@@ -481,9 +487,10 @@ arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
}
+NOKPROBE_SYMBOL(arch_prepare_kretprobe);

-static void __kprobes
-setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb, int reenter)
+static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb, int reenter)
{
if (setup_detour_execution(p, regs, reenter))
return;
@@ -519,14 +526,15 @@ setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *k
else
regs->ip = (unsigned long)p->ainsn.insn;
}
+NOKPROBE_SYMBOL(setup_singlestep);

/*
* We have reentered the kprobe_handler(), since another probe was hit while
* within the handler. We save the original kprobes variables and just single
* step on the instruction of the new probe without calling any user handlers.
*/
-static int __kprobes
-reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+static int reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
switch (kcb->kprobe_status) {
case KPROBE_HIT_SSDONE:
@@ -554,12 +562,13 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb

return 1;
}
+NOKPROBE_SYMBOL(reenter_kprobe);

/*
* Interrupts are disabled on entry as trap3 is an interrupt gate and they
* remain disabled throughout this function.
*/
-int __kprobes kprobe_int3_handler(struct pt_regs *regs)
+int kprobe_int3_handler(struct pt_regs *regs)
{
kprobe_opcode_t *addr;
struct kprobe *p;
@@ -622,12 +631,13 @@ int __kprobes kprobe_int3_handler(struct pt_regs *regs)
preempt_enable_no_resched();
return 0;
}
+NOKPROBE_SYMBOL(kprobe_int3_handler);

/*
* When a retprobed function returns, this code saves registers and
* calls trampoline_handler() runs, which calls the kretprobe's handler.
*/
-static void __used __kprobes kretprobe_trampoline_holder(void)
+static void __used kretprobe_trampoline_holder(void)
{
asm volatile (
".global kretprobe_trampoline\n"
@@ -658,11 +668,13 @@ static void __used __kprobes kretprobe_trampoline_holder(void)
#endif
" ret\n");
}
+NOKPROBE_SYMBOL(kretprobe_trampoline_holder);
+NOKPROBE_SYMBOL(kretprobe_trampoline);

/*
* Called from kretprobe_trampoline
*/
-__visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
+__visible __used void *trampoline_handler(struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
@@ -748,6 +760,7 @@ __visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
}
return (void *)orig_ret_address;
}
+NOKPROBE_SYMBOL(trampoline_handler);

/*
* Called after single-stepping. p->addr is the address of the
@@ -776,8 +789,8 @@ __visible __used __kprobes void *trampoline_handler(struct pt_regs *regs)
* jump instruction after the copied instruction, that jumps to the next
* instruction after the probepoint.
*/
-static void __kprobes
-resume_execution(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+static void resume_execution(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
unsigned long *tos = stack_addr(regs);
unsigned long copy_ip = (unsigned long)p->ainsn.insn;
@@ -852,12 +865,13 @@ resume_execution(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *k
no_change:
restore_btf();
}
+NOKPROBE_SYMBOL(resume_execution);

/*
* Interrupts are disabled on entry as trap1 is an interrupt gate and they
* remain disabled throughout this function.
*/
-int __kprobes kprobe_debug_handler(struct pt_regs *regs)
+int kprobe_debug_handler(struct pt_regs *regs)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -892,8 +906,9 @@ out:

return 1;
}
+NOKPROBE_SYMBOL(kprobe_debug_handler);

-int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
+int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -950,12 +965,13 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)

return 0;
}
+NOKPROBE_SYMBOL(kprobe_fault_handler);

/*
* Wrapper routine for handling exceptions.
*/
-int __kprobes
-kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data)
+int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val,
+ void *data)
{
struct die_args *args = data;
int ret = NOTIFY_DONE;
@@ -975,8 +991,9 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
}
return ret;
}
+NOKPROBE_SYMBOL(kprobe_exceptions_notify);

-int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
+int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
struct jprobe *jp = container_of(p, struct jprobe, kp);
unsigned long addr;
@@ -1000,8 +1017,9 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
regs->ip = (unsigned long)(jp->entry);
return 1;
}
+NOKPROBE_SYMBOL(setjmp_pre_handler);

-void __kprobes jprobe_return(void)
+void jprobe_return(void)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

@@ -1017,8 +1035,10 @@ void __kprobes jprobe_return(void)
" nop \n"::"b"
(kcb->jprobe_saved_sp):"memory");
}
+NOKPROBE_SYMBOL(jprobe_return);
+NOKPROBE_SYMBOL(jprobe_return_end);

-int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
+int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
u8 *addr = (u8 *) (regs->ip - 1);
@@ -1046,6 +1066,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
}
+NOKPROBE_SYMBOL(longjmp_break_handler);

bool arch_within_kprobe_blacklist(unsigned long addr)
{
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index dcaa131..717b02a 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -25,8 +25,9 @@

#include "common.h"

-static int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+static nokprobe_inline
+int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
/*
* Emulate singlestep (and also recover regs->ip)
@@ -41,18 +42,19 @@ static int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
return 1;
}

-int __kprobes skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
+int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
{
if (kprobe_ftrace(p))
return __skip_singlestep(p, regs, kcb);
else
return 0;
}
+NOKPROBE_SYMBOL(skip_singlestep);

/* Ftrace callback handler for kprobes */
-void __kprobes kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
- struct ftrace_ops *ops, struct pt_regs *regs)
+void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
+ struct ftrace_ops *ops, struct pt_regs *regs)
{
struct kprobe *p;
struct kprobe_ctlblk *kcb;
@@ -84,6 +86,7 @@ void __kprobes kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
end:
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(kprobe_ftrace_handler);

int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index fba7fb0..f304773 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -138,7 +138,8 @@ asm (
#define INT3_SIZE sizeof(kprobe_opcode_t)

/* Optimized kprobe call back function: called from optinsn */
-static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
+static void
+optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
unsigned long flags;
@@ -168,6 +169,7 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op, struct pt_
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(optimized_callback);

static int copy_optimized_instructions(u8 *dest, u8 *src)
{
@@ -424,8 +426,7 @@ extern void arch_unoptimize_kprobes(struct list_head *oplist,
}
}

-int __kprobes
-setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
+int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
{
struct optimized_kprobe *op;

@@ -441,3 +442,4 @@ setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
}
return 0;
}
+NOKPROBE_SYMBOL(setup_detour_execution);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 0331cb3..d81abcb 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -251,8 +251,9 @@ u32 kvm_read_and_reset_pf_reason(void)
return reason;
}
EXPORT_SYMBOL_GPL(kvm_read_and_reset_pf_reason);
+NOKPROBE_SYMBOL(kvm_read_and_reset_pf_reason);

-dotraplinkage void __kprobes
+dotraplinkage void
do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
{
enum ctx_state prev_state;
@@ -276,6 +277,7 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
break;
}
}
+NOKPROBE_SYMBOL(do_async_page_fault);

static void __init paravirt_ops_setup(void)
{
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index b4872b9..c3e985d 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -110,7 +110,7 @@ static void nmi_max_handler(struct irq_work *w)
a->handler, whole_msecs, decimal_msecs);
}

-static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b)
+static int nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b)
{
struct nmi_desc *desc = nmi_to_desc(type);
struct nmiaction *a;
@@ -146,6 +146,7 @@ static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2
/* return total number of NMI events handled */
return handled;
}
+NOKPROBE_SYMBOL(nmi_handle);

int __register_nmi_handler(unsigned int type, struct nmiaction *action)
{
@@ -208,7 +209,7 @@ void unregister_nmi_handler(unsigned int type, const char *name)
}
EXPORT_SYMBOL_GPL(unregister_nmi_handler);

-static __kprobes void
+static void
pci_serr_error(unsigned char reason, struct pt_regs *regs)
{
/* check to see if anyone registered against these types of errors */
@@ -238,8 +239,9 @@ pci_serr_error(unsigned char reason, struct pt_regs *regs)
reason = (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_SERR;
outb(reason, NMI_REASON_PORT);
}
+NOKPROBE_SYMBOL(pci_serr_error);

-static __kprobes void
+static void
io_check_error(unsigned char reason, struct pt_regs *regs)
{
unsigned long i;
@@ -269,8 +271,9 @@ io_check_error(unsigned char reason, struct pt_regs *regs)
reason &= ~NMI_REASON_CLEAR_IOCHK;
outb(reason, NMI_REASON_PORT);
}
+NOKPROBE_SYMBOL(io_check_error);

-static __kprobes void
+static void
unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
{
int handled;
@@ -298,11 +301,12 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)

pr_emerg("Dazed and confused, but trying to continue\n");
}
+NOKPROBE_SYMBOL(unknown_nmi_error);

static DEFINE_PER_CPU(bool, swallow_nmi);
static DEFINE_PER_CPU(unsigned long, last_nmi_rip);

-static __kprobes void default_do_nmi(struct pt_regs *regs)
+static void default_do_nmi(struct pt_regs *regs)
{
unsigned char reason = 0;
int handled;
@@ -401,6 +405,7 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
else
unknown_nmi_error(reason, regs);
}
+NOKPROBE_SYMBOL(default_do_nmi);

/*
* NMIs can hit breakpoints which will cause it to lose its
@@ -520,7 +525,7 @@ static inline void nmi_nesting_postprocess(void)
}
#endif

-dotraplinkage notrace __kprobes void
+dotraplinkage notrace void
do_nmi(struct pt_regs *regs, long error_code)
{
nmi_nesting_preprocess(regs);
@@ -537,6 +542,7 @@ do_nmi(struct pt_regs *regs, long error_code)
/* On i386, may loop back to preprocess */
nmi_nesting_postprocess();
}
+NOKPROBE_SYMBOL(do_nmi);

void stop_nmi(void)
{
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index ba9abe9..3c8ae7d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -106,7 +106,7 @@ static inline void preempt_conditional_cli(struct pt_regs *regs)
preempt_count_dec();
}

-static int __kprobes
+static nokprobe_inline int
do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
struct pt_regs *regs, long error_code)
{
@@ -136,7 +136,7 @@ do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
return -1;
}

-static void __kprobes
+static void
do_trap(int trapnr, int signr, char *str, struct pt_regs *regs,
long error_code, siginfo_t *info)
{
@@ -173,6 +173,7 @@ do_trap(int trapnr, int signr, char *str, struct pt_regs *regs,
else
force_sig(signr, tsk);
}
+NOKPROBE_SYMBOL(do_trap);

#define DO_ERROR(trapnr, signr, str, name) \
dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
@@ -263,7 +264,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
}
#endif

-dotraplinkage void __kprobes
+dotraplinkage void
do_general_protection(struct pt_regs *regs, long error_code)
{
struct task_struct *tsk;
@@ -309,9 +310,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_general_protection);

/* May run on IST stack. */
-dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_code)
+dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
{
enum ctx_state prev_state;

@@ -355,6 +357,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_int3);

#ifdef CONFIG_X86_64
/*
@@ -362,7 +365,7 @@ exit:
* for scheduling or signal handling. The actual stack switch is done in
* entry.S
*/
-asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
+asmlinkage struct pt_regs *sync_regs(struct pt_regs *eregs)
{
struct pt_regs *regs = eregs;
/* Did already sync */
@@ -381,6 +384,7 @@ asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
*regs = *eregs;
return regs;
}
+NOKPROBE_SYMBOL(sync_regs);
#endif

/*
@@ -407,7 +411,7 @@ asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
*
* May run on IST stack.
*/
-dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
+dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
{
struct task_struct *tsk = current;
enum ctx_state prev_state;
@@ -491,6 +495,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
exit:
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_debug);

/*
* Note that we play around with the 'TS' bit in an attempt to get
@@ -662,7 +667,7 @@ void math_state_restore(void)
}
EXPORT_SYMBOL_GPL(math_state_restore);

-dotraplinkage void __kprobes
+dotraplinkage void
do_device_not_available(struct pt_regs *regs, long error_code)
{
enum ctx_state prev_state;
@@ -688,6 +693,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
#endif
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_device_not_available);

#ifdef CONFIG_X86_32
dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 8e57229..f83bd0d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -8,7 +8,7 @@
#include <linux/kdebug.h> /* oops_begin/end, ... */
#include <linux/module.h> /* search_exception_table */
#include <linux/bootmem.h> /* max_low_pfn */
-#include <linux/kprobes.h> /* __kprobes, ... */
+#include <linux/kprobes.h> /* NOKPROBE_SYMBOL, ... */
#include <linux/mmiotrace.h> /* kmmio_handler, ... */
#include <linux/perf_event.h> /* perf_sw_event */
#include <linux/hugetlb.h> /* hstate_index_to_shift */
@@ -45,7 +45,7 @@ enum x86_pf_error_code {
* Returns 0 if mmiotrace is disabled, or if the fault is not
* handled by mmiotrace:
*/
-static inline int __kprobes
+static nokprobe_inline int
kmmio_fault(struct pt_regs *regs, unsigned long addr)
{
if (unlikely(is_kmmio_active()))
@@ -54,7 +54,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr)
return 0;
}

-static inline int __kprobes kprobes_fault(struct pt_regs *regs)
+static nokprobe_inline int kprobes_fault(struct pt_regs *regs)
{
int ret = 0;

@@ -261,7 +261,7 @@ void vmalloc_sync_all(void)
*
* Handle a fault on the vmalloc or module mapping area
*/
-static noinline __kprobes int vmalloc_fault(unsigned long address)
+static noinline int vmalloc_fault(unsigned long address)
{
unsigned long pgd_paddr;
pmd_t *pmd_k;
@@ -291,6 +291,7 @@ static noinline __kprobes int vmalloc_fault(unsigned long address)

return 0;
}
+NOKPROBE_SYMBOL(vmalloc_fault);

/*
* Did it hit the DOS screen memory VA from vm86 mode?
@@ -358,7 +359,7 @@ void vmalloc_sync_all(void)
*
* This assumes no large pages in there.
*/
-static noinline __kprobes int vmalloc_fault(unsigned long address)
+static noinline int vmalloc_fault(unsigned long address)
{
pgd_t *pgd, *pgd_ref;
pud_t *pud, *pud_ref;
@@ -425,6 +426,7 @@ static noinline __kprobes int vmalloc_fault(unsigned long address)

return 0;
}
+NOKPROBE_SYMBOL(vmalloc_fault);

#ifdef CONFIG_CPU_SUP_AMD
static const char errata93_warning[] =
@@ -927,7 +929,7 @@ static int spurious_fault_check(unsigned long error_code, pte_t *pte)
* There are no security implications to leaving a stale TLB when
* increasing the permissions on a page.
*/
-static noinline __kprobes int
+static noinline int
spurious_fault(unsigned long error_code, unsigned long address)
{
pgd_t *pgd;
@@ -975,6 +977,7 @@ spurious_fault(unsigned long error_code, unsigned long address)

return ret;
}
+NOKPROBE_SYMBOL(spurious_fault);

int show_unhandled_signals = 1;

@@ -1030,7 +1033,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
* {,trace_}do_page_fault() have notrace on. Having this an actual function
* guarantees there's a function trace entry.
*/
-static void __kprobes noinline
+static noinline void
__do_page_fault(struct pt_regs *regs, unsigned long error_code,
unsigned long address)
{
@@ -1253,8 +1256,9 @@ good_area:

up_read(&mm->mmap_sem);
}
+NOKPROBE_SYMBOL(__do_page_fault);

-dotraplinkage void __kprobes notrace
+dotraplinkage void notrace
do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
unsigned long address = read_cr2(); /* Get the faulting address */
@@ -1272,10 +1276,12 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(do_page_fault);

#ifdef CONFIG_TRACING
-static void trace_page_fault_entries(unsigned long address, struct pt_regs *regs,
- unsigned long error_code)
+static nokprobe_inline void
+trace_page_fault_entries(unsigned long address, struct pt_regs *regs,
+ unsigned long error_code)
{
if (user_mode(regs))
trace_page_fault_user(address, regs, error_code);
@@ -1283,7 +1289,7 @@ static void trace_page_fault_entries(unsigned long address, struct pt_regs *regs
trace_page_fault_kernel(address, regs, error_code);
}

-dotraplinkage void __kprobes notrace
+dotraplinkage void notrace
trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
/*
@@ -1300,4 +1306,5 @@ trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}
+NOKPROBE_SYMBOL(trace_do_page_fault);
#endif /* CONFIG_TRACING */

Subject: [tip:perf/kprobes] kprobes, sched: Use NOKPROBE_SYMBOL macro in sched

Commit-ID: edafe3a56dbd42c499245b222e9f7e80099356e5
Gitweb: http://git.kernel.org/tip/edafe3a56dbd42c499245b222e9f7e80099356e5
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:42 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:40 +0200

kprobes, sched: Use NOKPROBE_SYMBOL macro in sched

Use the NOKPROBE_SYMBOL() macro to protect functions from
kprobes instead of the __kprobes annotation in sched/core.c.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/sched/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6863631..00781cc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2480,7 +2480,7 @@ notrace unsigned long get_parent_ip(unsigned long addr)
#if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
defined(CONFIG_PREEMPT_TRACER))

-void __kprobes preempt_count_add(int val)
+void preempt_count_add(int val)
{
#ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2506,8 +2506,9 @@ void __kprobes preempt_count_add(int val)
}
}
EXPORT_SYMBOL(preempt_count_add);
+NOKPROBE_SYMBOL(preempt_count_add);

-void __kprobes preempt_count_sub(int val)
+void preempt_count_sub(int val)
{
#ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2528,6 +2529,7 @@ void __kprobes preempt_count_sub(int val)
__preempt_count_sub(val);
}
EXPORT_SYMBOL(preempt_count_sub);
+NOKPROBE_SYMBOL(preempt_count_sub);

#endif

Subject: [tip:perf/kprobes] kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace

Commit-ID: 3da0f18007e5b87b573cf6ae8c445d59e757d274
Gitweb: http://git.kernel.org/tip/3da0f18007e5b87b573cf6ae8c445d59e757d274
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:18:28 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:26:39 +0200

kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace

Use the NOKPROBE_SYMBOL() macro to protect functions from
kprobes instead of the __kprobes annotation in ftrace.
This applies the nokprobe_inline annotation in some cases,
because NOKPROBE_SYMBOL() inhibits inlining by referring
to the symbol address.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/trace/trace_event_perf.c | 5 ++--
kernel/trace/trace_kprobe.c | 66 ++++++++++++++++++++++++-----------------
kernel/trace/trace_probe.c | 61 ++++++++++++++++++++-----------------
kernel/trace/trace_probe.h | 15 +++++-----
4 files changed, 82 insertions(+), 65 deletions(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index c894614..5d12bb4 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -248,8 +248,8 @@ void perf_trace_del(struct perf_event *p_event, int flags)
tp_event->class->reg(tp_event, TRACE_REG_PERF_DEL, p_event);
}

-__kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
- struct pt_regs *regs, int *rctxp)
+void *perf_trace_buf_prepare(int size, unsigned short type,
+ struct pt_regs *regs, int *rctxp)
{
struct trace_entry *entry;
unsigned long flags;
@@ -281,6 +281,7 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
return raw_data;
}
EXPORT_SYMBOL_GPL(perf_trace_buf_prepare);
+NOKPROBE_SYMBOL(perf_trace_buf_prepare);

#ifdef CONFIG_FUNCTION_TRACER
static void
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index aa5f0bf..242e4ec 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -40,27 +40,27 @@ struct trace_kprobe {
(sizeof(struct probe_arg) * (n)))


-static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
{
return tk->rp.handler != NULL;
}

-static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
+static nokprobe_inline const char *trace_kprobe_symbol(struct trace_kprobe *tk)
{
return tk->symbol ? tk->symbol : "unknown";
}

-static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
+static nokprobe_inline unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
{
return tk->rp.kp.offset;
}

-static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
{
return !!(kprobe_gone(&tk->rp.kp));
}

-static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
struct module *mod)
{
int len = strlen(mod->name);
@@ -68,7 +68,7 @@ static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
}

-static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
{
return !!strchr(trace_kprobe_symbol(tk), ':');
}
@@ -132,19 +132,21 @@ struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
* Kprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
(unsigned int)((unsigned long)offset)); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(stack, type));
+
DEFINE_BASIC_FETCH_FUNCS(stack)
/* No string on the stack entry */
#define fetch_stack_string NULL
#define fetch_stack_string_size NULL

#define DEFINE_FETCH_memory(type) \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
void *addr, void *dest) \
{ \
type retval; \
@@ -152,14 +154,16 @@ static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
*(type *)dest = 0; \
else \
*(type *)dest = retval; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, type));
+
DEFINE_BASIC_FETCH_FUNCS(memory)
/*
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
{
long ret;
int maxlen = get_rloc_len(*(u32 *)dest);
@@ -193,10 +197,11 @@ static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
get_rloc_offs(*(u32 *)dest));
}
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, string));

/* Return the length of string -- including null terminal byte */
-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
{
mm_segment_t old_fs;
int ret, len = 0;
@@ -219,17 +224,19 @@ static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
else
*(u32 *)dest = len;
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(memory, string_size));

#define DEFINE_FETCH_symbol(type) \
-__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, void *data, void *dest)\
{ \
struct symbol_cache *sc = data; \
if (sc->addr) \
fetch_memory_##type(regs, (void *)sc->addr, dest); \
else \
*(type *)dest = 0; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(symbol, type));
+
DEFINE_BASIC_FETCH_FUNCS(symbol)
DEFINE_FETCH_symbol(string)
DEFINE_FETCH_symbol(string_size)
@@ -907,7 +914,7 @@ static const struct file_operations kprobe_profile_ops = {
};

/* Kprobe handler */
-static __kprobes void
+static nokprobe_inline void
__kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
{
@@ -943,7 +950,7 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
entry, irq_flags, pc, regs);
}

-static __kprobes void
+static void
kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
struct event_file_link *link;
@@ -951,9 +958,10 @@ kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs)
list_for_each_entry_rcu(link, &tk->tp.files, list)
__kprobe_trace_func(tk, regs, link->file);
}
+NOKPROBE_SYMBOL(kprobe_trace_func);

/* Kretprobe handler */
-static __kprobes void
+static nokprobe_inline void
__kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs,
struct ftrace_event_file *ftrace_file)
@@ -991,7 +999,7 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
entry, irq_flags, pc, regs);
}

-static __kprobes void
+static void
kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
@@ -1000,6 +1008,7 @@ kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
list_for_each_entry_rcu(link, &tk->tp.files, list)
__kretprobe_trace_func(tk, ri, regs, link->file);
}
+NOKPROBE_SYMBOL(kretprobe_trace_func);

/* Event entry printers */
static enum print_line_t
@@ -1131,7 +1140,7 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
#ifdef CONFIG_PERF_EVENTS

/* Kprobe profile handler */
-static __kprobes void
+static void
kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
struct ftrace_event_call *call = &tk->tp.call;
@@ -1158,9 +1167,10 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
store_trace_args(sizeof(*entry), &tk->tp, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
+NOKPROBE_SYMBOL(kprobe_perf_func);

/* Kretprobe profile handler */
-static __kprobes void
+static void
kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
@@ -1188,6 +1198,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
store_trace_args(sizeof(*entry), &tk->tp, regs, (u8 *)&entry[1], dsize);
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
}
+NOKPROBE_SYMBOL(kretprobe_perf_func);
#endif /* CONFIG_PERF_EVENTS */

/*
@@ -1223,8 +1234,7 @@ static int kprobe_register(struct ftrace_event_call *event,
return 0;
}

-static __kprobes
-int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
+static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
{
struct trace_kprobe *tk = container_of(kp, struct trace_kprobe, rp.kp);

@@ -1238,9 +1248,10 @@ int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
#endif
return 0; /* We don't tweek kernel, so just return 0 */
}
+NOKPROBE_SYMBOL(kprobe_dispatcher);

-static __kprobes
-int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+static int
+kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
struct trace_kprobe *tk = container_of(ri->rp, struct trace_kprobe, rp);

@@ -1254,6 +1265,7 @@ int kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
#endif
return 0; /* We don't tweek kernel, so just return 0 */
}
+NOKPROBE_SYMBOL(kretprobe_dispatcher);

static struct trace_event_functions kretprobe_funcs = {
.trace = print_kretprobe_event
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index d3a91e4..d4b9fc2 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -37,13 +37,13 @@ const char *reserved_field_names[] = {

/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt) \
-__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
- const char *name, \
- void *data, void *ent) \
+int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, const char *name, \
+ void *data, void *ent) \
{ \
return trace_seq_printf(s, " %s=" fmt, name, *(type *)data); \
} \
-const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
+const char PRINT_TYPE_FMT_NAME(type)[] = fmt; \
+NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(type));

DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
@@ -55,9 +55,8 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")

/* Print type function for string type */
-__kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
- const char *name,
- void *data, void *ent)
+int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, const char *name,
+ void *data, void *ent)
{
int len = *(u32 *)data >> 16;

@@ -67,6 +66,7 @@ __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
return trace_seq_printf(s, " %s=\"%s\"", name,
(const char *)get_loc_data(data, ent));
}
+NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(string));

const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";

@@ -81,23 +81,24 @@ const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";

/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
-__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
- void *offset, void *dest) \
+void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
(unsigned int)((unsigned long)offset)); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(reg, type));
DEFINE_BASIC_FETCH_FUNCS(reg)
/* No string on the register */
#define fetch_reg_string NULL
#define fetch_reg_string_size NULL

#define DEFINE_FETCH_retval(type) \
-__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
- void *dummy, void *dest) \
+void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
+ void *dummy, void *dest) \
{ \
*(type *)dest = (type)regs_return_value(regs); \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(retval, type));
DEFINE_BASIC_FETCH_FUNCS(retval)
/* No string on the retval */
#define fetch_retval_string NULL
@@ -112,8 +113,8 @@ struct deref_fetch_param {
};

#define DEFINE_FETCH_deref(type) \
-__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
+ void *data, void *dest) \
{ \
struct deref_fetch_param *dprm = data; \
unsigned long addr; \
@@ -123,12 +124,13 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
dprm->fetch(regs, (void *)addr, dest); \
} else \
*(type *)dest = 0; \
-}
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(deref, type));
DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)

-__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
- void *data, void *dest)
+void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
+ void *data, void *dest)
{
struct deref_fetch_param *dprm = data;
unsigned long addr;
@@ -140,16 +142,18 @@ __kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
} else
*(string_size *)dest = 0;
}
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(deref, string_size));

-static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
+static void update_deref_fetch_param(struct deref_fetch_param *data)
{
if (CHECK_FETCH_FUNCS(deref, data->orig.fn))
update_deref_fetch_param(data->orig.data);
else if (CHECK_FETCH_FUNCS(symbol, data->orig.fn))
update_symbol_cache(data->orig.data);
}
+NOKPROBE_SYMBOL(update_deref_fetch_param);

-static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data)
+static void free_deref_fetch_param(struct deref_fetch_param *data)
{
if (CHECK_FETCH_FUNCS(deref, data->orig.fn))
free_deref_fetch_param(data->orig.data);
@@ -157,6 +161,7 @@ static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data)
free_symbol_cache(data->orig.data);
kfree(data);
}
+NOKPROBE_SYMBOL(free_deref_fetch_param);

/* Bitfield fetch function */
struct bitfield_fetch_param {
@@ -166,8 +171,8 @@ struct bitfield_fetch_param {
};

#define DEFINE_FETCH_bitfield(type) \
-__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
- void *data, void *dest) \
+void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
+ void *data, void *dest) \
{ \
struct bitfield_fetch_param *bprm = data; \
type buf = 0; \
@@ -177,8 +182,8 @@ __kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
buf >>= bprm->low_shift; \
} \
*(type *)dest = buf; \
-}
-
+} \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(bitfield, type));
DEFINE_BASIC_FETCH_FUNCS(bitfield)
#define fetch_bitfield_string NULL
#define fetch_bitfield_string_size NULL
@@ -255,17 +260,17 @@ fail:
}

/* Special function : only accept unsigned long */
-static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
- void *dummy, void *dest)
+static void fetch_kernel_stack_address(struct pt_regs *regs, void *dummy, void *dest)
{
*(unsigned long *)dest = kernel_stack_pointer(regs);
}
+NOKPROBE_SYMBOL(fetch_kernel_stack_address);

-static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
- void *dummy, void *dest)
+static void fetch_user_stack_address(struct pt_regs *regs, void *dummy, void *dest)
{
*(unsigned long *)dest = user_stack_pointer(regs);
}
+NOKPROBE_SYMBOL(fetch_user_stack_address);

static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn,
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index fb1ab5d..4f815fb 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -81,13 +81,13 @@
*/
#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))

-static inline void *get_rloc_data(u32 *dl)
+static nokprobe_inline void *get_rloc_data(u32 *dl)
{
return (u8 *)dl + get_rloc_offs(*dl);
}

/* For data_loc conversion */
-static inline void *get_loc_data(u32 *dl, void *ent)
+static nokprobe_inline void *get_loc_data(u32 *dl, void *ent)
{
return (u8 *)ent + get_rloc_offs(*dl);
}
@@ -136,9 +136,8 @@ typedef u32 string_size;

/* Printing in basic type function template */
#define DECLARE_BASIC_PRINT_TYPE_FUNC(type) \
-__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
- const char *name, \
- void *data, void *ent); \
+int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, const char *name, \
+ void *data, void *ent); \
extern const char PRINT_TYPE_FMT_NAME(type)[]

DECLARE_BASIC_PRINT_TYPE_FUNC(u8);
@@ -303,7 +302,7 @@ static inline bool trace_probe_is_registered(struct trace_probe *tp)
return !!(tp->flags & TP_FLAG_REGISTERED);
}

-static inline __kprobes void call_fetch(struct fetch_param *fprm,
+static nokprobe_inline void call_fetch(struct fetch_param *fprm,
struct pt_regs *regs, void *dest)
{
return fprm->fn(regs, fprm->data, dest);
@@ -351,7 +350,7 @@ extern ssize_t traceprobe_probes_write(struct file *file,
extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));

/* Sum up total data length for dynamic arraies (strings) */
-static inline __kprobes int
+static nokprobe_inline int
__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
{
int i, ret = 0;
@@ -367,7 +366,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
}

/* Store the value of each argument */
-static inline __kprobes void
+static nokprobe_inline void
store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
u8 *data, int maxlen)
{

Subject: [tip:perf/kprobes] kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist

Commit-ID: 376e242429bf8539ef39a080ac113c8799840b13
Gitweb: http://git.kernel.org/tip/376e242429bf8539ef39a080ac113c8799840b13
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:05 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:56 +0200

kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist

Introduce the NOKPROBE_SYMBOL() macro, which builds a kprobes
blacklist at kernel build time.

The usage of this macro is similar to EXPORT_SYMBOL(),
placed after the function definition:

NOKPROBE_SYMBOL(function);

Since this macro inhibits inlining of static/inline functions,
this patch also introduces a nokprobe_inline macro for them.
In that case, we must use NOKPROBE_SYMBOL() on the caller of
the inline function.
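
For illustration, here is a minimal sketch of how the two annotations
are meant to be combined (not part of this patch; the function names
and the exact includes are hypothetical):
----
#include <linux/kprobes.h>
#include <linux/ptrace.h>

/* Inlined into its callers, so no separate symbol/address exists. */
static nokprobe_inline int addr_is_zero(unsigned long addr)
{
	return addr == 0;
}

/* The non-inlined caller carries the blacklist entry instead. */
static int my_fault_check(struct pt_regs *regs)
{
	return addr_is_zero(instruction_pointer(regs));
}
NOKPROBE_SYMBOL(my_fault_check);
----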

When CONFIG_KPROBES=y, the macro stores the given function
address in the "_kprobe_blacklist" section.

Since the data structures are not fully initialized by the
macro (because there is no "size" information), they are
re-initialized at boot time using kallsyms.

Signed-off-by: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: Alok Kataria <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christopher Li <[email protected]>
Cc: Chris Wright <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Jan-Simon Möller <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
Documentation/kprobes.txt | 16 +++++-
arch/x86/include/asm/asm.h | 7 +++
arch/x86/kernel/paravirt.c | 4 ++
include/asm-generic/vmlinux.lds.h | 9 ++++
include/linux/compiler.h | 2 +
include/linux/kprobes.h | 20 ++++++--
kernel/kprobes.c | 100 ++++++++++++++++++++------------------
kernel/sched/core.c | 1 +
8 files changed, 107 insertions(+), 52 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 0cfb00f..4bbeca8 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -22,8 +22,9 @@ Appendix B: The kprobes sysctl interface

Kprobes enables you to dynamically break into any kernel routine and
collect debugging and performance information non-disruptively. You
-can trap at almost any kernel code address, specifying a handler
+can trap at almost any kernel code address(*), specifying a handler
routine to be invoked when the breakpoint is hit.
+(*: some parts of the kernel code can not be trapped, see 1.5 Blacklist)

There are currently three types of probes: kprobes, jprobes, and
kretprobes (also called return probes). A kprobe can be inserted
@@ -273,6 +274,19 @@ using one of the following techniques:
or
- Execute 'sysctl -w debug.kprobes_optimization=n'

+1.5 Blacklist
+
+Kprobes can probe most of the kernel except itself. This means
+that there are some functions that kprobes cannot probe. Probing
+(trapping) such functions can cause a recursive trap (e.g. a double
+fault), or the nested probe handler may never be called.
+Kprobes manages such functions as a blacklist.
+If you want to add a function to the blacklist, you just need
+to (1) include linux/kprobes.h and (2) use the NOKPROBE_SYMBOL() macro
+to specify the blacklisted function.
+Kprobes checks the given probe address against the blacklist and
+rejects registering it if the given address is in the blacklist.
+
2. Architectures Supported

Kprobes, jprobes, and return probes are implemented on the following
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 4582e8e..7730c1c 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -57,6 +57,12 @@
.long (from) - . ; \
.long (to) - . + 0x7ffffff0 ; \
.popsection
+
+# define _ASM_NOKPROBE(entry) \
+ .pushsection "_kprobe_blacklist","aw" ; \
+ _ASM_ALIGN ; \
+ _ASM_PTR (entry); \
+ .popsection
#else
# define _ASM_EXTABLE(from,to) \
" .pushsection \"__ex_table\",\"a\"\n" \
@@ -71,6 +77,7 @@
" .long (" #from ") - .\n" \
" .long (" #to ") - . + 0x7ffffff0\n" \
" .popsection\n"
+/* For C files, we already have the NOKPROBE_SYMBOL macro */
#endif

#endif /* _ASM_X86_ASM_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b10af8..e136869 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -23,6 +23,7 @@
#include <linux/efi.h>
#include <linux/bcd.h>
#include <linux/highmem.h>
+#include <linux/kprobes.h>

#include <asm/bug.h>
#include <asm/paravirt.h>
@@ -389,6 +390,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
};

+/* At this point, native_get_debugreg has a real function entry */
+NOKPROBE_SYMBOL(native_get_debugreg);
+
struct pv_apic_ops pv_apic_ops = {
#ifdef CONFIG_X86_LOCAL_APIC
.startup_ipi_hook = paravirt_nop,
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 146e4ff..40ceb3c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -109,6 +109,14 @@
#define BRANCH_PROFILE()
#endif

+#ifdef CONFIG_KPROBES
+#define KPROBE_BLACKLIST() VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
+ *(_kprobe_blacklist) \
+ VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
+#else
+#define KPROBE_BLACKLIST()
+#endif
+
#ifdef CONFIG_EVENT_TRACING
#define FTRACE_EVENTS() . = ALIGN(8); \
VMLINUX_SYMBOL(__start_ftrace_events) = .; \
@@ -507,6 +515,7 @@
*(.init.rodata) \
FTRACE_EVENTS() \
TRACE_SYSCALLS() \
+ KPROBE_BLACKLIST() \
MEM_DISCARD(init.rodata) \
CLK_OF_TABLES() \
RESERVEDMEM_OF_TABLES() \
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index ee7239e..0300c0f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -374,7 +374,9 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
/* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */
#ifdef CONFIG_KPROBES
# define __kprobes __attribute__((__section__(".kprobes.text")))
+# define nokprobe_inline __always_inline
#else
# define __kprobes
+# define nokprobe_inline inline
#endif
#endif /* __LINUX_COMPILER_H */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index cdf9251..e059507 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -205,10 +205,10 @@ struct kretprobe_blackpoint {
void *addr;
};

-struct kprobe_blackpoint {
- const char *name;
+struct kprobe_blacklist_entry {
+ struct list_head list;
unsigned long start_addr;
- unsigned long range;
+ unsigned long end_addr;
};

#ifdef CONFIG_KPROBES
@@ -477,4 +477,18 @@ static inline int enable_jprobe(struct jprobe *jp)
return enable_kprobe(&jp->kp);
}

+#ifdef CONFIG_KPROBES
+/*
+ * Blacklist generating macro. Specify functions which are not to be
+ * probed by using this macro.
+ */
+#define __NOKPROBE_SYMBOL(fname) \
+static unsigned long __used \
+ __attribute__((section("_kprobe_blacklist"))) \
+ _kbl_addr_##fname = (unsigned long)fname;
+#define NOKPROBE_SYMBOL(fname) __NOKPROBE_SYMBOL(fname)
+#else
+#define NOKPROBE_SYMBOL(fname)
+#endif
+
#endif /* _LINUX_KPROBES_H */
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5b5ac76..5ffc687 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -86,18 +86,8 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
return &(kretprobe_table_locks[hash].lock);
}

-/*
- * Normally, functions that we'd want to prohibit kprobes in, are marked
- * __kprobes. But, there are cases where such functions already belong to
- * a different section (__sched for preempt_schedule)
- *
- * For such cases, we now have a blacklist
- */
-static struct kprobe_blackpoint kprobe_blacklist[] = {
- {"preempt_schedule",},
- {"native_get_debugreg",},
- {NULL} /* Terminator */
-};
+/* Blacklist -- list of struct kprobe_blacklist_entry */
+static LIST_HEAD(kprobe_blacklist);

#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
/*
@@ -1328,24 +1318,22 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
addr < (unsigned long)__kprobes_text_end;
}

-static int __kprobes in_kprobes_functions(unsigned long addr)
+static bool __kprobes within_kprobe_blacklist(unsigned long addr)
{
- struct kprobe_blackpoint *kb;
+ struct kprobe_blacklist_entry *ent;

if (arch_within_kprobe_blacklist(addr))
- return -EINVAL;
+ return true;
/*
* If there exists a kprobe_blacklist, verify and
* fail any probe registration in the prohibited area
*/
- for (kb = kprobe_blacklist; kb->name != NULL; kb++) {
- if (kb->start_addr) {
- if (addr >= kb->start_addr &&
- addr < (kb->start_addr + kb->range))
- return -EINVAL;
- }
+ list_for_each_entry(ent, &kprobe_blacklist, list) {
+ if (addr >= ent->start_addr && addr < ent->end_addr)
+ return true;
}
- return 0;
+
+ return false;
}

/*
@@ -1436,7 +1424,7 @@ static __kprobes int check_kprobe_address_safe(struct kprobe *p,

/* Ensure it is not in reserved area nor out of text */
if (!kernel_text_address((unsigned long) p->addr) ||
- in_kprobes_functions((unsigned long) p->addr) ||
+ within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr)) {
ret = -EINVAL;
goto out;
@@ -2022,6 +2010,38 @@ void __kprobes dump_kprobe(struct kprobe *kp)
kp->symbol_name, kp->addr, kp->offset);
}

+/*
+ * Lookup and populate the kprobe_blacklist.
+ *
+ * Unlike the kretprobe blacklist, we'll need to determine
+ * the range of addresses that belong to the said functions,
+ * since a kprobe need not necessarily be at the beginning
+ * of a function.
+ */
+static int __init populate_kprobe_blacklist(unsigned long *start,
+ unsigned long *end)
+{
+ unsigned long *iter;
+ struct kprobe_blacklist_entry *ent;
+ unsigned long offset = 0, size = 0;
+
+ for (iter = start; iter < end; iter++) {
+ if (!kallsyms_lookup_size_offset(*iter, &size, &offset)) {
+ pr_err("Failed to find blacklist %p\n", (void *)*iter);
+ continue;
+ }
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent)
+ return -ENOMEM;
+ ent->start_addr = *iter;
+ ent->end_addr = *iter + size;
+ INIT_LIST_HEAD(&ent->list);
+ list_add_tail(&ent->list, &kprobe_blacklist);
+ }
+ return 0;
+}
+
/* Module notifier call back, checking kprobes on the module */
static int __kprobes kprobes_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
@@ -2065,14 +2085,13 @@ static struct notifier_block kprobe_module_nb = {
.priority = 0
};

+/* Markers of _kprobe_blacklist section */
+extern unsigned long __start_kprobe_blacklist[];
+extern unsigned long __stop_kprobe_blacklist[];
+
static int __init init_kprobes(void)
{
int i, err = 0;
- unsigned long offset = 0, size = 0;
- char *modname, namebuf[KSYM_NAME_LEN];
- const char *symbol_name;
- void *addr;
- struct kprobe_blackpoint *kb;

/* FIXME allocate the probe table, currently defined statically */
/* initialize all list heads */
@@ -2082,26 +2101,11 @@ static int __init init_kprobes(void)
raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
}

- /*
- * Lookup and populate the kprobe_blacklist.
- *
- * Unlike the kretprobe blacklist, we'll need to determine
- * the range of addresses that belong to the said functions,
- * since a kprobe need not necessarily be at the beginning
- * of a function.
- */
- for (kb = kprobe_blacklist; kb->name != NULL; kb++) {
- kprobe_lookup_name(kb->name, addr);
- if (!addr)
- continue;
-
- kb->start_addr = (unsigned long)addr;
- symbol_name = kallsyms_lookup(kb->start_addr,
- &size, &offset, &modname, namebuf);
- if (!symbol_name)
- kb->range = 0;
- else
- kb->range = size;
+ err = populate_kprobe_blacklist(__start_kprobe_blacklist,
+ __stop_kprobe_blacklist);
+ if (err) {
+ pr_err("kprobes: failed to populate blacklist: %d\n", err);
+ pr_err("Please take care of using kprobes.\n");
}

if (kretprobe_blacklist_size) {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 268a45e..6863631 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2804,6 +2804,7 @@ asmlinkage void __sched notrace preempt_schedule(void)
barrier();
} while (need_resched());
}
+NOKPROBE_SYMBOL(preempt_schedule);
EXPORT_SYMBOL(preempt_schedule);
#endif /* CONFIG_PREEMPT */

Subject: [tip:perf/kprobes] kprobes/x86: Call exception handlers directly from do_int3/do_debug

Commit-ID: 6f6343f53d133bae516caf3d254bce37d8774625
Gitweb: http://git.kernel.org/tip/6f6343f53d133bae516caf3d254bce37d8774625
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:33 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:59 +0200

kprobes/x86: Call exception handlers directly from do_int3/do_debug

To avoid a kernel crash caused by probing lockdep code, call
kprobe_int3_handler() and kprobe_debug_handler() (which was
formerly called post_kprobe_handler()) directly from
do_int3 and do_debug.

Currently kprobes uses notify_die() to hook the int3/debug
exceptions. Since there is locking code in notify_die(), the
lockdep code can be invoked. And because lockdep involves
printk()-related code, theoretically we would need to prohibit
probing on such code as well, which means a much longer blacklist.
Instead, hooking int3/debug for kprobes before notify_die()
avoids this problem.

Anyway, most of the int3 handlers in the kernel are already
called from do_int3 directly, e.g. ftrace_int3_handler,
poke_int3_handler, kgdb_ll_trap. Actually only
kprobe_exceptions_notify is on the notifier_call_chain.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Jonathan Lebon <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/kprobes.h | 2 ++
arch/x86/kernel/kprobes/core.c | 24 +++---------------------
arch/x86/kernel/traps.c | 10 ++++++++++
3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 9454c16..53cdfb2 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -116,4 +116,6 @@ struct kprobe_ctlblk {
extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
+extern int kprobe_int3_handler(struct pt_regs *regs);
+extern int kprobe_debug_handler(struct pt_regs *regs);
#endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 7751b3d..9b80aec 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -559,7 +559,7 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
* Interrupts are disabled on entry as trap3 is an interrupt gate and they
* remain disabled throughout this function.
*/
-static int __kprobes kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_int3_handler(struct pt_regs *regs)
{
kprobe_opcode_t *addr;
struct kprobe *p;
@@ -857,7 +857,7 @@ no_change:
* Interrupts are disabled on entry as trap1 is an interrupt gate and they
* remain disabled throughout this function.
*/
-static int __kprobes post_kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_debug_handler(struct pt_regs *regs)
{
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -963,22 +963,7 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (args->regs && user_mode_vm(args->regs))
return ret;

- switch (val) {
- case DIE_INT3:
- if (kprobe_handler(args->regs))
- ret = NOTIFY_STOP;
- break;
- case DIE_DEBUG:
- if (post_kprobe_handler(args->regs)) {
- /*
- * Reset the BS bit in dr6 (pointed by args->err) to
- * denote completion of processing
- */
- (*(unsigned long *)ERR_PTR(args->err)) &= ~DR_STEP;
- ret = NOTIFY_STOP;
- }
- break;
- case DIE_GPF:
+ if (val == DIE_GPF) {
/*
* To be potentially processing a kprobe fault and to
* trust the result from kprobe_running(), we have
@@ -987,9 +972,6 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (!preemptible() && kprobe_running() &&
kprobe_fault_handler(args->regs, args->trapnr))
ret = NOTIFY_STOP;
- break;
- default:
- break;
}
return ret;
}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..e5d4a70 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -334,6 +334,11 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
goto exit;
#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */

+#ifdef CONFIG_KPROBES
+ if (kprobe_int3_handler(regs))
+ return;
+#endif
+
if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
goto exit;
@@ -440,6 +445,11 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
/* Store the virtualized DR6 value */
tsk->thread.debugreg6 = dr6;

+#ifdef CONFIG_KPROBES
+ if (kprobe_debug_handler(regs))
+ goto exit;
+#endif
+
if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)
goto exit;

Subject: [tip:perf/kprobes] kprobes, x86: Prohibit probing on debug_stack_*()

Commit-ID: 0f46efeb44e360b78f54a968b4d92e6877c35891
Gitweb: http://git.kernel.org/tip/0f46efeb44e360b78f54a968b4d92e6877c35891
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:12 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:57 +0200

kprobes, x86: Prohibit probing on debug_stack_*()

Prohibit probing on debug_stack_reset and debug_stack_set_zero.
Since both functions are called from the TRACE_IRQS_ON/OFF_DEBUG
macros, which run in the int3 IST entry, probing them may cause
a soft lockup.

This happens when the kernel is built with CONFIG_DYNAMIC_FTRACE=y
and CONFIG_TRACE_IRQFLAGS=y.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/cpu/common.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a135239..5af696d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -8,6 +8,7 @@
#include <linux/delay.h>
#include <linux/sched.h>
#include <linux/init.h>
+#include <linux/kprobes.h>
#include <linux/kgdb.h>
#include <linux/smp.h>
#include <linux/io.h>
@@ -1160,6 +1161,7 @@ int is_debug_stack(unsigned long addr)
(addr <= __get_cpu_var(debug_stack_addr) &&
addr > (__get_cpu_var(debug_stack_addr) - DEBUG_STKSZ));
}
+NOKPROBE_SYMBOL(is_debug_stack);

DEFINE_PER_CPU(u32, debug_idt_ctr);

@@ -1168,6 +1170,7 @@ void debug_stack_set_zero(void)
this_cpu_inc(debug_idt_ctr);
load_current_idt();
}
+NOKPROBE_SYMBOL(debug_stack_set_zero);

void debug_stack_reset(void)
{
@@ -1176,6 +1179,7 @@ void debug_stack_reset(void)
if (this_cpu_dec_return(debug_idt_ctr) == 0)
load_current_idt();
}
+NOKPROBE_SYMBOL(debug_stack_reset);

#else /* CONFIG_X86_64 */

Subject: [tip:perf/kprobes] kprobes, x86: Prohibit probing on thunk functions and restore

Commit-ID: 98def1dedd00f42ded8423c418c971751f46aad2
Gitweb: http://git.kernel.org/tip/98def1dedd00f42ded8423c418c971751f46aad2
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 17 Apr 2014 17:17:26 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 24 Apr 2014 10:02:58 +0200

kprobes, x86: Prohibit probing on thunk functions and restore

The thunk/restore functions are also used for irqoff tracing etc.,
and those are involved in kprobes' exception handling.
Prohibit probing on them to avoid a kernel crash.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/lib/thunk_32.S | 3 ++-
arch/x86/lib/thunk_64.S | 3 +++
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/thunk_32.S b/arch/x86/lib/thunk_32.S
index 2930ae0..28f85c91 100644
--- a/arch/x86/lib/thunk_32.S
+++ b/arch/x86/lib/thunk_32.S
@@ -4,8 +4,8 @@
* (inspired by Andi Kleen's thunk_64.S)
* Subject to the GNU public license, v.2. No warranty of any kind.
*/
-
#include <linux/linkage.h>
+#include <asm/asm.h>

#ifdef CONFIG_TRACE_IRQFLAGS
/* put return address in eax (arg1) */
@@ -22,6 +22,7 @@
popl %ecx
popl %eax
ret
+ _ASM_NOKPROBE(\name)
.endm

thunk_ra trace_hardirqs_on_thunk,trace_hardirqs_on_caller
diff --git a/arch/x86/lib/thunk_64.S b/arch/x86/lib/thunk_64.S
index a63efd6..92d9fea 100644
--- a/arch/x86/lib/thunk_64.S
+++ b/arch/x86/lib/thunk_64.S
@@ -8,6 +8,7 @@
#include <linux/linkage.h>
#include <asm/dwarf2.h>
#include <asm/calling.h>
+#include <asm/asm.h>

/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
@@ -25,6 +26,7 @@
call \func
jmp restore
CFI_ENDPROC
+ _ASM_NOKPROBE(\name)
.endm

#ifdef CONFIG_TRACE_IRQFLAGS
@@ -43,3 +45,4 @@ restore:
RESTORE_ARGS
ret
CFI_ENDPROC
+ _ASM_NOKPROBE(restore)

Subject: Re: [PATCH -tip v9 22/26] kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text

(2014/04/24 17:58), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>> Use kprobe_blackpoint for blacklisting .entry.text and .kprobees.text
>> instead of arch_within_kprobe_blacklist. This also makes them visible
>> via (debugfs)/kprobes/blacklist.
>
> This description references kprobe_blackpoint, which name I don't
> think exists anymore.

Oops, yes, that should be kprobe_blacklist, as the title says.

Thanks,


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module

(2014/04/24 17:56), Ingo Molnar wrote:
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index f520a76..2fdb673 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -16,6 +16,7 @@
>> #include <linux/kobject.h>
>> #include <linux/moduleparam.h>
>> #include <linux/jump_label.h>
>> +#include <linux/kprobes.h>
>
> This include breaks the x86 build:
>
> CC arch/x86/kernel/jump_label.o
> In file included from arch/x86/kernel/jump_label.c:14:0:
> /fast/mingo/tip/arch/x86/include/asm/kprobes.h:35:12: error: conflicting types for ‘kprobe_opcode_t' typedef u8 kprobe_opcode_t;
> [...]

Hmm, this error seems very odd... and I don't see

>
> But the #include kprobes.h is unnecessary to begin with, as no kprobe
> specific types are used.

OK, anyway I'll remove that.

>
>> #include <linux/export.h>
>>
>> #include <linux/percpu.h>
>> @@ -357,6 +358,10 @@ struct module {
>> unsigned int num_ftrace_callsites;
>> unsigned long *ftrace_callsites;
>> #endif
>> +#ifdef CONFIG_KPROBES
>> + unsigned int num_kprobe_blacklist;
>> + unsigned long *kprobe_blacklist;
>> +#endif
>
> There's a small coding style problem here.
>
> More importantly, I think more should be done to make sure that module
> symbols are marked properly: since the module is going to register the
> kprobes handler, that would be a perfect place to emit a warning,
> right?
>
> In fact, why don't kprobe handlers get added to the exclusion list
> explicitly, when the handler gets registered? With such an approach
> handlers are automatically nokprobe and don't need any annotation -
> which is a far more robust usage model.

Ah, I see. That is because there are some local functions called only
from the kprobe handlers. It is easy to blacklist the kprobe handlers
themselves, but not the functions which are only called from them. :(

So, I can add a patch which automatically adds handler functions to
the blacklist. But that is another story. I think this patch is also
required.
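
A minimal sketch of that situation (hypothetical names, not part of
this series): register_kprobe() could blacklist my_pre_handler
automatically, but helper_called_from_handler is not visible to it
and would still need its own annotation.
----
#include <linux/kprobes.h>

static void helper_called_from_handler(void)
{
	/* reached only via the probe handler below */
}
NOKPROBE_SYMBOL(helper_called_from_handler);

static int my_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
	helper_called_from_handler();
	return 0;
}
NOKPROBE_SYMBOL(my_pre_handler);

static struct kprobe my_probe = {
	.symbol_name	= "do_fork",	/* any probeable symbol */
	.pre_handler	= my_pre_handler,
};
----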

Thank you,

>
> So I'm skipping this patch and the next one that makes use of it.
>
> Thanks,
>
> Ingo
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-04-24 11:26:16, by Jiri Kosina

Subject: Re: [tip:perf/kprobes] kprobes, x86: Allow kprobes on text_poke/hw_breakpoint

On Thu, 24 Apr 2014, tip-bot for Masami Hiramatsu wrote:

> Commit-ID: 9c54b6164eeb292a0eac86c6913bd8daaff35e62
> Gitweb: http://git.kernel.org/tip/9c54b6164eeb292a0eac86c6913bd8daaff35e62
> Author: Masami Hiramatsu <[email protected]>
> AuthorDate: Thu, 17 Apr 2014 17:18:07 +0900
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Thu, 24 Apr 2014 10:03:02 +0200
>
> kprobes, x86: Allow kprobes on text_poke/hw_breakpoint
>
> Allow kprobes on text_poke/hw_breakpoint because
> those are not related to the critical int3-debug
> recursive path of kprobes at this moment.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reviewed-by: Steven Rostedt <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Fengguang Wu <[email protected]>
> Cc: Jiri Kosina <[email protected]>

Reviewed-by: Jiri Kosina <[email protected]>

> Cc: Oleg Nesterov <[email protected]>
> Cc: Paul Gortmaker <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> arch/x86/kernel/alternative.c | 3 +--
> arch/x86/kernel/hw_breakpoint.c | 5 ++---
> 2 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index df94598..703130f 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -5,7 +5,6 @@
> #include <linux/mutex.h>
> #include <linux/list.h>
> #include <linux/stringify.h>
> -#include <linux/kprobes.h>
> #include <linux/mm.h>
> #include <linux/vmalloc.h>
> #include <linux/memory.h>
> @@ -551,7 +550,7 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
> *
> * Note: Must be called under text_mutex.
> */
> -void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> +void *text_poke(void *addr, const void *opcode, size_t len)
> {
> unsigned long flags;
> char *vaddr;
> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> index a67b47c..5f9cf20 100644
> --- a/arch/x86/kernel/hw_breakpoint.c
> +++ b/arch/x86/kernel/hw_breakpoint.c
> @@ -32,7 +32,6 @@
> #include <linux/irqflags.h>
> #include <linux/notifier.h>
> #include <linux/kallsyms.h>
> -#include <linux/kprobes.h>
> #include <linux/percpu.h>
> #include <linux/kdebug.h>
> #include <linux/kernel.h>
> @@ -424,7 +423,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
> * NOTIFY_STOP returned for all other cases
> *
> */
> -static int __kprobes hw_breakpoint_handler(struct die_args *args)
> +static int hw_breakpoint_handler(struct die_args *args)
> {
> int i, cpu, rc = NOTIFY_STOP;
> struct perf_event *bp;
> @@ -511,7 +510,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
> /*
> * Handle debug exception notifications.
> */
> -int __kprobes hw_breakpoint_exceptions_notify(
> +int hw_breakpoint_exceptions_notify(
> struct notifier_block *unused, unsigned long val, void *data)
> {
> if (val != DIE_DEBUG)
>

--
Jiri Kosina
SUSE Labs

2014-04-24 11:27:00

by Jiri Kosina

Subject: Re: [tip:perf/kprobes] kprobes/x86: Call exception handlers directly from do_int3/do_debug

On Thu, 24 Apr 2014, tip-bot for Masami Hiramatsu wrote:

> Commit-ID: 6f6343f53d133bae516caf3d254bce37d8774625
> Gitweb: http://git.kernel.org/tip/6f6343f53d133bae516caf3d254bce37d8774625
> Author: Masami Hiramatsu <[email protected]>
> AuthorDate: Thu, 17 Apr 2014 17:17:33 +0900
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Thu, 24 Apr 2014 10:02:59 +0200
>
> kprobes/x86: Call exception handlers directly from do_int3/do_debug
>
> To avoid a kernel crash by probing on lockdep code, call
> kprobe_int3_handler() and kprobe_debug_handler()(which was
> formerly called post_kprobe_handler()) directly from
> do_int3 and do_debug.
>
> Currently kprobes uses notify_die() to hook the int3/debug
> exceptions. Since there is locking code in notify_die(),
> the lockdep code can be invoked. And because lockdep
> involves printk()-related things, theoretically we would need
> to prohibit probing on such code, which means a much longer
> blacklist. Instead, hooking int3/debug for kprobes before
> notify_die() avoids this problem.
>
> Anyway, most of the int3 handlers in the kernel are already
> called from do_int3 directly, e.g. ftrace_int3_handler,
> poke_int3_handler, kgdb_ll_trap. Actually only
> kprobe_exceptions_notify is on the notifier_call_chain.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reviewed-by: Steven Rostedt <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jiri Kosina <[email protected]>

Reviewed-by: Jiri Kosina <[email protected]>

> Cc: Jonathan Lebon <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: Seiji Aguchi <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> arch/x86/include/asm/kprobes.h | 2 ++
> arch/x86/kernel/kprobes/core.c | 24 +++---------------------
> arch/x86/kernel/traps.c | 10 ++++++++++
> 3 files changed, 15 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
> index 9454c16..53cdfb2 100644
> --- a/arch/x86/include/asm/kprobes.h
> +++ b/arch/x86/include/asm/kprobes.h
> @@ -116,4 +116,6 @@ struct kprobe_ctlblk {
> extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
> extern int kprobe_exceptions_notify(struct notifier_block *self,
> unsigned long val, void *data);
> +extern int kprobe_int3_handler(struct pt_regs *regs);
> +extern int kprobe_debug_handler(struct pt_regs *regs);
> #endif /* _ASM_X86_KPROBES_H */
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 7751b3d..9b80aec 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -559,7 +559,7 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
> * Interrupts are disabled on entry as trap3 is an interrupt gate and they
> * remain disabled throughout this function.
> */
> -static int __kprobes kprobe_handler(struct pt_regs *regs)
> +int __kprobes kprobe_int3_handler(struct pt_regs *regs)
> {
> kprobe_opcode_t *addr;
> struct kprobe *p;
> @@ -857,7 +857,7 @@ no_change:
> * Interrupts are disabled on entry as trap1 is an interrupt gate and they
> * remain disabled throughout this function.
> */
> -static int __kprobes post_kprobe_handler(struct pt_regs *regs)
> +int __kprobes kprobe_debug_handler(struct pt_regs *regs)
> {
> struct kprobe *cur = kprobe_running();
> struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> @@ -963,22 +963,7 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
> if (args->regs && user_mode_vm(args->regs))
> return ret;
>
> - switch (val) {
> - case DIE_INT3:
> - if (kprobe_handler(args->regs))
> - ret = NOTIFY_STOP;
> - break;
> - case DIE_DEBUG:
> - if (post_kprobe_handler(args->regs)) {
> - /*
> - * Reset the BS bit in dr6 (pointed by args->err) to
> - * denote completion of processing
> - */
> - (*(unsigned long *)ERR_PTR(args->err)) &= ~DR_STEP;
> - ret = NOTIFY_STOP;
> - }
> - break;
> - case DIE_GPF:
> + if (val == DIE_GPF) {
> /*
> * To be potentially processing a kprobe fault and to
> * trust the result from kprobe_running(), we have
> @@ -987,9 +972,6 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
> if (!preemptible() && kprobe_running() &&
> kprobe_fault_handler(args->regs, args->trapnr))
> ret = NOTIFY_STOP;
> - break;
> - default:
> - break;
> }
> return ret;
> }
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 57409f6..e5d4a70 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -334,6 +334,11 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
> goto exit;
> #endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
>
> +#ifdef CONFIG_KPROBES
> + if (kprobe_int3_handler(regs))
> + return;
> +#endif
> +
> if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
> SIGTRAP) == NOTIFY_STOP)
> goto exit;
> @@ -440,6 +445,11 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> /* Store the virtualized DR6 value */
> tsk->thread.debugreg6 = dr6;
>
> +#ifdef CONFIG_KPROBES
> + if (kprobe_debug_handler(regs))
> + goto exit;
> +#endif
> +
> if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
> SIGTRAP) == NOTIFY_STOP)
> goto exit;
>

--
Jiri Kosina
SUSE Labs

Subject: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits

(2014/04/24 18:01), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>> Introduce a kprobe cache to reduce cache misses for
>> massive numbers of kprobes.
>> For stress-testing kprobes, we need to activate as many
>> kprobes as possible. This situation causes a storm of cache
>> misses on the kprobe hash list. The kprobe hash list has
>> already been enlarged to 4k entries, and this is still small
>> for 40k kprobes.
>>
>> For example, when registering 40k probes on the hlist and
>> enabling 20k probes, perf tools still shows that a lot of
>> cache misses are in get_kprobe.
>> ----
>> Samples: 633 of event 'cache-misses', Event count (approx.): 3414776
>> + 68.13% [k] get_kprobe
>> + 4.38% [k] ftrace_lookup_ip
>> + 2.54% [k] kprobe_ftrace_handler
>> ----
>>
>> Also, I found that most of the kprobes are not hit.
>> In that case, to reduce cache misses, we can reduce the
>> random memory accesses by introducing a per-cpu cache which
>> caches the address of a frequently used kprobe data structure
>> together with its probe address.
>>
>> With kpcache enabled, get_kprobe_cached goes down to
>> around 4-5% of the cache misses with 20k probes.
>> ----
>> Samples: 729 of event 'cache-misses', Event count (approx.): 690125
>> + 14.49% [k] ftrace_lookup_ip
>> + 5.61% [k] kprobe_trace_func
>> + 5.17% [k] kprobe_ftrace_handler
>> + 4.62% [k] get_kprobe_cached
>> ----
>>
>> Of course this reduces the enabling time too.
>>
>> Without this fix (just enlarge hash table):
>> (2934 sec, 1 min intervals for each 2000 probes enabled)
>>
>> ----
>> Enabling trace events: start at 1393921862
>> 0 1393921864 a2mp_chan_alloc_skb_cb_38581
>> ...
>> 19999 1393924928 nfs4_open_confirm_done_11785
>> ----
>>
>> With this fix:
>> (2025 sec, 1 min intervals for each 2000 probes enabled)
>
> That's a nice speedup.

Thanks :)

>
> So I don't think this should be a Kconfig entry, just enable it
> unconditionally. That will further simplify the code.

Hmm, it consumes some amount of memory (36KB/core) just for the case of
several thousand kprobes. On enterprise servers and desktops it's OK, no
problem. But I think some embedded systems with small resources will not
want that. So, how about keeping the Kconfig option and enabling it by default?

Thank you,


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-04-25 08:20:40

by Ingo Molnar

Subject: Re: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module


* Masami Hiramatsu <[email protected]> wrote:

> (2014/04/24 17:56), Ingo Molnar wrote:
> >> diff --git a/include/linux/module.h b/include/linux/module.h
> >> index f520a76..2fdb673 100644
> >> --- a/include/linux/module.h
> >> +++ b/include/linux/module.h
> >> @@ -16,6 +16,7 @@
> >> #include <linux/kobject.h>
> >> #include <linux/moduleparam.h>
> >> #include <linux/jump_label.h>
> >> +#include <linux/kprobes.h>
> >
> > This include breaks the x86 build:
> >
> > CC arch/x86/kernel/jump_label.o
> > In file included from arch/x86/kernel/jump_label.c:14:0:
> > /fast/mingo/tip/arch/x86/include/asm/kprobes.h:35:12: error: conflicting types for ‘kprobe_opcode_t’ typedef u8 kprobe_opcode_t;
> > [...]
>
> Hmm, this error seems very odd... and I don't see

Needs 'make allnoconfig' or some similar .config combination.

> > But the #include kprobes.h is unnecessary to begin with, as no kprobe
> > specific types are used.
>
> OK, anyway I'll remove that.
>
> >
> >> #include <linux/export.h>
> >>
> >> #include <linux/percpu.h>
> >> @@ -357,6 +358,10 @@ struct module {
> >> unsigned int num_ftrace_callsites;
> >> unsigned long *ftrace_callsites;
> >> #endif
> >> +#ifdef CONFIG_KPROBES
> >> + unsigned int num_kprobe_blacklist;
> >> + unsigned long *kprobe_blacklist;
> >> +#endif
> >
> > There's a small coding style problem here.
> >
> > More importantly, I think more should be done to make sure that module
> > symbols are marked properly: since the module is going to register the
> > kprobes handler, that would be a perfect place to emit a warning,
> > right?
> >
> > In fact, why don't kprobe handlers get added to the exclusion list
> > explicitly, when the handler gets registered? With such an approach
> > handlers are automatically nokprobe and don't need any annotation -
> > which is a far more robust usage model.
>
> Ah, I see. That is because there are some local functions which are
> called only from the kprobe handlers. It is easy to blacklist the kprobe
> handlers themselves, but not the functions which are only called from
> them. :(
>
> So, I can add a patch which automatically adds the handler functions to
> the blacklist, but that is another story. I think this patch is also
> required.

Fair enough! I'd even argue to not do the auto-blacklisting I
suggested, to make it really apparent to module authors that
annotations are needed.

Thanks,

Ingo

2014-04-25 08:22:23

by Ingo Molnar

Subject: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits


* Masami Hiramatsu <[email protected]> wrote:

> > So I don't think this should be a Kconfig entry, just enable it
> > unconditionally. That will further simplify the code.
>
> Hmm, it consumes some amount of memory (36KB/core) just for the case
> of several thousand of kprobes. On enterprise servers and desktop
> it's OK, no problem. But I think, some embedded systems with small
> resources will not want that. [...]

They'll just disable kprobes in general.

Really, at this point complexity is our main concern.

Thanks,

Ingo

Subject: Re: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits

(2014/04/25 17:20), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>>> So I don't think this should be a Kconfig entry, just enable it
>>> unconditionally. That will further simplify the code.
>>
>> Hmm, it consumes some amount of memory (36KB/core) just for the case
>> of several thousand of kprobes. On enterprise servers and desktop
>> it's OK, no problem. But I think, some embedded systems with small
>> resources will not want that. [...]
>
> They'll just disable kprobes in general.

No, I'd like to provide kprobes (and dynamic events) to them (including
me) for debugging and dynamic monitoring, instead of having to modify the
code to add events to their kernel. To solve some specific issues, specific
events (not generic events) are required. Making local patches to add such
events is an option, but it increases the maintenance cost of rebasing.
As the maintainer, I'd rather pay the cost of keeping this Kconfig option
upstream than pay such an ugly local cost. :(

Anyway, since this option is not easy for beginners, I think it should be
defined under "if EXPERT" and enabled by default.

> Really, at this point complexity is our main concern.

Agreed about the complexity issue. However, even if we remove the Kconfig
option, we only save 6 lines of code and one #ifdef block.
Can that really solve the complexity problem?

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module

(2014/04/25 17:19), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>> (2014/04/24 17:56), Ingo Molnar wrote:
>>>> diff --git a/include/linux/module.h b/include/linux/module.h
>>>> index f520a76..2fdb673 100644
>>>> --- a/include/linux/module.h
>>>> +++ b/include/linux/module.h
>>>> @@ -16,6 +16,7 @@
>>>> #include <linux/kobject.h>
>>>> #include <linux/moduleparam.h>
>>>> #include <linux/jump_label.h>
>>>> +#include <linux/kprobes.h>
>>>
>>> This include breaks the x86 build:
>>>
>>> CC arch/x86/kernel/jump_label.o
>>> In file included from arch/x86/kernel/jump_label.c:14:0:
>>> /fast/mingo/tip/arch/x86/include/asm/kprobes.h:35:12: error: conflicting types for ‘kprobe_opcode_t’ typedef u8 kprobe_opcode_t;
>>> [...]
>>
>> Hmm, this error seems very odd... and I don't see
>
> Needs 'make allnoconfig' or some similar .config combination.

Ah, OK. I could reproduce it. And as far as I can see, this is a jump_label bug:
jump_label.c includes asm/kprobes.h directly, but that header is for internal use.
It should include linux/kprobes.h instead. Moreover, it doesn't need kprobes.h at all anymore.

I'll fix that by another patch.

Thank you,


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module

(2014/04/25 19:12), Masami Hiramatsu wrote:
> (2014/04/25 17:19), Ingo Molnar wrote:
>>
>> * Masami Hiramatsu <[email protected]> wrote:
>>
>>> (2014/04/24 17:56), Ingo Molnar wrote:
>>>>> diff --git a/include/linux/module.h b/include/linux/module.h
>>>>> index f520a76..2fdb673 100644
>>>>> --- a/include/linux/module.h
>>>>> +++ b/include/linux/module.h
>>>>> @@ -16,6 +16,7 @@
>>>>> #include <linux/kobject.h>
>>>>> #include <linux/moduleparam.h>
>>>>> #include <linux/jump_label.h>
>>>>> +#include <linux/kprobes.h>
>>>>
>>>> This include breaks the x86 build:
>>>>
>>>> CC arch/x86/kernel/jump_label.o
>>>> In file included from arch/x86/kernel/jump_label.c:14:0:
>>>> /fast/mingo/tip/arch/x86/include/asm/kprobes.h:35:12: error: conflicting types for ‘kprobe_opcode_t’ typedef u8 kprobe_opcode_t;
>>>> [...]
>>>
>>> Hmm, this error seems very odd... and I don't see
>>
>> Needs 'make allnoconfig' or some similar .config combination.
>
> Ah, OK. I could reproduce it. And as far as I can see, this is jump_label's bug.
> jump_label.c includes asm/kprobes.h directly, but that header is for internal use.
> It has to include linux/kprobes.h instead. Moreover, it doesn't need kprobes.h anymore.

Oops, arch/x86/kernel/ftrace.c also has the same problem. At this point, I'll just
remove the unneeded #include from module.h, which avoids both.

Thank you,


--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-04-26 07:12:21

by Ingo Molnar

Subject: Re: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits


* Masami Hiramatsu <[email protected]> wrote:

> (2014/04/25 17:20), Ingo Molnar wrote:
> >
> > * Masami Hiramatsu <[email protected]> wrote:
> >
> >>> So I don't think this should be a Kconfig entry, just enable it
> >>> unconditionally. That will further simplify the code.
> >>
> >> Hmm, it consumes some amount of memory (36KB/core) just for the
> >> case of several thousand of kprobes. On enterprise servers and
> >> desktop it's OK, no problem. But I think, some embedded systems
> >> with small resources will not want that. [...]
> >
> > They'll just disable kprobes in general.
>
> No, I'd like to provide kprobes (and dynamic events) to them
> (including me) for debugging and dynamic monitoring, instead of
> modifying code for adding events on their kernel. To solve some
> specific issues, specific events (not generic events) are required.
> Making local patches to add such events is an option, but it
> increases maintenance cost for rebasing. It is better to pay cost to
> maintain this kconfig on upstream as the maintainer for me instead
> of paying such ugly local cost. :(
>
> Anyway, this option is not easy for beginners, I think it should be
> defined with "if EXPERT" option and make it enabled by default.
>
> > Really, at this point complexity is our main concern.
>
> Agreed about complexity issue. However, even if we remove the
> Kconfig, we can just save 6 lines of the code, and one #ifdef block.
> Can that really solve the complexity problem?

It's more about the mental picture of how kprobes works. The fewer
binary state flags, the better.

Thanks,

Ingo

Subject: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits

(2014/04/26 16:12), Ingo Molnar wrote:
>
> * Masami Hiramatsu <[email protected]> wrote:
>
>> (2014/04/25 17:20), Ingo Molnar wrote:
>>>
>>> * Masami Hiramatsu <[email protected]> wrote:
>>>
>>>>> So I don't think this should be a Kconfig entry, just enable it
>>>>> unconditionally. That will further simplify the code.
>>>>
>>>> Hmm, it consumes some amount of memory (36KB/core) just for the
>>>> case of several thousand of kprobes. On enterprise servers and
>>>> desktop it's OK, no problem. But I think, some embedded systems
>>>> with small resources will not want that. [...]
>>>
>>> They'll just disable kprobes in general.
>>
>> No, I'd like to provide kprobes (and dynamic events) to them
>> (including me) for debugging and dynamic monitoring, instead of
>> modifying code for adding events on their kernel. To solve some
>> specific issues, specific events (not generic events) are required.
>> Making local patches to add such events is an option, but it
>> increases maintenance cost for rebasing. It is better to pay cost to
>> maintain this kconfig on upstream as the maintainer for me instead
>> of paying such ugly local cost. :(
>>
>> Anyway, this option is not easy for beginners, I think it should be
>> defined with "if EXPERT" option and make it enabled by default.
>>
>>> Really, at this point complexity is our main concern.
>>
>> Agreed about complexity issue. However, even if we remove the
>> Kconfig, we can just save 6 lines of the code, and one #ifdef block.
>> Can that really solve the complexity problem?
>
> It's more about the mental picture about how kprobes works. The fewer
> binary state flags, the better.

OK, so the amount of code changed is not the point.

Hmm, at this point I'll update it to remove the Kconfig option. I'd also like
to add a note that this makes kprobes bigger: without kpcache the kprobe
hash table is just 4KB, but with kpcache it consumes 4KB + 36KB/core
(10 times bigger on a single core).
Of course, that is still small enough for many systems which have 256MB or
more of DRAM.

And I'll give smaller systems (such as controller/sensor boards for IoT, etc.)
an out-of-tree Kconfig patch to disable kpcache if they need it... That's
better than maintaining additional-event patches.
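
For reference, a rough sketch of what such a per-cpu front-end cache can
look like (illustrative only: the entry layout, table size and function
names below are my assumptions rather than the actual patch, and
invalidation of cached entries on unregister/disable is omitted):
----
#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/percpu.h>
#include <linux/hash.h>

/* One small direct-mapped cache per CPU, sitting in front of the
 * global kprobe hash table. */
struct kpcache_ent {
	unsigned long	addr;	/* probed address */
	struct kprobe	*kp;	/* cached result of get_kprobe(addr) */
};

#define KPCACHE_BITS	6
#define KPCACHE_SIZE	(1 << KPCACHE_BITS)

static DEFINE_PER_CPU(struct kpcache_ent, kpcache[KPCACHE_SIZE]);

static struct kprobe *get_kprobe_cached_sketch(unsigned long addr)
{
	struct kpcache_ent *ent;
	struct kprobe *kp;

	ent = this_cpu_ptr(&kpcache[hash_long(addr, KPCACHE_BITS)]);
	if (likely(ent->addr == addr))
		return ent->kp;			/* hot path: one per-cpu entry */

	kp = get_kprobe((void *)addr);		/* slow path: global hash list */
	if (kp) {
		ent->addr = addr;
		ent->kp = kp;
	}
	return kp;
}
NOKPROBE_SYMBOL(get_kprobe_cached_sketch);
----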

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-06-13 17:14:08

by Frederic Weisbecker

Subject: Re: [tip:perf/kprobes] kprobes, x86: Call exception_enter after kprobes handled

Hi Masami,

2014-04-24 12:59 GMT+02:00 tip-bot for Masami Hiramatsu <[email protected]>:
> Commit-ID: ecd50f714c421c759354632dd00f70c718c95b10
> Gitweb: http://git.kernel.org/tip/ecd50f714c421c759354632dd00f70c718c95b10
> Author: Masami Hiramatsu <[email protected]>
> AuthorDate: Thu, 17 Apr 2014 17:17:40 +0900
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Thu, 24 Apr 2014 10:03:00 +0200
>
> kprobes, x86: Call exception_enter after kprobes handled
>
> Move exception_enter() call after kprobes handler
> is done. Since the exception_enter() involves
> many other functions (like printk), it can cause
> recursive int3/break loop when kprobes probe such
> functions.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reviewed-by: Steven Rostedt <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jiri Kosina <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: Seiji Aguchi <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>

This patch results in exception_enter/exception_exit imbalances:

arch/x86/kernel/traps.c: In function ‘do_debug’:
include/linux/context_tracking.h:46:6: warning: ‘prev_state’ may be
used uninitialized in this function [-Wmaybe-uninitialized]
if (prev_ctx == IN_USER)
^
arch/x86/kernel/traps.c:431:17: note: ‘prev_state’ was declared here
enum ctx_state prev_state;

An obvious solution would be to change all the goto exit statements before
exception_enter() into returns from do_debug(). But if there are any users
of RCU before exception_enter(), this won't work. Does
kprobe_debug_handler() use RCU read-side critical sections? I'm also
worried about kmemcheck...

> ---
> arch/x86/kernel/traps.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index e5d4a70..ba9abe9 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -327,7 +327,6 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
> if (poke_int3_handler(regs))
> return;
>
> - prev_state = exception_enter();
> #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
> if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
> SIGTRAP) == NOTIFY_STOP)
> @@ -338,6 +337,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
> if (kprobe_int3_handler(regs))
> return;
> #endif
> + prev_state = exception_enter();
>
> if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
> SIGTRAP) == NOTIFY_STOP)
> @@ -415,8 +415,6 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> unsigned long dr6;
> int si_code;
>
> - prev_state = exception_enter();
> -
> get_debugreg(dr6, 6);
>
> /* Filter out all the reserved bits which are preset to 1 */
> @@ -449,6 +447,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> if (kprobe_debug_handler(regs))
> goto exit;
> #endif
> + prev_state = exception_enter();
>
> if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
> SIGTRAP) == NOTIFY_STOP)

Subject: Re: Re: [tip:perf/kprobes] kprobes, x86: Call exception_enter after kprobes handled

Hi Frederic,

(2014/06/14 2:14), Frederic Weisbecker wrote:
> Hi Masami,
>
> 2014-04-24 12:59 GMT+02:00 tip-bot for Masami Hiramatsu <[email protected]>:
>> Commit-ID: ecd50f714c421c759354632dd00f70c718c95b10
>> Gitweb: http://git.kernel.org/tip/ecd50f714c421c759354632dd00f70c718c95b10
>> Author: Masami Hiramatsu <[email protected]>
>> AuthorDate: Thu, 17 Apr 2014 17:17:40 +0900
>> Committer: Ingo Molnar <[email protected]>
>> CommitDate: Thu, 24 Apr 2014 10:03:00 +0200
>>
>> kprobes, x86: Call exception_enter after kprobes handled
>>
>> Move exception_enter() call after kprobes handler
>> is done. Since the exception_enter() involves
>> many other functions (like printk), it can cause
>> recursive int3/break loop when kprobes probe such
>> functions.
>>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>> Reviewed-by: Steven Rostedt <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Borislav Petkov <[email protected]>
>> Cc: Jiri Kosina <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Rusty Russell <[email protected]>
>> Cc: Seiji Aguchi <[email protected]>
>> Link: http://lkml.kernel.org/r/[email protected]
>> Signed-off-by: Ingo Molnar <[email protected]>
>
> This patch results in exception_enter/exception_exit imbalances:
>
> arch/x86/kernel/traps.c: In function ‘do_debug’:
> include/linux/context_tracking.h:46:6: warning: ‘prev_state’ may be
> used uninitialized in this function [-Wmaybe-uninitialized]
> if (prev_ctx == IN_USER)
> ^
> arch/x86/kernel/traps.c:431:17: note: ‘prev_state’ was declared here
> enum ctx_state prev_state;

Oops, obviously there are bugs...

> An obvious solution would be to change all the goto exit statements before
> exception_enter() into returns from do_debug(). But if there are any users
> of RCU before exception_enter(), this won't work. Does
> kprobe_debug_handler() use RCU read-side critical sections? I'm also
> worried about kmemcheck...

Having checked the code again, it is enough to revert this patch and
to add context_track_user_*() to the kprobe blacklist, since those functions
check in_interrupt() at the entry and return immediately. It seems
we have no problem there. I think that was my fault. :(

I'll send a bugfix, thank you!

>
>> ---
>> arch/x86/kernel/traps.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>> index e5d4a70..ba9abe9 100644
>> --- a/arch/x86/kernel/traps.c
>> +++ b/arch/x86/kernel/traps.c
>> @@ -327,7 +327,6 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
>> if (poke_int3_handler(regs))
>> return;
>>
>> - prev_state = exception_enter();
>> #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
>> if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
>> SIGTRAP) == NOTIFY_STOP)
>> @@ -338,6 +337,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
>> if (kprobe_int3_handler(regs))
>> return;
>> #endif
>> + prev_state = exception_enter();
>>
>> if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
>> SIGTRAP) == NOTIFY_STOP)
>> @@ -415,8 +415,6 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
>> unsigned long dr6;
>> int si_code;
>>
>> - prev_state = exception_enter();
>> -
>> get_debugreg(dr6, 6);
>>
>> /* Filter out all the reserved bits which are preset to 1 */
>> @@ -449,6 +447,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
>> if (kprobe_debug_handler(regs))
>> goto exit;
>> #endif
>> + prev_state = exception_enter();
>>
>> if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
>> SIGTRAP) == NOTIFY_STOP)


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: [PATCH -tip ] [Bugfix] x86/kprobes: Fix build errors and blacklist context_track_user

This reverts commit ecd50f714c421c759354632dd00f70c718c95b10,
since it causes build errors with CONFIG_CONTEXT_TRACKING and
it was made based on a misunderstanding; context_track_user_*()
don't do much in interrupt context, they just return
if in_interrupt() is true.

Instead of changing do_debug/do_int3(), this just adds
context_track_user_*() to the kprobes blacklist, since those can
still be called right before kprobes handles the int3 and debug
exceptions, and probing them would cause an infinite loop.

Signed-off-by: Masami Hiramatsu <[email protected]>
Reported-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Seiji Aguchi <[email protected]>
---
arch/x86/kernel/traps.c | 7 ++++---
kernel/context_tracking.c | 3 +++
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c6eb418..0d0e922 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -343,6 +343,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
if (poke_int3_handler(regs))
return;

+ prev_state = exception_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -351,9 +352,8 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)

#ifdef CONFIG_KPROBES
if (kprobe_int3_handler(regs))
- return;
+ goto exit;
#endif
- prev_state = exception_enter();

if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -433,6 +433,8 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;

+ prev_state = exception_enter();
+
get_debugreg(dr6, 6);

/* Filter out all the reserved bits which are preset to 1 */
@@ -465,7 +467,6 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
if (kprobe_debug_handler(regs))
goto exit;
#endif
- prev_state = exception_enter();

if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 019d450..5664985 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -19,6 +19,7 @@
#include <linux/sched.h>
#include <linux/hardirq.h>
#include <linux/export.h>
+#include <linux/kprobes.h>

#define CREATE_TRACE_POINTS
#include <trace/events/context_tracking.h>
@@ -104,6 +105,7 @@ void context_tracking_user_enter(void)
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(context_tracking_user_enter);

#ifdef CONFIG_PREEMPT
/**
@@ -181,6 +183,7 @@ void context_tracking_user_exit(void)
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(context_tracking_user_exit);

/**
* __context_tracking_task_switch - context switch the syscall callbacks

Subject: [tip:perf/urgent] x86/kprobes: Fix build errors and blacklist context_track_user

Commit-ID: 4cdf77a828b056258f48a9f6078bd2f77d9704bb
Gitweb: http://git.kernel.org/tip/4cdf77a828b056258f48a9f6078bd2f77d9704bb
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Sat, 14 Jun 2014 06:47:12 +0000
Committer: Ingo Molnar <[email protected]>
CommitDate: Sat, 14 Jun 2014 09:07:44 +0200

x86/kprobes: Fix build errors and blacklist context_track_user

This essentially reverts commit:

ecd50f714c42 ("kprobes, x86: Call exception_enter after kprobes handled")

since it causes build errors with CONFIG_CONTEXT_TRACKING and
it was made based on a misunderstanding;
context_track_user_*() don't do much in interrupt context,
they just return if in_interrupt() is true.

Instead of changing do_debug/do_int3(), this just adds
context_track_user_*() to the kprobes blacklist, since those can
still be called right before kprobes handles the int3 and debug
exceptions, and probing them would cause an infinite loop.

Reported-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Kees Cook <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/traps.c | 7 ++++---
kernel/context_tracking.c | 3 +++
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c6eb418..0d0e922 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -343,6 +343,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
if (poke_int3_handler(regs))
return;

+ prev_state = exception_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -351,9 +352,8 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)

#ifdef CONFIG_KPROBES
if (kprobe_int3_handler(regs))
- return;
+ goto exit;
#endif
- prev_state = exception_enter();

if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -433,6 +433,8 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;

+ prev_state = exception_enter();
+
get_debugreg(dr6, 6);

/* Filter out all the reserved bits which are preset to 1 */
@@ -465,7 +467,6 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
if (kprobe_debug_handler(regs))
goto exit;
#endif
- prev_state = exception_enter();

if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 019d450..5664985 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -19,6 +19,7 @@
#include <linux/sched.h>
#include <linux/hardirq.h>
#include <linux/export.h>
+#include <linux/kprobes.h>

#define CREATE_TRACE_POINTS
#include <trace/events/context_tracking.h>
@@ -104,6 +105,7 @@ void context_tracking_user_enter(void)
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(context_tracking_user_enter);

#ifdef CONFIG_PREEMPT
/**
@@ -181,6 +183,7 @@ void context_tracking_user_exit(void)
}
local_irq_restore(flags);
}
+NOKPROBE_SYMBOL(context_tracking_user_exit);

/**
* __context_tracking_task_switch - context switch the syscall callbacks

2014-06-16 15:52:20

by Frederic Weisbecker

Subject: Re: [PATCH -tip ] [Bugfix] x86/kprobes: Fix build errors and blacklist context_track_user

On Sat, Jun 14, 2014 at 06:47:12AM +0000, Masami Hiramatsu wrote:
> This reverts commit ecd50f714c421c759354632dd00f70c718c95b10
> since it causes build errors with CONFIG_CONTEXT_TRACKING and
> that has been made from misunderstandings; context_track_user_*()
> don't involve much in interrupt context, it just returns
> if in_interrupt() is true.
>
> Instead of changing the do_debug/int3(), this just adds
> context_track_user_*() to kprobes blacklist, since those are
> still can be called right before kprobes handles int3 and debug
> exceptions, and probing those will cause infinit loop.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> Reported-by: Frederic Weisbecker <[email protected]>

Thanks a lot!

Acked-by: Frederic Weisbecker <[email protected]>