From: Masami Hiramatsu
Subject: [PATCH -tip v6 10/11] [RFC] x86: Introduce generic jump patching without stop_machine
To: Frederic Weisbecker, Ingo Molnar, Ananth N Mavinakayanahalli, lkml
Cc: systemtap, DLE, Masami Hiramatsu, Ananth N Mavinakayanahalli, Ingo Molnar,
    Jim Keniston, Srikar Dronamraju, Christoph Hellwig, Steven Rostedt,
    Frederic Weisbecker, "H. Peter Anvin", Anders Kaseorg, Tim Abbott,
    Andi Kleen, Jason Baron, Mathieu Desnoyers
Date: Wed, 25 Nov 2009 11:56:26 -0500
Message-ID: <20091125165626.6073.78864.stgit@harusame>
In-Reply-To: <20091125165510.6073.48721.stgit@harusame>
References: <20091125165510.6073.48721.stgit@harusame>
User-Agent: StGIT/0.14.3

Add text_poke_fixup(), which takes a fixup address to which a processor
jumps if it hits the address being modified while the modification is in
progress. text_poke_fixup() performs the following steps:

 1. Set up an int3 handler for the fixup.
 2. Put a breakpoint (int3) on the first byte of the region being
    modified, and synchronize code on all CPUs.
 3. Modify the remaining bytes of the region, and synchronize code on
    all CPUs.
 4. Modify the first byte of the region, and synchronize code on all
    CPUs.
 5. Clear the int3 handler.

Thus, if another processor executes the address being modified during
steps 2 to 4, it jumps to the fixup code.
This still has many limitations when modifying multiple instructions at
once. However, it is sufficient for patching a 5-byte NOP with a jump,
because:

 - The replaced instruction is a single instruction, which is executed
   atomically.
 - The replacing instruction is a jump, so we can set the fixup address
   to the jump target.

Changes in v6:
 - Use int3 even if len == 1 (the size of int3).

Changes in v5:
 - Add some comments.
 - Use smp_wmb()/smp_rmb().
 - Remove an unneeded sync_core_all().

Signed-off-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Ingo Molnar
Cc: Jim Keniston
Cc: Srikar Dronamraju
Cc: Christoph Hellwig
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: H. Peter Anvin
Cc: Anders Kaseorg
Cc: Tim Abbott
Cc: Andi Kleen
Cc: Jason Baron
Cc: Mathieu Desnoyers
---
 arch/x86/include/asm/alternative.h |   11 ++++
 arch/x86/kernel/alternative.c      |  102 ++++++++++++++++++++++++++++++++++++
 kernel/kprobes.c                   |    2 -
 3 files changed, 114 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index a0e21ad..ed87711 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -162,4 +162,15 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
 extern void *text_poke(void *addr, const void *opcode, size_t len);
 extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
 
+/*
+ * Set up an int3 trap and fixup execution for the SMP cross-modifying
+ * case. If another cpu executes the instruction being modified, it will
+ * hit int3 and go to the fixup code. This provides only a minimal safety
+ * check. Additional checks/restrictions are required for completely safe
+ * cross-modifying.
+ */
+extern void *text_poke_fixup(void *addr, const void *opcode, size_t len,
+			     void *fixup);
+extern void sync_core_all(void);
+
 #endif /* _ASM_X86_ALTERNATIVE_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7ce45d7..3117142 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include <linux/kdebug.h>
 #include
 #include
 #include
@@ -612,3 +613,104 @@ void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
 	return addr;
 }
+
+/*
+ * On the Pentium series, unsynchronized cross-modifying code
+ * operations can cause unexpected instruction execution results.
+ * So after the code is modified, we should synchronize it on each
+ * processor.
+ */
+static void __kprobes __local_sync_core(void *info)
+{
+	sync_core();
+}
+
+void __kprobes sync_core_all(void)
+{
+	on_each_cpu(__local_sync_core, NULL, 1);
+}
+
+/* Safe cross-code modifying with a fixup address */
+static void *patch_fixup_from;
+static void *patch_fixup_addr;
+
+static int __kprobes patch_exceptions_notify(struct notifier_block *self,
+					     unsigned long val, void *data)
+{
+	struct die_args *args = data;
+	struct pt_regs *regs = args->regs;
+
+	smp_rmb();
+
+	if (likely(!patch_fixup_from))
+		return NOTIFY_DONE;
+
+	if (val != DIE_INT3 || !regs || user_mode_vm(regs) ||
+	    (unsigned long)patch_fixup_from != regs->ip)
+		return NOTIFY_DONE;
+
+	args->regs->ip = (unsigned long)patch_fixup_addr;
+
+	return NOTIFY_STOP;
+}
+
+/**
+ * text_poke_fixup() -- cross-modify kernel text with a fixup address.
+ * @addr:	Address being modified.
+ * @opcode:	New instruction.
+ * @len:	Length of the modified bytes.
+ * @fixup:	Fixup address.
+ *
+ * Note: You must back up the replaced instructions before calling this
+ * if you need to recover them.
+ * Note: Must be called under text_mutex.
+ */
+void *__kprobes text_poke_fixup(void *addr, const void *opcode, size_t len,
+				void *fixup)
+{
+	static const unsigned char int3_insn = BREAKPOINT_INSTRUCTION;
+	static const int int3_size = sizeof(int3_insn);
+
+	/* Prepare the fixup address */
+	patch_fixup_addr = fixup;
+	patch_fixup_from = (u8 *)addr + int3_size; /* IP address after int3 */
+	smp_wmb();
+
+	/* Cap with an int3 - expected to complete synchronously */
+	text_poke(addr, &int3_insn, int3_size);
+
+	if (len - int3_size > 0) {
+		/* Replace the tail bytes */
+		text_poke((char *)addr + int3_size,
+			  (const char *)opcode + int3_size,
+			  len - int3_size);
+		/* Synchronize the code cache */
+		sync_core_all();
+	}
+
+	/* Replace int3 with the head byte - expected to complete synchronously */
+	text_poke(addr, opcode, int3_size);
+
+	/*
+	 * Sync cores again - this waits for the disabled-IRQ code to reach
+	 * a quiescent state, IOW, for all running int3 fixup handlers
+	 * to finish.
+	 */
+	sync_core_all();
+
+	/* Clean up the fixup address */
+	patch_fixup_from = NULL;
+	smp_wmb();
+
+	return addr;
+}
+
+static struct notifier_block patch_exceptions_nb = {
+	.notifier_call = patch_exceptions_notify,
+	.priority = 0x7fffffff /* we need to be notified first */
+};
+
+static int __init patch_init(void)
+{
+	return register_die_notifier(&patch_exceptions_nb);
+}
+
+arch_initcall(patch_init);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d88f4c1..a58c6fa 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1334,7 +1334,7 @@ EXPORT_SYMBOL_GPL(unregister_kprobes);
 
 static struct notifier_block kprobe_exceptions_nb = {
 	.notifier_call = kprobe_exceptions_notify,
-	.priority = 0x7fffffff /* we need to be notified first */
+	.priority = 0x7ffffff0 /* High priority, but not first. */
 };
 
 unsigned long __weak arch_deref_entry_point(void *entry)

-- 
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division
e-mail: mhiramat@redhat.com