From: Ananth N Mavinakayanahalli <[email protected]>
On RISC architectures like powerpc, instructions are fixed size.
Instruction analysis on such platforms is just a matter of (insn % 4).
Pass the vaddr at which the uprobe is to be inserted so that
arch_uprobe_analyze_insn() can flag misaligned registration requests.
Signed-off-by: Ananth N Mavinakaynahalli <[email protected]>
---
arch/x86/include/asm/uprobes.h | 2 +-
arch/x86/kernel/uprobes.c | 3 ++-
kernel/events/uprobes.c | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
Index: uprobes-24may/arch/x86/include/asm/uprobes.h
===================================================================
--- uprobes-24may.orig/arch/x86/include/asm/uprobes.h
+++ uprobes-24may/arch/x86/include/asm/uprobes.h
@@ -48,7 +48,7 @@ struct arch_uprobe_task {
#endif
};
-extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm);
+extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, loff_t vaddr);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
Index: uprobes-24may/arch/x86/kernel/uprobes.c
===================================================================
--- uprobes-24may.orig/arch/x86/kernel/uprobes.c
+++ uprobes-24may/arch/x86/kernel/uprobes.c
@@ -409,9 +409,10 @@ static int validate_insn_bits(struct arc
* arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
* @mm: the probed address space.
* @arch_uprobe: the probepoint information.
+ * @vaddr: virtual address at which to install the probepoint
* Return 0 on success or a -ve number on error.
*/
-int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm)
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
{
int ret;
struct insn insn;
Index: uprobes-24may/kernel/events/uprobes.c
===================================================================
--- uprobes-24may.orig/kernel/events/uprobes.c
+++ uprobes-24may/kernel/events/uprobes.c
@@ -697,7 +697,7 @@ install_breakpoint(struct uprobe *uprobe
if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
return -EEXIST;
- ret = arch_uprobe_analyze_insn(&uprobe->arch, mm);
+ ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, vaddr);
if (ret)
return ret;
From: Ananth N Mavinakayanahalli <[email protected]>
This is the port of uprobes to powerpc. Usage is similar to x86.
One TODO in this port compared to x86 is the uprobe abort_xol() logic.
x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine
if a signal was caused when the uprobed instruction was single-stepped/
emulated, in which case, we reset the instruction pointer to the probed
address and retry the probe again.
[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
probe_libc:malloc (on 0xb4860)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc -aR sleep 1
[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
# ========
# captured on: Mon Jun 4 05:26:31 2012
# hostname : xxxx.ibm.com
# os release : 3.4.0-uprobe
# perf version : 3.4.0
# arch : ppc64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : POWER6 (raw), altivec supported
# cpuid : 62,769
# total memory : 7310528 kB
# cmdline : /root/bin/perf record -e probe_libc:malloc -aR sleep 20
# event : name = probe_libc:malloc, type = 2, config = 0x124, config1 = 0x0, con
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
# Samples: 83K of event 'probe_libc:malloc'
# Event count (approx.): 83484
#
# Overhead Command Shared Object Symbol
# ........ ............ ............. ..........
#
69.05% tar libc-2.12.so [.] malloc
28.57% rm libc-2.12.so [.] malloc
1.32% avahi-daemon libc-2.12.so [.] malloc
0.58% bash libc-2.12.so [.] malloc
0.28% sshd libc-2.12.so [.] malloc
0.08% irqbalance libc-2.12.so [.] malloc
0.05% bzip2 libc-2.12.so [.] malloc
0.04% sleep libc-2.12.so [.] malloc
0.03% multipathd libc-2.12.so [.] malloc
0.01% sendmail libc-2.12.so [.] malloc
0.01% automount libc-2.12.so [.] malloc
Signed-off-by: Ananth N Mavinakayanahalli <[email protected]>
Index: linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/include/asm/thread_info.h 2012-06-03 06:59:26.000000000 +0530
+++ linux-3.5-rc1/arch/powerpc/include/asm/thread_info.h 2012-06-03 21:05:48.226233001 +0530
@@ -96,6 +96,7 @@
#define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */
#define TIF_NOERROR 12 /* Force successful syscall return */
#define TIF_NOTIFY_RESUME 13 /* callback before returning to user */
+#define TIF_UPROBE 14 /* breakpointed or single-stepping */
#define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */
/* as above, but as bit values */
@@ -112,12 +113,13 @@
#define _TIF_RESTOREALL (1<<TIF_RESTOREALL)
#define _TIF_NOERROR (1<<TIF_NOERROR)
#define _TIF_NOTIFY_RESUME (1<<TIF_NOTIFY_RESUME)
+#define _TIF_UPROBE (1<<TIF_UPROBE)
#define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT)
#define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
- _TIF_NOTIFY_RESUME)
+ _TIF_NOTIFY_RESUME | _TIF_UPROBE)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
/* Bits in local_flags */
Index: linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/include/asm/uprobes.h 2012-05-31 02:23:15.746233001 +0530
+++ linux-3.5-rc1/arch/powerpc/include/asm/uprobes.h 2012-06-03 21:05:48.226233001 +0530
@@ -0,0 +1,49 @@
+#ifndef _ASM_UPROBES_H
+#define _ASM_UPROBES_H
+/*
+ * User-space Probes (UProbes) for powerpc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2007-2012
+ *
+ * Adapted from the x86 port by Ananth N Mavinakayanahalli <[email protected]>
+ */
+
+#include <linux/notifier.h>
+
+typedef unsigned int uprobe_opcode_t;
+
+#define MAX_UINSN_BYTES 4
+#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES)
+
+#define UPROBE_SWBP_INSN 0x7fe00008
+#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */
+
+struct arch_uprobe {
+ u8 insn[MAX_UINSN_BYTES];
+};
+
+struct arch_uprobe_task {
+};
+
+extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, loff_t vaddr);
+extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs);
+extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
+extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
+extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+#endif /* _ASM_UPROBES_H */
Index: linux-3.5-rc1/arch/powerpc/kernel/Makefile
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/kernel/Makefile 2012-06-03 06:59:26.000000000 +0530
+++ linux-3.5-rc1/arch/powerpc/kernel/Makefile 2012-06-03 21:05:48.226233001 +0530
@@ -96,6 +96,7 @@
obj-$(CONFIG_BOOTX_TEXT) += btext.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_UPROBES) += uprobes.o
obj-$(CONFIG_PPC_UDBG_16550) += legacy_serial.o udbg_16550.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o
Index: linux-3.5-rc1/arch/powerpc/kernel/signal.c
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/kernel/signal.c 2012-06-03 06:59:26.000000000 +0530
+++ linux-3.5-rc1/arch/powerpc/kernel/signal.c 2012-06-03 21:05:48.236233000 +0530
@@ -11,6 +11,7 @@
#include <linux/tracehook.h>
#include <linux/signal.h>
+#include <linux/uprobes.h>
#include <linux/key.h>
#include <asm/hw_breakpoint.h>
#include <asm/uaccess.h>
@@ -157,6 +158,11 @@
void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
{
+ if (thread_info_flags & _TIF_UPROBE) {
+ clear_thread_flag(TIF_UPROBE);
+ uprobe_notify_resume(regs);
+ }
+
if (thread_info_flags & _TIF_SIGPENDING)
do_signal(regs);
Index: linux-3.5-rc1/arch/powerpc/kernel/uprobes.c
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/kernel/uprobes.c 2012-05-31 02:23:15.746233001 +0530
+++ linux-3.5-rc1/arch/powerpc/kernel/uprobes.c 2012-06-03 21:05:48.236233000 +0530
@@ -0,0 +1,163 @@
+/*
+ * User-space Probes (UProbes) for powerpc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2007-2012
+ *
+ * Adapted from the x86 port by Ananth N Mavinakayanahalli <[email protected]>
+ */
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/ptrace.h>
+#include <linux/uprobes.h>
+#include <linux/uaccess.h>
+
+#include <linux/kdebug.h>
+#include <asm/sstep.h>
+
+/**
+ * arch_uprobe_analyze_insn
+ * @mm: the probed address space.
+ * @arch_uprobe: the probepoint information.
+ * Return 0 on success or a -ve number on error.
+ */
+int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
+{
+ if (vaddr & 0x03)
+ return -EINVAL;
+ return 0;
+}
+
+/*
+ * arch_uprobe_pre_xol - prepare to execute out of line.
+ * @auprobe: the probepoint information.
+ * @regs: reflects the saved user state of current task.
+ */
+int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+ /* FIXME: We don't support abort_xol on powerpc for now */
+ regs->nip = current->utask->xol_vaddr;
+ return 0;
+}
+
+/**
+ * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs
+ * @regs: Reflects the saved state of the task after it has hit a breakpoint
+ * instruction.
+ * Return the address of the breakpoint instruction.
+ */
+unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
+{
+ return instruction_pointer(regs);
+}
+
+/*
+ * If xol insn itself traps and generates a signal (SIGILL/SIGSEGV/etc),
+ * then detect the case where a singlestepped instruction jumps back to its
+ * own address. It is assumed that anything like do_page_fault/do_trap/etc
+ * sets thread.trap_nr != -1.
+ *
+ * FIXME: powerpc however doesn't have thread.trap_nr yet.
+ *
+ * arch_uprobe_pre_xol/arch_uprobe_post_xol save/restore thread.trap_nr,
+ * arch_uprobe_xol_was_trapped() simply checks that ->trap_nr is not equal to
+ * UPROBE_TRAP_NR == -1 set by arch_uprobe_pre_xol().
+ */
+bool arch_uprobe_xol_was_trapped(struct task_struct *t)
+{
+ /* FIXME: We don't support abort_xol on powerpc for now */
+ return false;
+}
+
+/*
+ * Called after single-stepping. To avoid the SMP problems that can
+ * occur when we temporarily put back the original opcode to
+ * single-step, we single-stepped a copy of the instruction.
+ *
+ * This function prepares to resume execution after the single-step.
+ */
+int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+ /* FIXME: We don't support abort_xol on powerpc for now */
+
+ /*
+ * On powerpc, except for loads and stores, most instructions
+ * including ones that alter code flow (branches, calls, returns)
+ * are emulated in the kernel. We get here only if the emulation
+ * support doesn't exist and have to fix-up the next instruction
+ * to be executed.
+ */
+ regs->nip = current->utask->vaddr + MAX_UINSN_BYTES;
+ return 0;
+}
+
+/* callback routine for handling exceptions. */
+int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data)
+{
+ struct die_args *args = data;
+ struct pt_regs *regs = args->regs;
+ int ret = NOTIFY_DONE;
+
+ /* We are only interested in userspace traps */
+ if (regs && !user_mode(regs))
+ return NOTIFY_DONE;
+
+ switch (val) {
+ case DIE_BPT:
+ if (uprobe_pre_sstep_notifier(regs))
+ ret = NOTIFY_STOP;
+ break;
+ case DIE_SSTEP:
+ if (uprobe_post_sstep_notifier(regs))
+ ret = NOTIFY_STOP;
+ default:
+ break;
+ }
+ return ret;
+}
+
+/*
+ * This function gets called when XOL instruction either gets trapped or
+ * the thread has a fatal signal, so reset the instruction pointer to its
+ * probed address.
+ */
+void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+ /* FIXME: We don't support abort_xol on powerpc for now */
+ return;
+}
+
+/*
+ * See if the instruction can be emulated.
+ * Returns true if instruction was emulated, false otherwise.
+ */
+bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+ int ret;
+ unsigned int insn;
+
+ memcpy(&insn, auprobe->insn, MAX_UINSN_BYTES);
+
+ /*
+ * emulate_step() returns 1 if the insn was successfully emulated.
+ * For all other cases, we need to single-step in hardware.
+ */
+ ret = emulate_step(regs, insn);
+ if (ret > 0)
+ return true;
+
+ return false;
+}
Index: linux-3.5-rc1/arch/powerpc/Kconfig
===================================================================
--- linux-3.5-rc1.orig/arch/powerpc/Kconfig 2012-06-03 06:59:26.000000000 +0530
+++ linux-3.5-rc1/arch/powerpc/Kconfig 2012-06-03 21:05:48.246233000 +0530
@@ -237,6 +237,9 @@
config ARCH_SUPPORTS_DEBUG_PAGEALLOC
def_bool y
+config ARCH_SUPPORTS_UPROBES
+ def_bool y
+
config PPC_ADV_DEBUG_REGS
bool
depends on 40x || BOOKE
On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
Don't we traditionally use unsigned long to pass vaddrs?
On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> One TODO in this port compared to x86 is the uprobe abort_xol() logic.
> x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine
> if a signal was caused when the uprobed instruction was single-stepped/
> emulated, in which case, we reset the instruction pointer to the probed
> address and retry the probe again.
Another curious difference is that x86 uses an instruction decoder and
contains massive tables to validate we can probe a particular
instruction.
Can we probe all possible PPC instructions?
On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> > One TODO in this port compared to x86 is the uprobe abort_xol() logic.
> > x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine
> > if a signal was caused when the uprobed instruction was single-stepped/
> > emulated, in which case, we reset the instruction pointer to the probed
> > address and retry the probe again.
>
> Another curious difference is that x86 uses an instruction decoder and
> contains massive tables to validate we can probe a particular
> instruction.
>
> Can we probe all possible PPC instructions?
For the kernel, the only ones that are off limits are rfi (return from
interrupt), mtmsr (move to msr). All other instructions can be probed.
Both those instructions are supervisor level, so we won't see them in
userspace at all; so we should be able to probe all user level
instructions.
I am not aware of specific caveats for vector/altivec instructions;
maybe Paul or Ben are more suitable to comment on that.
Ananth
On Wed, Jun 06, 2012 at 11:23:52AM +0200, Peter Zijlstra wrote:
> On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
>
> Don't we traditionally use unsigned long to pass vaddrs?
Right. But the vaddr we pass here is vma_info->vaddr which is loff_t.
I guess I should've made that clear in the patch description.
Ananth
* Ananth N Mavinakayanahalli <[email protected]> wrote:
> On Wed, Jun 06, 2012 at 11:23:52AM +0200, Peter Zijlstra wrote:
> > On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
> >
> > Don't we traditionally use unsigned long to pass vaddrs?
>
> Right. But the vaddr we pass here is vma_info->vaddr which is loff_t.
> I guess I should've made that clear in the patch description.
Why not fix struct vma_info's vaddr type?
Thanks,
Ingo
On Wed, Jun 06, 2012 at 11:40:15AM +0200, Ingo Molnar wrote:
>
> * Ananth N Mavinakayanahalli <[email protected]> wrote:
>
> > On Wed, Jun 06, 2012 at 11:23:52AM +0200, Peter Zijlstra wrote:
> > > On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
> > >
> > > Don't we traditionally use unsigned long to pass vaddrs?
> >
> > Right. But the vaddr we pass here is vma_info->vaddr which is loff_t.
> > I guess I should've made that clear in the patch description.
>
> Why not fix struct vma_info's vaddr type?
Agreed. Will fix and send v2.
Ananth
* Ingo Molnar <[email protected]> [2012-06-06 11:40:15]:
>
> * Ananth N Mavinakayanahalli <[email protected]> wrote:
>
> > On Wed, Jun 06, 2012 at 11:23:52AM +0200, Peter Zijlstra wrote:
> > > On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
> > >
> > > Don't we traditionally use unsigned long to pass vaddrs?
> >
> > Right. But the vaddr we pass here is vma_info->vaddr which is loff_t.
> > I guess I should've made that clear in the patch description.
>
> Why not fix struct vma_info's vaddr type?
>
Calculating and comparing vaddr results either uses variables of type loff_t.
To avoid typecasting and avoid overflow at each of these places, we used
loff_t.
Ananth, install_breakpoint() already has a variable of type addr of type
unsigned long. Why dont you use addr instead of vaddr.
--
Thanks and regards
Srikar
On 06/06, Ananth N Mavinakayanahalli wrote:
>
> From: Ananth N Mavinakayanahalli <[email protected]>
>
> On RISC architectures like powerpc, instructions are fixed size.
> Instruction analysis on such platforms is just a matter of (insn % 4).
> Pass the vaddr at which the uprobe is to be inserted so that
> arch_uprobe_analyze_insn() can flag misaligned registration requests.
And the next patch checks "vaddr & 0x03".
But why do you need this new arg? arch_uprobe_analyze_insn() could
check "container_of(auprobe, struct uprobe, arch)->offset & 0x3" with
the same effect, no? vm_start/vm_pgoff are obviously page-aligned.
Oleg.
* Oleg Nesterov <[email protected]> [2012-06-06 17:08:48]:
> On 06/06, Ananth N Mavinakayanahalli wrote:
> >
> > From: Ananth N Mavinakayanahalli <[email protected]>
> >
> > On RISC architectures like powerpc, instructions are fixed size.
> > Instruction analysis on such platforms is just a matter of (insn % 4).
> > Pass the vaddr at which the uprobe is to be inserted so that
> > arch_uprobe_analyze_insn() can flag misaligned registration requests.
>
> And the next patch checks "vaddr & 0x03".
>
> But why do you need this new arg? arch_uprobe_analyze_insn() could
> check "container_of(auprobe, struct uprobe, arch)->offset & 0x3" with
> the same effect, no? vm_start/vm_pgoff are obviously page-aligned.
>
We cant use container_of because we moved the definition for struct
uprobe to kernel/events/uprobe.c. This was possible before when struct
uprobe definition was in include/uprobes.h
--
Thanks and Regards
Srikar
On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> > > One TODO in this port compared to x86 is the uprobe abort_xol() logic.
> > > x86 depends on the thread_struct.trap_nr (absent in powerpc) to determine
> > > if a signal was caused when the uprobed instruction was single-stepped/
> > > emulated, in which case, we reset the instruction pointer to the probed
> > > address and retry the probe again.
> >
> > Another curious difference is that x86 uses an instruction decoder and
> > contains massive tables to validate we can probe a particular
> > instruction.
Part of that difference is because the x86 instruction set is a lot more
complex. Another part is due to the lack, back when the x86 code was
created, of robust handling by uprobes of traps by probed instructions.
So we refused to probe instructions that we knew (or strongly suspected)
would generate traps in user mode -- e.g., privileged instructions,
illegal instructions. A couple of times we had to "legalize"
instructions or prefixes that we didn't originally expect to encounter.
> >
> > Can we probe all possible PPC instructions?
>
> For the kernel, the only ones that are off limits are rfi (return from
> interrupt), mtmsr (move to msr). All other instructions can be probed.
>
> Both those instructions are supervisor level, so we won't see them in
> userspace at all; so we should be able to probe all user level
> instructions.
Presumably rfi or mtmsr could show up in the instruction stream via an
erroneous or mischievous asm statement. It'd be good to verify that you
handle that gracefully.
>
> I am not aware of specific caveats for vector/altivec instructions;
> maybe Paul or Ben are more suitable to comment on that.
>
> Ananth
>
Jim
On Wed, Jun 06, 2012 at 05:14:23PM +0530, Srikar Dronamraju wrote:
> * Ingo Molnar <[email protected]> [2012-06-06 11:40:15]:
>
> >
> > * Ananth N Mavinakayanahalli <[email protected]> wrote:
> >
> > > On Wed, Jun 06, 2012 at 11:23:52AM +0200, Peter Zijlstra wrote:
> > > > On Wed, 2012-06-06 at 14:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > +int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, loff_t vaddr)
> > > >
> > > > Don't we traditionally use unsigned long to pass vaddrs?
> > >
> > > Right. But the vaddr we pass here is vma_info->vaddr which is loff_t.
> > > I guess I should've made that clear in the patch description.
> >
> > Why not fix struct vma_info's vaddr type?
> >
>
> Calculating and comparing vaddr results either uses variables of type loff_t.
> To avoid typecasting and avoid overflow at each of these places, we used
> loff_t.
>
> Ananth, install_breakpoint() already has a variable of type addr of type
> unsigned long. Why dont you use addr instead of vaddr.
Ok, makes sense. I'll change it in v2.
Ananth
On Wed, Jun 06, 2012 at 11:08:04AM -0700, Jim Keniston wrote:
> On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> > On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
...
> > For the kernel, the only ones that are off limits are rfi (return from
> > interrupt), mtmsr (move to msr). All other instructions can be probed.
> >
> > Both those instructions are supervisor level, so we won't see them in
> > userspace at all; so we should be able to probe all user level
> > instructions.
>
> Presumably rfi or mtmsr could show up in the instruction stream via an
> erroneous or mischievous asm statement. It'd be good to verify that you
> handle that gracefully.
That'd be flagged elsewhere, by the architecture itself -- you'd get a
privileged instruciton exception if you try execute any instruction not
part of the UISA. I therefore don't think its a necessary check in the
uprobes code.
Ananth
On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> On Wed, Jun 06, 2012 at 11:08:04AM -0700, Jim Keniston wrote:
> > On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> > > On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > > > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
>
> ...
>
> > > For the kernel, the only ones that are off limits are rfi (return from
> > > interrupt), mtmsr (move to msr). All other instructions can be probed.
> > >
> > > Both those instructions are supervisor level, so we won't see them in
> > > userspace at all; so we should be able to probe all user level
> > > instructions.
> >
> > Presumably rfi or mtmsr could show up in the instruction stream via an
> > erroneous or mischievous asm statement. It'd be good to verify that you
> > handle that gracefully.
>
> That'd be flagged elsewhere, by the architecture itself -- you'd get a
> privileged instruciton exception if you try execute any instruction not
> part of the UISA. I therefore don't think its a necessary check in the
> uprobes code.
But you're not executing the instruction, you're passing it to
emulate_step(). Or am I missing something?
cheers
On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > On Wed, Jun 06, 2012 at 11:08:04AM -0700, Jim Keniston wrote:
> > > On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> > > > On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > > > > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> >
> > ...
> >
> > > > For the kernel, the only ones that are off limits are rfi (return from
> > > > interrupt), mtmsr (move to msr). All other instructions can be probed.
> > > >
> > > > Both those instructions are supervisor level, so we won't see them in
> > > > userspace at all; so we should be able to probe all user level
> > > > instructions.
> > >
> > > Presumably rfi or mtmsr could show up in the instruction stream via an
> > > erroneous or mischievous asm statement. It'd be good to verify that you
> > > handle that gracefully.
> >
> > That'd be flagged elsewhere, by the architecture itself -- you'd get a
> > privileged instruciton exception if you try execute any instruction not
> > part of the UISA. I therefore don't think its a necessary check in the
> > uprobes code.
>
> But you're not executing the instruction, you're passing it to
> emulate_step(). Or am I missing something?
But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
end up single-stepping using user_enable_single_step(). Same with rfid.
Ananth
On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > > On Wed, Jun 06, 2012 at 11:08:04AM -0700, Jim Keniston wrote:
> > > > On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> > >
> > > ...
> > >
> > > > > For the kernel, the only ones that are off limits are rfi (return from
> > > > > interrupt), mtmsr (move to msr). All other instructions can be probed.
> > > > >
> > > > > Both those instructions are supervisor level, so we won't see them in
> > > > > userspace at all; so we should be able to probe all user level
> > > > > instructions.
> > > >
> > > > Presumably rfi or mtmsr could show up in the instruction stream via an
> > > > erroneous or mischievous asm statement. It'd be good to verify that you
> > > > handle that gracefully.
> > >
> > > That'd be flagged elsewhere, by the architecture itself -- you'd get a
> > > privileged instruciton exception if you try execute any instruction not
> > > part of the UISA. I therefore don't think its a necessary check in the
> > > uprobes code.
> >
> > But you're not executing the instruction, you're passing it to
> > emulate_step(). Or am I missing something?
>
> But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> end up single-stepping using user_enable_single_step(). Same with rfid.
Right. But that was exactly Jim's point, you may be asked to emulate
those instructions even though you wouldn't expect to see them in
userspace code, so you need to handle it.
Luckily it looks like emulate_step() will do the right thing for you.
It'd be good to test it to make 100% sure.
cheers
On Fri, Jun 08, 2012 at 04:17:44PM +1000, Michael Ellerman wrote:
> On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> > On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > > > On Wed, Jun 06, 2012 at 11:08:04AM -0700, Jim Keniston wrote:
> > > > > On Wed, 2012-06-06 at 15:05 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > > On Wed, Jun 06, 2012 at 11:27:02AM +0200, Peter Zijlstra wrote:
> > > > > > > On Wed, 2012-06-06 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> > > >
> > > > ...
> > > >
> > > > > > For the kernel, the only ones that are off limits are rfi (return from
> > > > > > interrupt), mtmsr (move to msr). All other instructions can be probed.
> > > > > >
> > > > > > Both those instructions are supervisor level, so we won't see them in
> > > > > > userspace at all; so we should be able to probe all user level
> > > > > > instructions.
> > > > >
> > > > > Presumably rfi or mtmsr could show up in the instruction stream via an
> > > > > erroneous or mischievous asm statement. It'd be good to verify that you
> > > > > handle that gracefully.
> > > >
> > > > That'd be flagged elsewhere, by the architecture itself -- you'd get a
> > > > privileged instruciton exception if you try execute any instruction not
> > > > part of the UISA. I therefore don't think its a necessary check in the
> > > > uprobes code.
> > >
> > > But you're not executing the instruction, you're passing it to
> > > emulate_step(). Or am I missing something?
> >
> > But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> > end up single-stepping using user_enable_single_step(). Same with rfid.
>
> Right. But that was exactly Jim's point, you may be asked to emulate
> those instructions even though you wouldn't expect to see them in
> userspace code, so you need to handle it.
>
> Luckily it looks like emulate_step() will do the right thing for you.
> It'd be good to test it to make 100% sure.
Sure. Will add that check and send v2.
Ananth
On Fri, 2012-06-08 at 11:49 +0530, Ananth N Mavinakayanahalli wrote:
> On Fri, Jun 08, 2012 at 04:17:44PM +1000, Michael Ellerman wrote:
> > On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> > > On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > > > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > >
> > > But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> > > end up single-stepping using user_enable_single_step(). Same with rfid.
> >
> > Right. But that was exactly Jim's point, you may be asked to emulate
> > those instructions even though you wouldn't expect to see them in
> > userspace code, so you need to handle it.
> >
> > Luckily it looks like emulate_step() will do the right thing for you.
> > It'd be good to test it to make 100% sure.
>
> Sure. Will add that check and send v2.
Sorry I didn't mean add a test in the code, I meant construct a test
case to confirm that it works as expected.
cheers
On Fri, Jun 08, 2012 at 04:38:17PM +1000, Michael Ellerman wrote:
> On Fri, 2012-06-08 at 11:49 +0530, Ananth N Mavinakayanahalli wrote:
> > On Fri, Jun 08, 2012 at 04:17:44PM +1000, Michael Ellerman wrote:
> > > On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> > > > On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > > > > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > > >
> > > > But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> > > > end up single-stepping using user_enable_single_step(). Same with rfid.
> > >
> > > Right. But that was exactly Jim's point, you may be asked to emulate
> > > those instructions even though you wouldn't expect to see them in
> > > userspace code, so you need to handle it.
> > >
> > > Luckily it looks like emulate_step() will do the right thing for you.
> > > It'd be good to test it to make 100% sure.
> >
> > Sure. Will add that check and send v2.
>
> Sorry I didn't mean add a test in the code, I meant construct a test
> case to confirm that it works as expected.
Michael,
I just hand-coded the instr to emulate_step() and here are the results:
MSR_PR is set
insn = 7c600124, ret = 0 /* mtmsr */
insn = 7c600164, ret = 0 /* mtmsrd */
insn = 4c000024, ret = -1 /* rfid */
insn = 4c000064, ret = 0 /* rfi */
Also verified that standalone programs with those instructions in inline
asm will die with a SIGILL.
So, for mtmsr, mtmsrd and rfi, we have to single-step them which will
result in a SIGILL in turn.
Ananth
On Fri, 2012-06-08 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> On Fri, Jun 08, 2012 at 04:38:17PM +1000, Michael Ellerman wrote:
> > On Fri, 2012-06-08 at 11:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > On Fri, Jun 08, 2012 at 04:17:44PM +1000, Michael Ellerman wrote:
> > > > On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > > > > > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > > > >
> > > > > But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> > > > > end up single-stepping using user_enable_single_step(). Same with rfid.
> > > >
> > > > Right. But that was exactly Jim's point, you may be asked to emulate
> > > > those instructions even though you wouldn't expect to see them in
> > > > userspace code, so you need to handle it.
> > > >
> > > > Luckily it looks like emulate_step() will do the right thing for you.
> > > > It'd be good to test it to make 100% sure.
> > >
> > > Sure. Will add that check and send v2.
> >
> > Sorry I didn't mean add a test in the code, I meant construct a test
> > case to confirm that it works as expected.
>
> Michael,
>
> I just hand-coded the instr to emulate_step() and here are the results:
>
> MSR_PR is set
> insn = 7c600124, ret = 0 /* mtmsr */
> insn = 7c600164, ret = 0 /* mtmsrd */
> insn = 4c000024, ret = -1 /* rfid */
> insn = 4c000064, ret = 0 /* rfi */
>
> Also verified that standalone programs with those instructions in inline
> asm will die with a SIGILL.
>
> So, for mtmsr, mtmsrd and rfi, we have to single-step them which will
> result in a SIGILL in turn.
What happens in the rfid case? You don't handle -1 from emulate_step()
any differently AFAICS, so don't we try to single step that too?
cheers
On Tue, Jun 12, 2012 at 02:01:46PM +1000, Michael Ellerman wrote:
> On Fri, 2012-06-08 at 14:51 +0530, Ananth N Mavinakayanahalli wrote:
> > On Fri, Jun 08, 2012 at 04:38:17PM +1000, Michael Ellerman wrote:
> > > On Fri, 2012-06-08 at 11:49 +0530, Ananth N Mavinakayanahalli wrote:
> > > > On Fri, Jun 08, 2012 at 04:17:44PM +1000, Michael Ellerman wrote:
> > > > > On Fri, 2012-06-08 at 11:31 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > > On Fri, Jun 08, 2012 at 03:51:54PM +1000, Michael Ellerman wrote:
> > > > > > > On Fri, 2012-06-08 at 10:06 +0530, Ananth N Mavinakayanahalli wrote:
> > > > > >
> > > > > > But MSR_PR=1 and hence emulate_step() will return -1 and hence we will
> > > > > > end up single-stepping using user_enable_single_step(). Same with rfid.
> > > > >
> > > > > Right. But that was exactly Jim's point, you may be asked to emulate
> > > > > those instructions even though you wouldn't expect to see them in
> > > > > userspace code, so you need to handle it.
> > > > >
> > > > > Luckily it looks like emulate_step() will do the right thing for you.
> > > > > It'd be good to test it to make 100% sure.
> > > >
> > > > Sure. Will add that check and send v2.
> > >
> > > Sorry I didn't mean add a test in the code, I meant construct a test
> > > case to confirm that it works as expected.
> >
> > Michael,
> >
> > I just hand-coded the instr to emulate_step() and here are the results:
> >
> > MSR_PR is set
> > insn = 7c600124, ret = 0 /* mtmsr */
> > insn = 7c600164, ret = 0 /* mtmsrd */
> > insn = 4c000024, ret = -1 /* rfid */
> > insn = 4c000064, ret = 0 /* rfi */
> >
> > Also verified that standalone programs with those instructions in inline
> > asm will die with a SIGILL.
> >
> > So, for mtmsr, mtmsrd and rfi, we have to single-step them which will
> > result in a SIGILL in turn.
>
> What happens in the rfid case? You don't handle -1 from emulate_step()
> any differently AFAICS, so don't we try to single step that too?
-1 is just emulate_step() flagging cases where instructions must not be
single-stepped (rfi[d], mtmsr that clears MSR_RI). But as with the other
OEA instructions in user space, we fail with a SIGILL.
As the application is hozed in any case if we encounter an OEA
instruction, I'd think there is no point in handling a -1 from
emulate_step() any differently.
Ananth