2013-04-03 16:02:51

by Anton Arapov

Subject: [PATCH v1 0/9] uretprobes: Return uprobes implementation

Hello All!

This is the uretprobes core implementation. It enables function return probes
in uprobe-based event tracing.

The patchset introduces an additional handler (->ret_handler) in the uprobe
consumer; a consumer with ->ret_handler set defines a uretprobe.

Behind every uretprobe there is a regular uprobe with a return probe handler.
Once a uprobe that has ret_handler set is hit, we hijack the return address of
the probed function and replace it with the address of the trampoline. The
trampoline is a preallocated page in the probed task's xol area, filled with
the breakpoint opcode. When the return breakpoint is hit, we invoke the
ret_handler.
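
For illustration only (this sketch is not part of the patchset): a minimal
in-kernel consumer using the new callback could look roughly like the code
below. The inode/offset lookup is omitted and all names are made up; the
->ret_handler and uprobe_register() signatures are the ones introduced and
used by this series.

    #include <linux/uprobes.h>

    static int sample_ret_handler(struct uprobe_consumer *self,
                                  unsigned long func, struct pt_regs *regs)
    {
            /* func is the probed function's entry address (bp_vaddr) */
            pr_info("return from function at 0x%lx\n", func);
            return 0;
    }

    static struct uprobe_consumer sample_consumer = {
            /* ->handler may stay NULL, ->ret_handler alone is enough */
            .ret_handler = sample_ret_handler,
    };

    /* somewhere in init code: inode + offset identify the probed function */
    int err = uprobe_register(inode, offset, &sample_consumer);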

The patchset shouldn't be difficult to read, and hopefully the commit messages
will help. Please review.

patchset in git:
http://github.com/arapov/linux-aa/commits/uretprobes_v1

previous versions:
v0: https://lkml.org/lkml/2013/3/22/218

RFC reviews:
RFCv4: https://lkml.org/lkml/2013/3/4/246
RFCv3: https://lkml.org/lkml/2013/2/28/148
RFCv2: https://lkml.org/lkml/2013/1/9/157
RFCv1: https://lkml.org/lkml/2012/12/21/133

thanks,
Anton.

Anton Arapov (9):
uretprobes: Introduce uprobe_consumer->ret_handler()
uretprobes: Reserve the first slot in xol_vma for trampoline
uretprobes/x86: Hijack return address
uretprobes/ppc: Hijack return address
uretprobes: Return probe entry, prepare_uretprobe()
uretprobes: Return probe exit, invoke handlers
uretprobes: Limit the depth of return probe nestedness
uretprobes: Remove -ENOSYS as return probes implemented
uretprobes: Documentation update

Documentation/trace/uprobetracer.txt | 126 +++++++++++++---------
arch/powerpc/include/asm/uprobes.h | 1 +
arch/powerpc/kernel/uprobes.c | 13 +++
arch/x86/include/asm/uprobes.h | 1 +
arch/x86/kernel/uprobes.c | 29 +++++
include/linux/uprobes.h | 7 ++
kernel/events/uprobes.c | 202 +++++++++++++++++++++++++++++++++--
7 files changed, 320 insertions(+), 59 deletions(-)

--
1.8.1.4


2013-04-03 16:01:24

by Anton Arapov

Subject: [PATCH v1 3/9] uretprobes/x86: Hijack return address

Hijack the return address and replace it with a trampoline address.

v1 changes:
* use force_sig_info()
* rework and simplify logic

RFCv5 changes:
* change the fail return code, because orig_ret_vaddr=0 is possible
* style fixup
RFCv2 changes:
* remove ->doomed flag, kill task immediately

Signed-off-by: Anton Arapov <[email protected]>
---
arch/x86/include/asm/uprobes.h | 1 +
arch/x86/kernel/uprobes.c | 29 +++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
index 8ff8be7..6e51979 100644
--- a/arch/x86/include/asm/uprobes.h
+++ b/arch/x86/include/asm/uprobes.h
@@ -55,4 +55,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
#endif /* _ASM_UPROBES_H */
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 0ba4cfb..2ed8459 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -697,3 +697,32 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
send_sig(SIGTRAP, current, 0);
return ret;
}
+
+unsigned long
+arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
+{
+ int rasize, ncopied;
+ unsigned long orig_ret_vaddr = 0; /* clear high bits for 32-bit apps */
+
+ rasize = is_ia32_task() ? 4 : 8;
+ ncopied = copy_from_user(&orig_ret_vaddr, (void __user *)regs->sp, rasize);
+ if (unlikely(ncopied))
+ return -1;
+
+ /* check whether address has been already hijacked */
+ if (orig_ret_vaddr == trampoline_vaddr)
+ return orig_ret_vaddr;
+
+ ncopied = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, rasize);
+ if (likely(!ncopied))
+ return orig_ret_vaddr;
+
+ if (ncopied != rasize) {
+ pr_err("uprobe: return address clobbered: pid=%d, %%sp=%#lx, "
+ "%%ip=%#lx\n", current->pid, regs->sp, regs->ip);
+
+ force_sig_info(SIGSEGV, SEND_SIG_FORCED, current);
+ }
+
+ return -1;
+}
--
1.8.1.4

2013-04-03 16:01:40

by Anton Arapov

Subject: [PATCH v1 5/9] uretprobes: Return probe entry, prepare_uretprobe()

When a uprobe with a return probe consumer is hit, prepare_uretprobe() is
invoked. It creates a return_instance, hijacks the return address and
replaces it with the trampoline address.

* Return instances are kept as a stack, per uprobed task.
* A return instance is chained when the original return address is the
trampoline's page vaddr (e.g. a recursive call of the probed function);
see the illustration below.
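
As an illustration only (addresses and names are made up; this is not part of
the patch): if main() calls a probed foo(), which in turn calls a probed
bar(), then after the second prepare_uretprobe() the per-task stack looks
roughly like:

    utask->return_instances
      -> { .func = <bar entry>, .orig_ret_vaddr = <ret addr in foo>,
           .chained = false, .next = <ri of foo> }
      -> { .func = <foo entry>, .orig_ret_vaddr = <ret addr in main>,
           .chained = false, .next = NULL }

When bar() returns through the trampoline, only the topmost instance is popped
and consumed; foo()'s instance stays on the stack until foo() itself returns.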

v1 changes:
* preserve address of the breakpoint in return_instance.
* don't forget to NULLify return_instances on free_utask.
* simplify prepare_uretprobe().

RFCv6 changes:
* rework prepare_uretprobe() logic in order to make further unwinding
in handler_uretprobe() simpler.
* introduce the 'dirty' field.

RFCv5 changes:
* switch from hlist to a singly linked list for tracking ->*return_uprobes.
* preallocate first slot xol_area for return probes, see xol_get_area()
changes.
* add get_trampoline_vaddr() helper, to emphasize area->vaddr overload.

RFCv4 changes:
* get rid of area->rp_trampoline_vaddr as it is always the same as ->vaddr.
* cleanup ->return_uprobes list in uprobe_free_utask(), because the
task can exit from inside the ret-probe'd function(s).
* in find_active_uprobe(): Once we inserted "int3" we must ensure that
handle_swbp() will be called even if this uprobe goes away. We have
the reference, but it only protects the uprobe itself; it can't protect
against delete_uprobe().
IOW, we must ensure that uprobe_pre_sstep_notifier() can't return 0.

RFCv3 changes:
* protect the uprobe with a refcounter. See atomic_inc() in prepare_uretprobe()
and put_uprobe() in a following patch, in handle_uretprobe().

RFCv2 changes:
* get rid of the ->return_consumers member of struct uprobe, introduce
ret_handler() in the consumer.

Signed-off-by: Anton Arapov <[email protected]>
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 4042cad..5f8960e 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -71,6 +71,7 @@ struct uprobe_task {
enum uprobe_task_state state;
struct arch_uprobe_task autask;

+ struct return_instance *return_instances;
struct uprobe *active_uprobe;

unsigned long xol_vaddr;
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index d3c8201..08ecfff 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -75,6 +75,15 @@ struct uprobe {
struct arch_uprobe arch;
};

+struct return_instance {
+ struct uprobe *uprobe;
+ unsigned long func;
+ unsigned long orig_ret_vaddr; /* original return address */
+ bool chained; /* true, if instance is nested */
+
+ struct return_instance *next; /* keep as stack */
+};
+
/*
* valid_vma: Verify if the specified vma is an executable vma
* Relax restrictions while unregistering: vm_flags might have
@@ -1294,6 +1303,7 @@ unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs)
void uprobe_free_utask(struct task_struct *t)
{
struct uprobe_task *utask = t->utask;
+ struct return_instance *ri, *tmp;

if (!utask)
return;
@@ -1301,6 +1311,15 @@ void uprobe_free_utask(struct task_struct *t)
if (utask->active_uprobe)
put_uprobe(utask->active_uprobe);

+ ri = utask->return_instances;
+ while (ri) {
+ tmp = ri;
+ ri = ri->next;
+
+ put_uprobe(tmp->uprobe);
+ kfree(tmp);
+ }
+
xol_free_insn_slot(t);
kfree(utask);
t->utask = NULL;
@@ -1348,6 +1367,65 @@ static unsigned long get_trampoline_vaddr(void)
return trampoline_vaddr;
}

+static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
+{
+ struct return_instance *ri;
+ struct uprobe_task *utask;
+ unsigned long orig_ret_vaddr, trampoline_vaddr;
+ bool chained = false;
+
+ if (!get_xol_area())
+ return;
+
+ utask = get_utask();
+ if (!utask)
+ return;
+
+ ri = kzalloc(sizeof(struct return_instance), GFP_KERNEL);
+ if (!ri)
+ goto fail;
+
+ trampoline_vaddr = get_trampoline_vaddr();
+ orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
+ if (orig_ret_vaddr == -1)
+ goto fail;
+
+ /*
+ * We don't want to keep trampoline address in stack, rather keep the
+ * original return address of first caller thru all the consequent
+ * instances. This also makes breakpoint unwrapping easier.
+ */
+ if (orig_ret_vaddr == trampoline_vaddr) {
+ if (!utask->return_instances) {
+ /*
+ * This situation is not possible. Likely we have an
+ * attack from user-space.
+ */
+ pr_warn("uprobe: unable to set uretprobe pid/tgid=%d/%d\n",
+ current->pid, current->tgid);
+ goto fail;
+ }
+
+ chained = true;
+ orig_ret_vaddr = utask->return_instances->orig_ret_vaddr;
+ }
+
+ atomic_inc(&uprobe->ref);
+ ri->uprobe = uprobe;
+ ri->func = instruction_pointer(regs);
+ ri->orig_ret_vaddr = orig_ret_vaddr;
+ ri->chained = chained;
+
+ /* add instance to the stack */
+ ri->next = utask->return_instances;
+ utask->return_instances = ri;
+
+ return;
+
+ fail:
+ kfree(ri);
+}
+
/* Prepare to single-step probed instruction out of line. */
static int
pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
@@ -1503,6 +1581,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
{
struct uprobe_consumer *uc;
int remove = UPROBE_HANDLER_REMOVE;
+ bool need_prep = false; /* prepare return uprobe, when needed */

down_read(&uprobe->register_rwsem);
for (uc = uprobe->consumers; uc; uc = uc->next) {
@@ -1513,9 +1592,16 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
WARN(rc & ~UPROBE_HANDLER_MASK,
"bad rc=0x%x from %pf()\n", rc, uc->handler);
}
+
+ if (uc->ret_handler)
+ need_prep = true;
+
remove &= rc;
}

+ if (need_prep && !remove)
+ prepare_uretprobe(uprobe, regs); /* put bp at return */
+
if (remove && uprobe->consumers) {
WARN_ON(!uprobe_is_active(uprobe));
unapply_uprobe(uprobe, current->mm);
@@ -1634,7 +1720,11 @@ void uprobe_notify_resume(struct pt_regs *regs)
*/
int uprobe_pre_sstep_notifier(struct pt_regs *regs)
{
- if (!current->mm || !test_bit(MMF_HAS_UPROBES, &current->mm->flags))
+ if (!current->mm)
+ return 0;
+
+ if (!test_bit(MMF_HAS_UPROBES, &current->mm->flags) &&
+ (!current->utask || !current->utask->return_instances))
return 0;

set_thread_flag(TIF_UPROBE);
--
1.8.1.4

2013-04-03 16:01:46

by Anton Arapov

Subject: [PATCH v1 4/9] uretprobes/ppc: Hijack return address

Hijack the return address and replace it with a trampoline address.
PowerPC implementation.

Signed-off-by: Anton Arapov <[email protected]>
---
arch/powerpc/include/asm/uprobes.h | 1 +
arch/powerpc/kernel/uprobes.c | 13 +++++++++++++
2 files changed, 14 insertions(+)

diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
index b532060..2301602 100644
--- a/arch/powerpc/include/asm/uprobes.h
+++ b/arch/powerpc/include/asm/uprobes.h
@@ -51,4 +51,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
#endif /* _ASM_UPROBES_H */
diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index bc77834..567b975 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -188,3 +188,16 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)

return false;
}
+
+unsigned long
+arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
+{
+ unsigned long orig_ret_vaddr;
+
+ orig_ret_vaddr = regs->link;
+
+ /* Replace the return addr with trampoline addr */
+ regs->link = trampoline_vaddr;
+
+ return orig_ret_vaddr;
+}
--
1.8.1.4

2013-04-03 16:02:14

by Anton Arapov

Subject: [PATCH v1 8/9] uretprobes: Remove -ENOSYS as return probes implemented

Return probes are now implemented, so drop the temporary -ENOSYS check in
uprobe_register().

Signed-off-by: Anton Arapov <[email protected]>
---
kernel/events/uprobes.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 489f5e3..9af52f7 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -828,10 +828,6 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *
if (!uc->handler && !uc->ret_handler)
return -EINVAL;

- /* TODO: Implement return probes */
- if (uc->ret_handler)
- return -ENOSYS;
-
/* Racy, just to catch the obvious mistakes */
if (offset > i_size_read(inode))
return -EINVAL;
--
1.8.1.4

2013-04-03 16:02:28

by Anton Arapov

Subject: [PATCH v1 7/9] uretprobes: Limit the depth of return probe nestedness

Unlike kretprobes, we can't trust userspace and thus must have protection
from user-space attacks. User space has an "unlimited" stack, and this patch
limits the return probe nesting depth as a simple remedy for that.

Note that this implementation leaks return_instance on siglongjmp
until exit()/exec().

The intention is to have a KISS, bare-minimum solution for the initial
implementation, in order not to complicate the uretprobes code.

In the future we may come up with a more sophisticated solution that removes
this depth limitation. It is not an easy task and lies beyond this patchset.

Signed-off-by: Anton Arapov <[email protected]>
---
include/linux/uprobes.h | 3 +++
kernel/events/uprobes.c | 11 +++++++++++
2 files changed, 14 insertions(+)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 5f8960e..d7bcf10 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -38,6 +38,8 @@ struct inode;
#define UPROBE_HANDLER_REMOVE 1
#define UPROBE_HANDLER_MASK 1

+#define MAX_URETPROBE_DEPTH 64
+
enum uprobe_filter_ctx {
UPROBE_FILTER_REGISTER,
UPROBE_FILTER_UNREGISTER,
@@ -72,6 +74,7 @@ struct uprobe_task {
struct arch_uprobe_task autask;

struct return_instance *return_instances;
+ unsigned int depth;
struct uprobe *active_uprobe;

unsigned long xol_vaddr;
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index d129c1d..489f5e3 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1381,6 +1381,13 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
if (!utask)
return;

+ if (utask->depth >= MAX_URETPROBE_DEPTH) {
+ printk_ratelimited(KERN_INFO "uprobe: omit uretprobe due to"
+ " nestedness limit pid/tgid=%d/%d\n",
+ current->pid, current->tgid);
+ return;
+ }
+
ri = kzalloc(sizeof(struct return_instance), GFP_KERNEL);
if (!ri)
goto fail;
@@ -1416,6 +1423,8 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
ri->orig_ret_vaddr = orig_ret_vaddr;
ri->chained = chained;

+ utask->depth++;
+
/* add instance to the stack */
ri->next = utask->return_instances;
utask->return_instances = ri;
@@ -1652,6 +1661,8 @@ static bool handler_uretprobe(struct pt_regs *regs)
if (!chained)
break;

+ utask->depth--;
+
BUG_ON(!ri);
}

--
1.8.1.4

2013-04-03 16:02:37

by Anton Arapov

Subject: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

Uretprobe handlers are invoked when the trampoline is hit; on completion, the
trampoline address is replaced with the saved return address and the uretprobe
instance is deleted.

v1 changes:
* pass bp_vaddr to ret_handler()
* simplify handle_uretprobe()

RFCv6 changes:
* rework handle_uretprobe()

RFCv5 changes:
* switch to a singly linked list for ->return_uprobes
* rework handle_uretprobe()

RFCv4 changes:
* check whether utask is not NULL in handle_uretprobe()
* get rid of area->rp_trampoline_vaddr
* minor handle_uretprobe() fixups

RFCv3 changes:
* protected uprobe with refcounter. See put_uprobe() in handle_uretprobe()
that reflects increment in prepare_uretprobe()

RFCv2 changes:
* get rid of ->return_consumers member from struct uprobe, introduce
ret_handler() in consumer instead

Signed-off-by: Anton Arapov <[email protected]>
---
kernel/events/uprobes.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 08ecfff..d129c1d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1609,6 +1609,57 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
up_read(&uprobe->register_rwsem);
}

+static void
+handler_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
+{
+ struct uprobe *uprobe = ri->uprobe;
+ struct uprobe_consumer *uc;
+
+ down_read(&uprobe->register_rwsem);
+ for (uc = uprobe->consumers; uc; uc = uc->next) {
+ if (uc->ret_handler)
+ uc->ret_handler(uc, ri->func, regs);
+ }
+ up_read(&uprobe->register_rwsem);
+}
+
+static bool handler_uretprobe(struct pt_regs *regs)
+{
+ struct uprobe_task *utask;
+ struct return_instance *ri, *tmp;
+ bool chained;
+
+ utask = current->utask;
+ if (!utask)
+ return false;
+
+ ri = utask->return_instances;
+ if (!ri)
+ return false;
+
+ instruction_pointer_set(regs, ri->orig_ret_vaddr);
+
+ for (;;) {
+ handler_uretprobe_chain(ri, regs);
+
+ chained = ri->chained;
+ put_uprobe(ri->uprobe);
+
+ tmp = ri;
+ ri = ri->next;
+ kfree(tmp);
+
+ if (!chained)
+ break;
+
+ BUG_ON(!ri);
+ }
+
+ utask->return_instances = ri;
+
+ return true;
+}
+
/*
* Run handler and ask thread to singlestep.
* Ensure all non-fatal signals cannot interrupt thread while it singlesteps.
@@ -1620,8 +1671,15 @@ static void handle_swbp(struct pt_regs *regs)
int uninitialized_var(is_swbp);

bp_vaddr = uprobe_get_swbp_addr(regs);
- uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
+ if (bp_vaddr == get_trampoline_vaddr()) {
+ if (handler_uretprobe(regs))
+ return;

+ pr_warn("uprobe: unable to handle uretprobe pid/tgid=%d/%d\n",
+ current->pid, current->tgid);
+ }
+
+ uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
if (!uprobe) {
if (is_swbp > 0) {
/* No matching uprobe; signal SIGTRAP. */
--
1.8.1.4

2013-04-03 16:03:56

by Anton Arapov

Subject: [PATCH v1 2/9] uretprobes: Reserve the first slot in xol_vma for trampoline

Allocate the trampoline page as the very first one in the uprobed
task's xol area, and fill it with the breakpoint opcode.

Also introduce the get_trampoline_vaddr() helper to wrap the
trampoline address extraction from area->vaddr. That removes
confusion and eases debugging in case the ->vaddr notion changes.

v1 changes:
* rework get_trampoline_vaddr() helper.
* init xol_area->slot_count.

Signed-off-by: Anton Arapov <[email protected]>
---
kernel/events/uprobes.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 27c964b..d3c8201 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1109,6 +1109,7 @@ static struct xol_area *get_xol_area(void)
{
struct mm_struct *mm = current->mm;
struct xol_area *area;
+ uprobe_opcode_t insn = UPROBE_SWBP_INSN;

area = mm->uprobes_state.xol_area;
if (area)
@@ -1126,7 +1127,12 @@ static struct xol_area *get_xol_area(void)
if (!area->page)
goto free_bitmap;

+ /* allocate first slot of task's xol_area for the return probes */
+ set_bit(0, area->bitmap);
+ copy_to_page(area->page, 0, &insn, UPROBE_SWBP_INSN_SIZE);
+ atomic_set(&area->slot_count, 1);
init_waitqueue_head(&area->wq);
+
if (!xol_add_vma(area))
return area;

@@ -1323,6 +1329,25 @@ static struct uprobe_task *get_utask(void)
return current->utask;
}

+/*
+ * Current area->vaddr notion assume the trampoline address is always
+ * equal area->vaddr.
+ *
+ * Returns -1 in case the xol_area is not allocated.
+ */
+static unsigned long get_trampoline_vaddr(void)
+{
+ struct xol_area *area;
+ unsigned long trampoline_vaddr = -1;
+
+ area = current->mm->uprobes_state.xol_area;
+ smp_read_barrier_depends();
+ if (area)
+ trampoline_vaddr = area->vaddr;
+
+ return trampoline_vaddr;
+}
+
/* Prepare to single-step probed instruction out of line. */
static int
pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
--
1.8.1.4

2013-04-03 16:06:19

by Anton Arapov

Subject: [PATCH v1 1/9] uretprobes: Introduce uprobe_consumer->ret_handler()

Begin the return probes implementation: introduce ->ret_handler() and update
the existing code to rely on ->handler() *and* ->ret_handler() for uprobes and
uretprobes respectively.

v1 changes:
* add bp_vaddr argument to ->ret_handler()

RFCv5 changes:
* don't remove the uprobe in case there is no uprobe consumer (handler);
see handler_chain() changes.

RFCv3 changes: (the patch is introduced in v3)
* check whether at least one of the consumer's handlers is set.
* a 'TODO' cap that will be removed once return probes are implemented.
* introduce ->ret_handler().

Signed-off-by: Anton Arapov <[email protected]>
---
include/linux/uprobes.h | 3 +++
kernel/events/uprobes.c | 17 ++++++++++++++---
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 02b83db..4042cad 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -46,6 +46,9 @@ enum uprobe_filter_ctx {

struct uprobe_consumer {
int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
+ int (*ret_handler)(struct uprobe_consumer *self,
+ unsigned long func,
+ struct pt_regs *regs);
bool (*filter)(struct uprobe_consumer *self,
enum uprobe_filter_ctx ctx,
struct mm_struct *mm);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 21d8a65..27c964b 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -815,6 +815,14 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *
struct uprobe *uprobe;
int ret;

+ /* Uprobe must have at least one set consumer */
+ if (!uc->handler && !uc->ret_handler)
+ return -EINVAL;
+
+ /* TODO: Implement return probes */
+ if (uc->ret_handler)
+ return -ENOSYS;
+
/* Racy, just to catch the obvious mistakes */
if (offset > i_size_read(inode))
return -EINVAL;
@@ -1473,10 +1481,13 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)

down_read(&uprobe->register_rwsem);
for (uc = uprobe->consumers; uc; uc = uc->next) {
- int rc = uc->handler(uc, regs);
+ int rc = 0;

- WARN(rc & ~UPROBE_HANDLER_MASK,
- "bad rc=0x%x from %pf()\n", rc, uc->handler);
+ if (uc->handler) {
+ rc = uc->handler(uc, regs);
+ WARN(rc & ~UPROBE_HANDLER_MASK,
+ "bad rc=0x%x from %pf()\n", rc, uc->handler);
+ }
remove &= rc;
}

--
1.8.1.4

2013-04-03 16:09:17

by Anton Arapov

Subject: [PATCH v1 9/9] uretprobes: Documentation update

Add the uretprobe syntax and update the examples.

Signed-off-by: Anton Arapov <[email protected]>
---
Documentation/trace/uprobetracer.txt | 114 ++++++++++++++++++++---------------
1 file changed, 67 insertions(+), 47 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index 24ce682..d9c3e68 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -1,6 +1,8 @@
- Uprobe-tracer: Uprobe-based Event Tracing
- =========================================
- Documentation written by Srikar Dronamraju
+ Uprobe-tracer: Uprobe-based Event Tracing
+ =========================================
+
+ Documentation written by Srikar Dronamraju
+

Overview
--------
@@ -13,78 +15,94 @@ current_tracer. Instead of that, add probe points via
/sys/kernel/debug/tracing/events/uprobes/<EVENT>/enabled.

However unlike kprobe-event tracer, the uprobe event interface expects the
-user to calculate the offset of the probepoint in the object
+user to calculate the offset of the probepoint in the object.

Synopsis of uprobe_tracer
-------------------------
- p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a probe
+ p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
+ r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
+ -:[GRP/]EVENT : Clear uprobe or uretprobe event

- GRP : Group name. If omitted, use "uprobes" for it.
- EVENT : Event name. If omitted, the event name is generated
- based on SYMBOL+offs.
- PATH : path to an executable or a library.
- SYMBOL[+offs] : Symbol+offset where the probe is inserted.
+ GRP : Group name. If omitted, "uprobes" is the default value.
+ EVENT : Event name. If omitted, the event name is generated based
+ on SYMBOL+offs.
+ PATH : Path to an executable or a library.
+ SYMBOL[+offs] : Symbol+offset where the probe is inserted.

- FETCHARGS : Arguments. Each probe can have up to 128 args.
- %REG : Fetch register REG
+ FETCHARGS : Arguments. Each probe can have up to 128 args.
+ %REG : Fetch register REG

Event Profiling
---------------
- You can check the total number of probe hits and probe miss-hits via
+You can check the total number of probe hits and probe miss-hits via
/sys/kernel/debug/tracing/uprobe_profile.
- The first column is event name, the second is the number of probe hits,
+The first column is event name, the second is the number of probe hits,
the third is the number of probe miss-hits.

Usage examples
--------------
-To add a probe as a new event, write a new definition to uprobe_events
-as below.
+ * Add a probe as a new uprobe event, write a new definition to uprobe_events
+as below: (sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash)
+
+ echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
+
+ * Add a probe as a new uretprobe event:
+
+ echo 'r: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
+
+ * Unset registered event:

- echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
+ echo '-:bash_0x4245c0' >> /sys/kernel/debug/tracing/uprobe_events

- This sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash
+ * Print out the events that are registered:

- echo > /sys/kernel/debug/tracing/uprobe_events
+ cat /sys/kernel/debug/tracing/uprobe_events

- This clears all probe points.
+ * Clear all events:

-The following example shows how to dump the instruction pointer and %ax
-a register at the probed text address. Here we are trying to probe
-function zfree in /bin/zsh
+ echo > /sys/kernel/debug/tracing/uprobe_events
+
+Following example shows how to dump the instruction pointer and %ax register
+at the probed text address. Probe zfree function in /bin/zsh:

# cd /sys/kernel/debug/tracing/
- # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
+ # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g DF .text 0000000000000012 Base zfree

-0x46420 is the offset of zfree in object /bin/zsh that is loaded at
-0x00400000. Hence the command to probe would be :
+ 0x46420 is the offset of zfree in object /bin/zsh that is loaded at
+ 0x00400000. Hence the command to uprobe would be:
+
+ # echo 'p:zfree_entry /bin/zsh:0x46420 %ip %ax' > uprobe_events
+
+ And the same for the uretprobe would be:

- # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
+ # echo 'r:zfree_exit /bin/zsh:0x46420 %ip %ax' >> uprobe_events

-Please note: User has to explicitly calculate the offset of the probepoint
+Please note: User has to explicitly calculate the offset of the probe-point
in the object. We can see the events that are registered by looking at the
uprobe_events file.

# cat uprobe_events
- p:uprobes/p_zsh_0x46420 /bin/zsh:0x00046420 arg1=%ip arg2=%ax
+ p:uprobes/zfree_entry /bin/zsh:0x00046420 arg1=%ip arg2=%ax
+ r:uprobes/zfree_exit /bin/zsh:0x00046420 arg1=%ip arg2=%ax

-The format of events can be seen by viewing the file events/uprobes/p_zsh_0x46420/format
+Format of events can be seen by viewing the file events/uprobes/zfree_entry/format

- # cat events/uprobes/p_zsh_0x46420/format
- name: p_zsh_0x46420
+ # cat events/uprobes/zfree_entry/format
+ name: zfree_entry
ID: 922
format:
- field:unsigned short common_type; offset:0; size:2; signed:0;
- field:unsigned char common_flags; offset:2; size:1; signed:0;
- field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
- field:int common_pid; offset:4; size:4; signed:1;
- field:int common_padding; offset:8; size:4; signed:1;
+ field:unsigned short common_type; offset:0; size:2; signed:0;
+ field:unsigned char common_flags; offset:2; size:1; signed:0;
+ field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
+ field:int common_pid; offset:4; size:4; signed:1;
+ field:int common_padding; offset:8; size:4; signed:1;

- field:unsigned long __probe_ip; offset:12; size:4; signed:0;
- field:u32 arg1; offset:16; size:4; signed:0;
- field:u32 arg2; offset:20; size:4; signed:0;
+ field:unsigned long __probe_ip; offset:12; size:4; signed:0;
+ field:u32 arg1; offset:16; size:4; signed:0;
+ field:u32 arg2; offset:20; size:4; signed:0;

print fmt: "(%lx) arg1=%lx arg2=%lx", REC->__probe_ip, REC->arg1, REC->arg2

@@ -94,6 +112,7 @@ Right after definition, each event is disabled by default. For tracing these
# echo 1 > events/uprobes/enable

Lets disable the event after sleeping for some time.
+
# sleep 20
# echo 0 > events/uprobes/enable

@@ -104,10 +123,11 @@ And you can see the traced information via /sys/kernel/debug/tracing/trace.
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
- zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
- zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
- zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
- zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
-
-Each line shows us probes were triggered for a pid 24842 with ip being
-0x446421 and contents of ax register being 79.
+ zsh-24842 [006] 258544.995456: zfree_entry: (0x446420) arg1=446420 arg2=79
+ zsh-24842 [007] 258545.000270: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0
+ zsh-24842 [002] 258545.043929: zfree_entry: (0x446420) arg1=446420 arg2=79
+ zsh-24842 [004] 258547.046129: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0
+
+Output shows us uprobe was triggered for a pid 24842 with ip being 0x446420
+and contents of ax register being 79. And uretprobe was triggered with ip at
+0x446540 with counterpart function entry at 0x446420.
--
1.8.1.4

2013-04-03 17:50:40

by Oleg Nesterov

Subject: Re: [PATCH v1 0/9] uretprobes: Return uprobes implementation

On 04/03, Anton Arapov wrote:
>
> Anton Arapov (9):
> uretprobes: Introduce uprobe_consumer->ret_handler()
> uretprobes: Reserve the first slot in xol_vma for trampoline
> uretprobes/x86: Hijack return address
> uretprobes/ppc: Hijack return address
> uretprobes: Return probe entry, prepare_uretprobe()
> uretprobes: Return probe exit, invoke handlers
> uretprobes: Limit the depth of return probe nestedness
> uretprobes: Remove -ENOSYS as return probes implemented
> uretprobes: Documentation update
>
> Documentation/trace/uprobetracer.txt | 126 +++++++++++++---------
> arch/powerpc/include/asm/uprobes.h | 1 +
> arch/powerpc/kernel/uprobes.c | 13 +++
> arch/x86/include/asm/uprobes.h | 1 +
> arch/x86/kernel/uprobes.c | 29 +++++
> include/linux/uprobes.h | 7 ++
> kernel/events/uprobes.c | 202 +++++++++++++++++++++++++++++++++--
> 7 files changed, 320 insertions(+), 59 deletions(-)

Looks fine to me. I am going to add this to
git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc uprobes/core

Ananth. "4/9 uretprobes/ppc" looks "obviously correct", but could you
please review and ack/nack ?

As a reminder, this series depends on other changes in the uprobes core
and in kernel/trace/trace_uprobe.c; please see the full stat below.

I'll wait a bit for comments; after that I'll ask Ingo to pull
if nobody objects.

Anton Arapov (9):
uretprobes: Introduce uprobe_consumer->ret_handler()
uretprobes: Reserve the first slot in xol_vma for trampoline
uretprobes/x86: Hijack return address
uretprobes/ppc: Hijack return address
uretprobes: Return probe entry, prepare_uretprobe()
uretprobes: Return probe exit, invoke handlers
uretprobes: Limit the depth of return probe nestedness
uretprobes: Remove -ENOSYS as return probes implemented
uretprobes: Documentation update

Oleg Nesterov (15):
uprobes: Turn copy_opcode() into copy_from_page()
uprobes: Change __copy_insn() to use copy_from_page()
uprobes: Kill the unnecesary filp != NULL check in __copy_insn()
uprobes: Introduce copy_to_page()
uprobes: Change write_opcode() to use copy_*page()
uprobes/tracing: Kill the pointless task_pt_regs() calls
uprobes/tracing: Kill the pointless seq_print_ip_sym() call
uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
uprobes/tracing: Generalize struct uprobe_trace_entry_head
uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
uprobes/tracing: Make seq_printf() code uretprobe-friendly
uprobes/tracing: Change create_trace_uprobe() to support uretprobes

Documentation/trace/uprobetracer.txt | 114 +++++++++-------
arch/powerpc/include/asm/uprobes.h | 1 +
arch/powerpc/kernel/uprobes.c | 13 ++
arch/x86/include/asm/uprobes.h | 1 +
arch/x86/kernel/uprobes.c | 29 ++++
include/linux/uprobes.h | 7 +
kernel/events/uprobes.c | 251 ++++++++++++++++++++++++++++------
kernel/trace/trace.h | 5 -
kernel/trace/trace_uprobe.c | 205 ++++++++++++++++++++--------
9 files changed, 480 insertions(+), 146 deletions(-)

Subject: Re: [PATCH v1 4/9] uretprobes/ppc: Hijack return address

On Wed, Apr 03, 2013 at 06:00:34PM +0200, Anton Arapov wrote:
> Hijack the return address and replace it with a trampoline address.
> PowerPC implementation.
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Ananth N Mavinakayanahalli <[email protected]>

> ---
> arch/powerpc/include/asm/uprobes.h | 1 +
> arch/powerpc/kernel/uprobes.c | 13 +++++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
> index b532060..2301602 100644
> --- a/arch/powerpc/include/asm/uprobes.h
> +++ b/arch/powerpc/include/asm/uprobes.h
> @@ -51,4 +51,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
> extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
> extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> +extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
> #endif /* _ASM_UPROBES_H */
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index bc77834..567b975 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -188,3 +188,16 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>
> return false;
> }
> +
> +unsigned long
> +arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
> +{
> + unsigned long orig_ret_vaddr;
> +
> + orig_ret_vaddr = regs->link;
> +
> + /* Replace the return addr with trampoline addr */
> + regs->link = trampoline_vaddr;
> +
> + return orig_ret_vaddr;
> +}
> --
> 1.8.1.4

Subject: Re: [PATCH v1 0/9] uretprobes: Return uprobes implementation

On Wed, Apr 03, 2013 at 07:45:52PM +0200, Oleg Nesterov wrote:
> On 04/03, Anton Arapov wrote:

...

> Looks fine to me. I am going to add this to
> git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc uprobes/core
>
> Ananth. "4/9 uretprobes/ppc" looks "obviously correct", but could you
> please review and ack/nack ?

Oleg,
4/9 is fine; I have acked it.

Ananth

2013-04-07 10:53:30

by Srikar Dronamraju

Subject: Re: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

* Anton Arapov <[email protected]> [2013-04-03 18:00:36]:

> Uretprobe handlers are invoked when the trampoline is hit, on completion the
> trampoline is replaced with the saved return address and the uretprobe instance
> deleted.
>
> v1 changes:
> * pass bp_vaddr to ret_handler()
> * simplify handle_uretprobe()
>
> RFCv6 changes:
> * rework handle_uretprobe()
>
> RFCv5 changes:
> * switch to simply linked list ->return_uprobes
> * rework handle_uretprobe()
>
> RFCv4 changes:
> * check, whether utask is not NULL in handle_uretprobe()
> * get rid of area->rp_trampoline_vaddr
> * minor handle_uretprobe() fixups
>
> RFCv3 changes:
> * protected uprobe with refcounter. See put_uprobe() in handle_uretprobe()
> that reflects increment in prepare_uretprobe()
>
> RFCv2 changes:
> * get rid of ->return_consumers member from struct uprobe, introduce
> ret_handler() in consumer instead
>
> Signed-off-by: Anton Arapov <[email protected]>
> ---
> kernel/events/uprobes.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 08ecfff..d129c1d 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1609,6 +1609,57 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> up_read(&uprobe->register_rwsem);
> }
>
> +static void
> +handler_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)

> +{
> + struct uprobe *uprobe = ri->uprobe;
> + struct uprobe_consumer *uc;
> +
> + down_read(&uprobe->register_rwsem);
> + for (uc = uprobe->consumers; uc; uc = uc->next) {
> + if (uc->ret_handler)
> + uc->ret_handler(uc, ri->func, regs);
> + }
> + up_read(&uprobe->register_rwsem);
> +}
> +
> +static bool handler_uretprobe(struct pt_regs *regs)

Nit: can this be renamed to handle_trampoline

> +{
> + struct uprobe_task *utask;
> + struct return_instance *ri, *tmp;
> + bool chained;
> +
> + utask = current->utask;
> + if (!utask)
> + return false;
> +
> + ri = utask->return_instances;
> + if (!ri)
> + return false;
> +
> + instruction_pointer_set(regs, ri->orig_ret_vaddr);

Should we add a check here before using the top-most ri?
What if the application had done a longjmp and the trampoline it hit
corresponds to something that is below in the stack?

Not sure if this is what you meant by leaking return instances in your next
patch.


> +
> + for (;;) {
> + handler_uretprobe_chain(ri, regs);
> +
> + chained = ri->chained;
> + put_uprobe(ri->uprobe);
> +
> + tmp = ri;
> + ri = ri->next;
> + kfree(tmp);
> +
> + if (!chained)
> + break;
> +
> + BUG_ON(!ri);
> + }
> +
> + utask->return_instances = ri;
> +
> + return true;
> +}
> +
> /*
> * Run handler and ask thread to singlestep.
> * Ensure all non-fatal signals cannot interrupt thread while it singlesteps.
> @@ -1620,8 +1671,15 @@ static void handle_swbp(struct pt_regs *regs)
> int uninitialized_var(is_swbp);
>
> bp_vaddr = uprobe_get_swbp_addr(regs);
> - uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> + if (bp_vaddr == get_trampoline_vaddr()) {
> + if (handler_uretprobe(regs))
> + return;
>
> + pr_warn("uprobe: unable to handle uretprobe pid/tgid=%d/%d\n",
> + current->pid, current->tgid);
> + }
> +
> + uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> if (!uprobe) {
> if (is_swbp > 0) {
> /* No matching uprobe; signal SIGTRAP. */
> --
> 1.8.1.4
>

2013-04-07 11:43:17

by Srikar Dronamraju

Subject: Re: [PATCH v1 1/9] uretprobes: Introduce uprobe_consumer->ret_handler()

* Anton Arapov <[email protected]> [2013-04-03 18:00:31]:

> Enclose return probes implementation, introduce ->ret_handler() and update
> existing code to rely on ->handler() *and* ->ret_handler() for uprobe and
> uretprobe respectively.
>
> v1 changes:
> * add bp_vaddr argument to ->ret_handler()
>
> RFCv5 changes:
> * don't remove uprobe in case there are no uprobe consumer(handler),
> see handler_chain() changes.
>
> RFCv3 changes: (the patch is introduced in v3)
> * check whether at least one of the consumer's handlers were set.
> * a 'TODO' cap that will be removed once return probes be implemented.
> * introduce ->ret_handler().
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> include/linux/uprobes.h | 3 +++
> kernel/events/uprobes.c | 17 ++++++++++++++---
> 2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index 02b83db..4042cad 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -46,6 +46,9 @@ enum uprobe_filter_ctx {
>
> struct uprobe_consumer {
> int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
> + int (*ret_handler)(struct uprobe_consumer *self,
> + unsigned long func,
> + struct pt_regs *regs);
> bool (*filter)(struct uprobe_consumer *self,
> enum uprobe_filter_ctx ctx,
> struct mm_struct *mm);
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 21d8a65..27c964b 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -815,6 +815,14 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *
> struct uprobe *uprobe;
> int ret;
>
> + /* Uprobe must have at least one set consumer */
> + if (!uc->handler && !uc->ret_handler)
> + return -EINVAL;
> +
> + /* TODO: Implement return probes */
> + if (uc->ret_handler)
> + return -ENOSYS;
> +
> /* Racy, just to catch the obvious mistakes */
> if (offset > i_size_read(inode))
> return -EINVAL;
> @@ -1473,10 +1481,13 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
>
> down_read(&uprobe->register_rwsem);
> for (uc = uprobe->consumers; uc; uc = uc->next) {
> - int rc = uc->handler(uc, regs);
> + int rc = 0;
>
> - WARN(rc & ~UPROBE_HANDLER_MASK,
> - "bad rc=0x%x from %pf()\n", rc, uc->handler);
> + if (uc->handler) {
> + rc = uc->handler(uc, regs);
> + WARN(rc & ~UPROBE_HANDLER_MASK,
> + "bad rc=0x%x from %pf()\n", rc, uc->handler);
> + }
> remove &= rc;
> }
>
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 11:50:27

by Srikar Dronamraju

Subject: Re: [PATCH v1 2/9] uretprobes: Reserve the first slot in xol_vma for trampoline

* Anton Arapov <[email protected]> [2013-04-03 18:00:32]:

> Allocate trampoline page, as the very first one in uprobed
> task xol area, and fill it with breakpoint opcode.
>
> Also introduce get_trampoline_vaddr() helper, to wrap the
> trampoline address extraction from area->vaddr. That removes
> confusion and eases the debug experience in case ->vaddr
> notion will be changed.
>
> v1 changes:
> * rework get_trampoline_vaddr() helper.
> * init xol_area->slot_count.
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

(one small check below:)
> ---
> kernel/events/uprobes.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 27c964b..d3c8201 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1109,6 +1109,7 @@ static struct xol_area *get_xol_area(void)
> {
> struct mm_struct *mm = current->mm;
> struct xol_area *area;
> + uprobe_opcode_t insn = UPROBE_SWBP_INSN;
>
> area = mm->uprobes_state.xol_area;
> if (area)
> @@ -1126,7 +1127,12 @@ static struct xol_area *get_xol_area(void)
> if (!area->page)
> goto free_bitmap;
>
> + /* allocate first slot of task's xol_area for the return probes */
> + set_bit(0, area->bitmap);
> + copy_to_page(area->page, 0, &insn, UPROBE_SWBP_INSN_SIZE);
> + atomic_set(&area->slot_count, 1);
> init_waitqueue_head(&area->wq);
> +
> if (!xol_add_vma(area))
> return area;
>
> @@ -1323,6 +1329,25 @@ static struct uprobe_task *get_utask(void)
> return current->utask;
> }
>
> +/*
> + * Current area->vaddr notion assume the trampoline address is always
> + * equal area->vaddr.
> + *
> + * Returns -1 in case the xol_area is not allocated.
> + */
> +static unsigned long get_trampoline_vaddr(void)
> +{
> + struct xol_area *area;
> + unsigned long trampoline_vaddr = -1;
> +
> + area = current->mm->uprobes_state.xol_area;
> + smp_read_barrier_depends();

check: do we need this barrier?

> + if (area)
> + trampoline_vaddr = area->vaddr;
> +
> + return trampoline_vaddr;
> +}
> +
> /* Prepare to single-step probed instruction out of line. */
> static int
> pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 11:54:46

by Srikar Dronamraju

Subject: Re: [PATCH v1 3/9] uretprobes/x86: Hijack return address

* Anton Arapov <[email protected]> [2013-04-03 18:00:33]:

> Hijack the return address and replace it with a trampoline address.
>
> v1 changes:
> * use force_sig_info()
> * rework and simplify logic
>
> RFCv5 changes:
> * change the fail return code, because orig_ret_vaddr=0 is possible
> * style fixup
> RFCv2 changes:
> * remove ->doomed flag, kill task immediately
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> arch/x86/include/asm/uprobes.h | 1 +
> arch/x86/kernel/uprobes.c | 29 +++++++++++++++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
> index 8ff8be7..6e51979 100644
> --- a/arch/x86/include/asm/uprobes.h
> +++ b/arch/x86/include/asm/uprobes.h
> @@ -55,4 +55,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
> extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
> extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> +extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
> #endif /* _ASM_UPROBES_H */
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 0ba4cfb..2ed8459 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -697,3 +697,32 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> send_sig(SIGTRAP, current, 0);
> return ret;
> }
> +
> +unsigned long
> +arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
> +{
> + int rasize, ncopied;
> + unsigned long orig_ret_vaddr = 0; /* clear high bits for 32-bit apps */
> +
> + rasize = is_ia32_task() ? 4 : 8;
> + ncopied = copy_from_user(&orig_ret_vaddr, (void __user *)regs->sp, rasize);
> + if (unlikely(ncopied))
> + return -1;
> +
> + /* check whether address has been already hijacked */
> + if (orig_ret_vaddr == trampoline_vaddr)
> + return orig_ret_vaddr;
> +
> + ncopied = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, rasize);
> + if (likely(!ncopied))
> + return orig_ret_vaddr;
> +
> + if (ncopied != rasize) {
> + pr_err("uprobe: return address clobbered: pid=%d, %%sp=%#lx, "
> + "%%ip=%#lx\n", current->pid, regs->sp, regs->ip);
> +
> + force_sig_info(SIGSEGV, SEND_SIG_FORCED, current);
> + }
> +
> + return -1;
> +}
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 11:57:03

by Srikar Dronamraju

Subject: Re: [PATCH v1 4/9] uretprobes/ppc: Hijack return address

* Anton Arapov <[email protected]> [2013-04-03 18:00:34]:

> Hijack the return address and replace it with a trampoline address.
> PowerPC implementation.
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> arch/powerpc/include/asm/uprobes.h | 1 +
> arch/powerpc/kernel/uprobes.c | 13 +++++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
> index b532060..2301602 100644
> --- a/arch/powerpc/include/asm/uprobes.h
> +++ b/arch/powerpc/include/asm/uprobes.h
> @@ -51,4 +51,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
> extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
> extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
> +extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
> #endif /* _ASM_UPROBES_H */
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index bc77834..567b975 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -188,3 +188,16 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>
> return false;
> }
> +
> +unsigned long
> +arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
> +{
> + unsigned long orig_ret_vaddr;
> +
> + orig_ret_vaddr = regs->link;
> +
> + /* Replace the return addr with trampoline addr */
> + regs->link = trampoline_vaddr;
> +
> + return orig_ret_vaddr;
> +}
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 11:58:46

by Srikar Dronamraju

Subject: Re: [PATCH v1 5/9] uretprobes: Return probe entry, prepare_uretprobe()

* Anton Arapov <[email protected]> [2013-04-03 18:00:35]:

> When a uprobe with return probe consumer is hit, prepare_uretprobe()
> function is invoked. It creates return_instance, hijacks return address
> and replaces it with the trampoline.
>
> * Return instances are kept as stack per uprobed task.
> * Return instance is chained, when the original return address is
> trampoline's page vaddr (e.g. recursive call of the probed function).
>
> v1 changes:
> * preserve address of the breakpoint in return_instance.
> * don't forget NULLify return_instances on free_utask.
> * simplify prepare_uretprobe().
>
> RFCv6 changes:
> * rework prepare_uretprobe() logic in order to make further unwinding
> in handler_uretprobe() simplier.
> * introduce the 'dirty' field.
>
> RFCv5 changes:
> * switch from hlist to simply linked list for tracking ->*return_uprobes.
> * preallocate first slot xol_area for return probes, see xol_get_area()
> changes.
> * add get_trampoline_vaddr() helper, to emphasize area->vaddr overload.
>
> RFCv4 changes:
> * get rid of area->rp_trampoline_vaddr as it always the same as ->vaddr.
> * cleanup ->return_uprobes list in uprobe_free_utask(), because the
> task can exit from inside the ret-probe'd function(s).
> * in find_active_uprobe(): Once we inserted "int3" we must ensure that
> handle_swbp() will be called even if this uprobe goes away. We have
> the reference but it only protects uprobe itself, it can't protect
> agains delete_uprobe().
> IOW, we must ensure that uprobe_pre_sstep_notifier() can't return 0.
>
> RFCv3 changes:
> * protected uprobe with refcounter. See atomic_inc in prepare_uretprobe()
> and put_uprobe() in a following patch in handle_uretprobe().
>
> RFCv2 changes:
> * get rid of ->return_consumers member from struct uprobe, introduce
> ret_handler() in consumer.
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> include/linux/uprobes.h | 1 +
> kernel/events/uprobes.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 92 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index 4042cad..5f8960e 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -71,6 +71,7 @@ struct uprobe_task {
> enum uprobe_task_state state;
> struct arch_uprobe_task autask;
>
> + struct return_instance *return_instances;
> struct uprobe *active_uprobe;
>
> unsigned long xol_vaddr;
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index d3c8201..08ecfff 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -75,6 +75,15 @@ struct uprobe {
> struct arch_uprobe arch;
> };
>
> +struct return_instance {
> + struct uprobe *uprobe;
> + unsigned long func;
> + unsigned long orig_ret_vaddr; /* original return address */
> + bool chained; /* true, if instance is nested */
> +
> + struct return_instance *next; /* keep as stack */
> +};
> +
> /*
> * valid_vma: Verify if the specified vma is an executable vma
> * Relax restrictions while unregistering: vm_flags might have
> @@ -1294,6 +1303,7 @@ unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs)
> void uprobe_free_utask(struct task_struct *t)
> {
> struct uprobe_task *utask = t->utask;
> + struct return_instance *ri, *tmp;
>
> if (!utask)
> return;
> @@ -1301,6 +1311,15 @@ void uprobe_free_utask(struct task_struct *t)
> if (utask->active_uprobe)
> put_uprobe(utask->active_uprobe);
>
> + ri = utask->return_instances;
> + while (ri) {
> + tmp = ri;
> + ri = ri->next;
> +
> + put_uprobe(tmp->uprobe);
> + kfree(tmp);
> + }
> +
> xol_free_insn_slot(t);
> kfree(utask);
> t->utask = NULL;
> @@ -1348,6 +1367,65 @@ static unsigned long get_trampoline_vaddr(void)
> return trampoline_vaddr;
> }
>
> +static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> +{
> + struct return_instance *ri;
> + struct uprobe_task *utask;
> + unsigned long orig_ret_vaddr, trampoline_vaddr;
> + bool chained = false;
> +
> + if (!get_xol_area())
> + return;
> +
> + utask = get_utask();
> + if (!utask)
> + return;
> +
> + ri = kzalloc(sizeof(struct return_instance), GFP_KERNEL);
> + if (!ri)
> + goto fail;
> +
> + trampoline_vaddr = get_trampoline_vaddr();
> + orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
> + if (orig_ret_vaddr == -1)
> + goto fail;
> +
> + /*
> + * We don't want to keep trampoline address in stack, rather keep the
> + * original return address of first caller thru all the consequent
> + * instances. This also makes breakpoint unwrapping easier.
> + */
> + if (orig_ret_vaddr == trampoline_vaddr) {
> + if (!utask->return_instances) {
> + /*
> + * This situation is not possible. Likely we have an
> + * attack from user-space.
> + */
> + pr_warn("uprobe: unable to set uretprobe pid/tgid=%d/%d\n",
> + current->pid, current->tgid);
> + goto fail;
> + }
> +
> + chained = true;
> + orig_ret_vaddr = utask->return_instances->orig_ret_vaddr;
> + }
> +
> + atomic_inc(&uprobe->ref);
> + ri->uprobe = uprobe;
> + ri->func = instruction_pointer(regs);
> + ri->orig_ret_vaddr = orig_ret_vaddr;
> + ri->chained = chained;
> +
> + /* add instance to the stack */
> + ri->next = utask->return_instances;
> + utask->return_instances = ri;
> +
> + return;
> +
> + fail:
> + kfree(ri);
> +}
> +
> /* Prepare to single-step probed instruction out of line. */
> static int
> pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
> @@ -1503,6 +1581,7 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> {
> struct uprobe_consumer *uc;
> int remove = UPROBE_HANDLER_REMOVE;
> + bool need_prep = false; /* prepare return uprobe, when needed */
>
> down_read(&uprobe->register_rwsem);
> for (uc = uprobe->consumers; uc; uc = uc->next) {
> @@ -1513,9 +1592,16 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> WARN(rc & ~UPROBE_HANDLER_MASK,
> "bad rc=0x%x from %pf()\n", rc, uc->handler);
> }
> +
> + if (uc->ret_handler)
> + need_prep = true;
> +
> remove &= rc;
> }
>
> + if (need_prep && !remove)
> + prepare_uretprobe(uprobe, regs); /* put bp at return */
> +
> if (remove && uprobe->consumers) {
> WARN_ON(!uprobe_is_active(uprobe));
> unapply_uprobe(uprobe, current->mm);
> @@ -1634,7 +1720,11 @@ void uprobe_notify_resume(struct pt_regs *regs)
> */
> int uprobe_pre_sstep_notifier(struct pt_regs *regs)
> {
> - if (!current->mm || !test_bit(MMF_HAS_UPROBES, &current->mm->flags))
> + if (!current->mm)
> + return 0;
> +
> + if (!test_bit(MMF_HAS_UPROBES, &current->mm->flags) &&
> + (!current->utask || !current->utask->return_instances))
> return 0;
>
> set_thread_flag(TIF_UPROBE);
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 12:01:15

by Srikar Dronamraju

Subject: Re: [PATCH v1 7/9] uretprobes: Limit the depth of return probe nestedness

* Anton Arapov <[email protected]> [2013-04-03 18:00:37]:

> Unlike kretprobes, we can't trust user space and thus must protect
> ourselves from user-space attacks. User space has an "unlimited"
> stack, and this patch limits the nestedness of return probes as a
> simple remedy for that.
>
> Note that this implementation leaks return_instance on siglongjmp
> until exit()/exec().
>
> The intention is to have a KISS, bare-minimum solution for the
> initial implementation, in order not to complicate the uretprobes
> code.
>
> In the future we may come up with a more sophisticated solution that
> removes this depth limitation. It is not an easy task and lies beyond
> this patchset.
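>
> For illustration, a hypothetical user-space scenario that hits the
> limit (assuming a uretprobe on recurse() below): every live call adds
> one return_instance, so the calls nested deeper than
> MAX_URETPROBE_DEPTH simply do not get their ret_handler invoked.
>
>         /* hypothetical illustration; assumes a uretprobe on recurse() */
>         #include <stdio.h>
>
>         static int recurse(int n)
>         {
>                 if (n == 0)
>                         return 0;
>                 return 1 + recurse(n - 1);      /* nests deeper than the 64 limit */
>         }
>
>         int main(void)
>         {
>                 printf("depth reached: %d\n", recurse(100));
>                 return 0;
>         }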
>
> Signed-off-by: Anton Arapov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> include/linux/uprobes.h | 3 +++
> kernel/events/uprobes.c | 11 +++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index 5f8960e..d7bcf10 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -38,6 +38,8 @@ struct inode;
> #define UPROBE_HANDLER_REMOVE 1
> #define UPROBE_HANDLER_MASK 1
>
> +#define MAX_URETPROBE_DEPTH 64
> +
> enum uprobe_filter_ctx {
> UPROBE_FILTER_REGISTER,
> UPROBE_FILTER_UNREGISTER,
> @@ -72,6 +74,7 @@ struct uprobe_task {
> struct arch_uprobe_task autask;
>
> struct return_instance *return_instances;
> + unsigned int depth;
> struct uprobe *active_uprobe;
>
> unsigned long xol_vaddr;
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index d129c1d..489f5e3 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1381,6 +1381,13 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> if (!utask)
> return;
>
> + if (utask->depth >= MAX_URETPROBE_DEPTH) {
> + printk_ratelimited(KERN_INFO "uprobe: omit uretprobe due to"
> + " nestedness limit pid/tgid=%d/%d\n",
> + current->pid, current->tgid);
> + return;
> + }
> +
> ri = kzalloc(sizeof(struct return_instance), GFP_KERNEL);
> if (!ri)
> goto fail;
> @@ -1416,6 +1423,8 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
> ri->orig_ret_vaddr = orig_ret_vaddr;
> ri->chained = chained;
>
> + utask->depth++;
> +
> /* add instance to the stack */
> ri->next = utask->return_instances;
> utask->return_instances = ri;
> @@ -1652,6 +1661,8 @@ static bool handler_uretprobe(struct pt_regs *regs)
> if (!chained)
> break;
>
> + utask->depth--;
> +
> BUG_ON(!ri);
> }
>
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 12:01:54

by Srikar Dronamraju

[permalink] [raw]
Subject: Re: [PATCH v1 8/9] uretprobes: Remove -ENOSYS as return probes implemented

* Anton Arapov <[email protected]> [2013-04-03 18:00:38]:

> Conclude the return probes implementation: now that return probes are
> implemented, drop the -ENOSYS stub.
>
> Signed-off-by: Anton Arapov <[email protected]>
> ---

Acked-by: Srikar Dronamraju <[email protected]>

> kernel/events/uprobes.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 489f5e3..9af52f7 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -828,10 +828,6 @@ int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *
> if (!uc->handler && !uc->ret_handler)
> return -EINVAL;
>
> - /* TODO: Implement return probes */
> - if (uc->ret_handler)
> - return -ENOSYS;
> -
> /* Racy, just to catch the obvious mistakes */
> if (offset > i_size_read(inode))
> return -EINVAL;
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-07 12:03:00

by Srikar Dronamraju

[permalink] [raw]
Subject: Re: [PATCH v1 9/9] uretprobes: Documentation update

* Anton Arapov <[email protected]> [2013-04-03 18:00:39]:

> add the uretprobe syntax and update an example
>
> Signed-off-by: Anton Arapov <[email protected]>
> ---

Acked-by: Srikar Dronamraju <[email protected]>

> Documentation/trace/uprobetracer.txt | 114 ++++++++++++++++++++---------------
> 1 file changed, 67 insertions(+), 47 deletions(-)
>
> diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
> index 24ce682..d9c3e68 100644
> --- a/Documentation/trace/uprobetracer.txt
> +++ b/Documentation/trace/uprobetracer.txt
> @@ -1,6 +1,8 @@
> - Uprobe-tracer: Uprobe-based Event Tracing
> - =========================================
> - Documentation written by Srikar Dronamraju
> + Uprobe-tracer: Uprobe-based Event Tracing
> + =========================================
> +
> + Documentation written by Srikar Dronamraju
> +
>
> Overview
> --------
> @@ -13,78 +15,94 @@ current_tracer. Instead of that, add probe points via
> /sys/kernel/debug/tracing/events/uprobes/<EVENT>/enabled.
>
> However unlike kprobe-event tracer, the uprobe event interface expects the
> -user to calculate the offset of the probepoint in the object
> +user to calculate the offset of the probepoint in the object.
>
> Synopsis of uprobe_tracer
> -------------------------
> - p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a probe
> + p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
> + r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
> + -:[GRP/]EVENT : Clear uprobe or uretprobe event
>
> - GRP : Group name. If omitted, use "uprobes" for it.
> - EVENT : Event name. If omitted, the event name is generated
> - based on SYMBOL+offs.
> - PATH : path to an executable or a library.
> - SYMBOL[+offs] : Symbol+offset where the probe is inserted.
> + GRP : Group name. If omitted, "uprobes" is the default value.
> + EVENT : Event name. If omitted, the event name is generated based
> + on SYMBOL+offs.
> + PATH : Path to an executable or a library.
> + SYMBOL[+offs] : Symbol+offset where the probe is inserted.
>
> - FETCHARGS : Arguments. Each probe can have up to 128 args.
> - %REG : Fetch register REG
> + FETCHARGS : Arguments. Each probe can have up to 128 args.
> + %REG : Fetch register REG
>
> Event Profiling
> ---------------
> - You can check the total number of probe hits and probe miss-hits via
> +You can check the total number of probe hits and probe miss-hits via
> /sys/kernel/debug/tracing/uprobe_profile.
> - The first column is event name, the second is the number of probe hits,
> +The first column is event name, the second is the number of probe hits,
> the third is the number of probe miss-hits.
>
> Usage examples
> --------------
> -To add a probe as a new event, write a new definition to uprobe_events
> -as below.
> + * Add a probe as a new uprobe event, write a new definition to uprobe_events
> +as below: (sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash)
> +
> + echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
> +
> + * Add a probe as a new uretprobe event:
> +
> + echo 'r: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
> +
> + * Unset registered event:
>
> - echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
> + echo '-:bash_0x4245c0' >> /sys/kernel/debug/tracing/uprobe_events
>
> - This sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash
> + * Print out the events that are registered:
>
> - echo > /sys/kernel/debug/tracing/uprobe_events
> + cat /sys/kernel/debug/tracing/uprobe_events
>
> - This clears all probe points.
> + * Clear all events:
>
> -The following example shows how to dump the instruction pointer and %ax
> -a register at the probed text address. Here we are trying to probe
> -function zfree in /bin/zsh
> + echo > /sys/kernel/debug/tracing/uprobe_events
> +
> +Following example shows how to dump the instruction pointer and %ax register
> +at the probed text address. Probe zfree function in /bin/zsh:
>
> # cd /sys/kernel/debug/tracing/
> - # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
> + # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
> 00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
> # objdump -T /bin/zsh | grep -w zfree
> 0000000000446420 g DF .text 0000000000000012 Base zfree
>
> -0x46420 is the offset of zfree in object /bin/zsh that is loaded at
> -0x00400000. Hence the command to probe would be :
> + 0x46420 is the offset of zfree in object /bin/zsh that is loaded at
> + 0x00400000. Hence the command to uprobe would be:
> +
> + # echo 'p:zfree_entry /bin/zsh:0x46420 %ip %ax' > uprobe_events
> +
> + And the same for the uretprobe would be:
>
> - # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
> + # echo 'r:zfree_exit /bin/zsh:0x46420 %ip %ax' >> uprobe_events
>
> -Please note: User has to explicitly calculate the offset of the probepoint
> +Please note: User has to explicitly calculate the offset of the probe-point
> in the object. We can see the events that are registered by looking at the
> uprobe_events file.
>
> # cat uprobe_events
> - p:uprobes/p_zsh_0x46420 /bin/zsh:0x00046420 arg1=%ip arg2=%ax
> + p:uprobes/zfree_entry /bin/zsh:0x00046420 arg1=%ip arg2=%ax
> + r:uprobes/zfree_exit /bin/zsh:0x00046420 arg1=%ip arg2=%ax
>
> -The format of events can be seen by viewing the file events/uprobes/p_zsh_0x46420/format
> +Format of events can be seen by viewing the file events/uprobes/zfree_entry/format
>
> - # cat events/uprobes/p_zsh_0x46420/format
> - name: p_zsh_0x46420
> + # cat events/uprobes/zfree_entry/format
> + name: zfree_entry
> ID: 922
> format:
> - field:unsigned short common_type; offset:0; size:2; signed:0;
> - field:unsigned char common_flags; offset:2; size:1; signed:0;
> - field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> - field:int common_pid; offset:4; size:4; signed:1;
> - field:int common_padding; offset:8; size:4; signed:1;
> + field:unsigned short common_type; offset:0; size:2; signed:0;
> + field:unsigned char common_flags; offset:2; size:1; signed:0;
> + field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> + field:int common_pid; offset:4; size:4; signed:1;
> + field:int common_padding; offset:8; size:4; signed:1;
>
> - field:unsigned long __probe_ip; offset:12; size:4; signed:0;
> - field:u32 arg1; offset:16; size:4; signed:0;
> - field:u32 arg2; offset:20; size:4; signed:0;
> + field:unsigned long __probe_ip; offset:12; size:4; signed:0;
> + field:u32 arg1; offset:16; size:4; signed:0;
> + field:u32 arg2; offset:20; size:4; signed:0;
>
> print fmt: "(%lx) arg1=%lx arg2=%lx", REC->__probe_ip, REC->arg1, REC->arg2
>
> @@ -94,6 +112,7 @@ Right after definition, each event is disabled by default. For tracing these
> # echo 1 > events/uprobes/enable
>
> Lets disable the event after sleeping for some time.
> +
> # sleep 20
> # echo 0 > events/uprobes/enable
>
> @@ -104,10 +123,11 @@ And you can see the traced information via /sys/kernel/debug/tracing/trace.
> #
> # TASK-PID CPU# TIMESTAMP FUNCTION
> # | | | | |
> - zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
> - zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
> - zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
> - zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
> -
> -Each line shows us probes were triggered for a pid 24842 with ip being
> -0x446421 and contents of ax register being 79.
> + zsh-24842 [006] 258544.995456: zfree_entry: (0x446420) arg1=446420 arg2=79
> + zsh-24842 [007] 258545.000270: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0
> + zsh-24842 [002] 258545.043929: zfree_entry: (0x446420) arg1=446420 arg2=79
> + zsh-24842 [004] 258547.046129: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0
> +
> +Output shows us uprobe was triggered for a pid 24842 with ip being 0x446420
> +and contents of ax register being 79. And uretprobe was triggered with ip at
> +0x446540 with counterpart function entry at 0x446420.
> --
> 1.8.1.4
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-09 14:08:24

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

On 04/07, Srikar Dronamraju wrote:
>
> > +static void
> > +handler_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
>
> > +{
> > + struct uprobe *uprobe = ri->uprobe;
> > + struct uprobe_consumer *uc;
> > +
> > + down_read(&uprobe->register_rwsem);
> > + for (uc = uprobe->consumers; uc; uc = uc->next) {
> > + if (uc->ret_handler)
> > + uc->ret_handler(uc, ri->func, regs);
> > + }
> > + up_read(&uprobe->register_rwsem);
> > +}
> > +
> > +static bool handler_uretprobe(struct pt_regs *regs)
>
> Nit: can this be renamed to handle_trampoline?

Agreed.

And probably handler_uretprobe_chain() should be ret_handler_chain()
to match handler_chain().

> > +{
> > + struct uprobe_task *utask;
> > + struct return_instance *ri, *tmp;
> > + bool chained;
> > +
> > + utask = current->utask;
> > + if (!utask)
> > + return false;
> > +
> > + ri = utask->return_instances;
> > + if (!ri)
> > + return false;
> > +
> > + instruction_pointer_set(regs, ri->orig_ret_vaddr);
>
> Should we add a check here before using the topmost ri?
> What if the application had done a longjmp and the trampoline it hit
> corresponds to something that's below in the stack?
>
> Not sure if this is what you meant by leaking return instances in your next
> patch.

Oh yes, this should be documented more explicitly in the changelog of
this patch or 7/9 (which tries to document the limitations but should
be more clear).

Currently we do not support longjmp() and we assume that the probed
function should do the regular return. We should certainly try to improve
this, but I really think that this should go into the next series.
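
For illustration, a minimal user-space sketch of the case we do not
handle (hypothetical, not taken from the patches): with a uretprobe on
work(), the longjmp() below bypasses work()'s normal return, so its
return_instance stays pending on utask->return_instances; a later
trampoline hit would see this stale entry first, and otherwise it leaks
until exit()/exec(), as noted in 7/9.

        /* hypothetical example; assumes a uretprobe on work() */
        #include <setjmp.h>
        #include <stdio.h>

        static jmp_buf env;

        static void work(void)
        {
                longjmp(env, 1);        /* never returns through the trampoline */
        }

        int main(void)
        {
                if (setjmp(env) == 0)
                        work();         /* return address was hijacked here */
                printf("back in main via longjmp\n");
                return 0;
        }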

Because this is nontrivial, needs more discussion, and I'm afraid should
be per-arch. Even on x86 (which can check the stack) this is not simple,
in general we can't know how to check that (to simplify) the first frame
is already invalid. Just for example, we could check regs->sp and detect
that longjmp() was called but sigaltstack() can easily fool this logic.

Or we can change prepare_uretprobe() to alloc the new slot for the
trampoline every time (and mark it as "trampoline" for handle_swbp() of
course), this way we can easily discard the invalid ret_instance's in
handler_uretprobe(). But this doesn't solve all problems and this is
not really nice/simple.

In short. I think we should document the limitations more clearly, push
this functionality, then try to improve things. Do you agree?

Oleg.

2013-04-09 14:20:55

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v1 2/9] uretprobes: Reserve the first slot in xol_vma for trampoline

On 04/07, Srikar Dronamraju wrote:
>
> * Anton Arapov <[email protected]> [2013-04-03 18:00:32]:
>
> > Allocate trampoline page, as the very first one in uprobed
> > task xol area, and fill it with breakpoint opcode.
> >
> > Also introduce get_trampoline_vaddr() helper, to wrap the
> > trampoline address extraction from area->vaddr. That removes
> > confusion and eases the debug experience in case ->vaddr
> > notion will be changed.
> >
> > v1 changes:
> > * rework get_trampoline_vaddr() helper.
> > * init xol_area->slot_count.
> >
> > Signed-off-by: Anton Arapov <[email protected]>
>
> Acked-by: Srikar Dronamraju <[email protected]>

Thanks!

> > +static unsigned long get_trampoline_vaddr(void)
> > +{
> > + struct xol_area *area;
> > + unsigned long trampoline_vaddr = -1;
> > +
> > + area = current->mm->uprobes_state.xol_area;
> > + smp_read_barrier_depends();
>
> check: do we need this barrier?

In theory yes. For example, we do not want the false positive in
handle_swbp() which does

if (bp_vaddr == get_trampoline_vaddr())
handler_uretprobe();

In theory we can race with another thread which initializes area
(in particular area->vaddr) and then sets uprobes_state.xol_area = area
in xol_add_vma().

If we see ->xol_area != NULL we must ensure that we read the correct
value of area->vaddr, so we need a barrier.

In short: please note the smp_read_barrier_depends() in get_xol_area();
get_trampoline_vaddr() needs it for the same reason.
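
To illustrate the pairing (a simplified kernel-context sketch with
made-up names, not the actual uprobes code): the writer must order its
initialization of area->vaddr before publishing the pointer, and the
reader must order the pointer load before the dereference.

        /* simplified sketch; names are illustrative, not the real uprobes ones */
        struct area_sketch {
                unsigned long vaddr;
        };

        static struct area_sketch *published;           /* stands in for ->xol_area */

        static void publish_area(struct area_sketch *area, unsigned long vaddr)
        {
                area->vaddr = vaddr;
                smp_wmb();                              /* order ->vaddr before the pointer store */
                published = area;                       /* like xol_add_vma() setting xol_area */
        }

        static unsigned long read_trampoline_vaddr(void)
        {
                struct area_sketch *area = published;

                smp_read_barrier_depends();             /* pairs with the smp_wmb() above */
                return area ? area->vaddr : -1;
        }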

Oleg.

2013-04-09 20:16:00

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

On 04/09, Oleg Nesterov wrote:
>
> > Should we add a check here before using the topmost ri?
> > What if the application had done a longjmp and the trampoline it hit
> > corresponds to something that's below in the stack?
> >
> > Not sure if this is what you meant by leaking return instances in your next
> > patch.
>
> Oh yes, this should be documented more explicitly in the changelog of
> this patch or 7/9 (which tries to document the limitations but should
> be more clear).
>
> Currently we do not support longjmp() and we assume that the probed
> function should do the regular return. We should certainly try to improve
> this, but I really think that this should go into the next series.
>
> Because this is nontrivial, needs more discussion, and I'm afraid should
> be per-arch. Even on x86 (which can check the stack) this is not simple,
> in general we can't know how to check that (to simplify) the first frame
> is already invalid. Just for example, we could check regs->sp and detect
> that longjmp() was called but sigaltstack() can easily fool this logic.
>
> Or we can change prepare_uretprobe() to alloc the new slot for the
> trampoline every time (and mark it as "trampoline" for handle_swbp() of
> course), this way we can easily discard the invalid ret_instance's in
> handler_uretprobe(). But this doesn't solve all problems and this is
> not really nice/simple.
>
> In short. I think we should document the limitations more clearly, push
> this functionality, then try to improve things. Do you agree?

IOW. Will you agree with v2 below?

Changes:

- s/handler_uretprobe/handle_trampoline/

- s/handler_uretprobe_chain/handle_uretprobe_chain/

- add the TODO: comments into the changelog and
handle_trampoline().


-------------------------------------------------------------------------------
[PATCH v2] uretprobes: Return probe exit, invoke handlers

Uretprobe handlers are invoked when the trampoline is hit; on completion,
the trampoline is replaced with the saved return address and the uretprobe
instance is deleted.

TODO: handle_trampoline() assumes that ->return_instances is always valid.
We should teach it to handle longjmp(), which can invalidate the pending
return_instance's. This is nontrivial; we will try to do this in a separate
series.

Signed-off-by: Anton Arapov <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
---
kernel/events/uprobes.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 64 insertions(+), 1 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index e352e18..b9c4325 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1633,6 +1633,62 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
up_read(&uprobe->register_rwsem);
}

+static void
+handle_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
+{
+ struct uprobe *uprobe = ri->uprobe;
+ struct uprobe_consumer *uc;
+
+ down_read(&uprobe->register_rwsem);
+ for (uc = uprobe->consumers; uc; uc = uc->next) {
+ if (uc->ret_handler)
+ uc->ret_handler(uc, ri->func, regs);
+ }
+ up_read(&uprobe->register_rwsem);
+}
+
+static bool handle_trampoline(struct pt_regs *regs)
+{
+ struct uprobe_task *utask;
+ struct return_instance *ri, *tmp;
+ bool chained;
+
+ utask = current->utask;
+ if (!utask)
+ return false;
+
+ ri = utask->return_instances;
+ if (!ri)
+ return false;
+
+ /*
+ * TODO: we should throw out return_instance's invalidated by
+ * longjmp(), currently we assume that the probed function always
+ * returns.
+ */
+ instruction_pointer_set(regs, ri->orig_ret_vaddr);
+
+ for (;;) {
+ handle_uretprobe_chain(ri, regs);
+
+ chained = ri->chained;
+ put_uprobe(ri->uprobe);
+
+ tmp = ri;
+ ri = ri->next;
+ kfree(tmp);
+
+ if (!chained)
+ break;
+
+ BUG_ON(!ri);
+ }
+
+ utask->return_instances = ri;
+
+ return true;
+}
+
/*
* Run handler and ask thread to singlestep.
* Ensure all non-fatal signals cannot interrupt thread while it singlesteps.
@@ -1644,8 +1700,15 @@ static void handle_swbp(struct pt_regs *regs)
int uninitialized_var(is_swbp);

bp_vaddr = uprobe_get_swbp_addr(regs);
- uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
+ if (bp_vaddr == get_trampoline_vaddr()) {
+ if (handle_trampoline(regs))
+ return;

+ pr_warn("uprobe: unable to handle uretprobe pid/tgid=%d/%d\n",
+ current->pid, current->tgid);
+ }
+
+ uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
if (!uprobe) {
if (is_swbp > 0) {
/* No matching uprobe; signal SIGTRAP. */
--
1.5.5.1

2013-04-13 10:07:22

by Srikar Dronamraju

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

* Oleg Nesterov <[email protected]> [2013-04-09 22:13:02]:

> On 04/09, Oleg Nesterov wrote:
> >
> > > Should we add a check here before using the topmost ri?
> > > What if the application had done a longjmp and the trampoline it hit
> > > corresponds to something that's below in the stack?
> > >
> > > Not sure if this is what you meant by leaking return instances in your next
> > > patch.
> >
> > Oh yes, this should be documented more explicitly in the changelog of
> > this patch or 7/9 (which tries to document the limitations but should
> > be more clear).
> >
> > Currently we do not support longjmp() and we assume that the probed
> > function should do the regular return. We should certainly try to improve
> > this, but I really think that this should go into the next series.
> >
> > Because this is nontrivial, needs more discussion, and I'm afraid should
> > be per-arch. Even on x86 (which can check the stack) this is not simple,
> > in general we can't know how to check that (to simplify) the first frame
> > is already invalid. Just for example, we could check regs->sp and detect
> > that longjmp() was called but sigaltstack() can easily fool this logic.
> >

Yes, it's perfectly fine to keep this logic for the next patchset.
Can you tell me why sigaltstack() can fool us if we rely on regs->sp?
I should admit that I am not too familiar with sigaltstack.

> > Or we can change prepare_uretprobe() to alloc the new slot for the
> > trampoline every time (and mark it as "trampoline" for handle_swbp() of
> > course), this way we can easily discard the invalid ret_instance's in
> > handler_uretprobe(). But this doesn't solve all problems and this is
> > not really nice/simple.
> >
> > In short. I think we should document the limitations more clearly, push
> > this functionality, then try to improve things. Do you agree?
>
> IOW. Will you agree with v2 below?
>
> Changes:
>
> - s/handler_uretprobe/handle_trampoline/
>
> - s/handler_uretprobe_chain/handle_uretprobe_chain/
>
> - add the TODO: comments into the changelog and
> handle_trampoline().
>
>
> -------------------------------------------------------------------------------
> [PATCH v2] uretprobes: Return probe exit, invoke handlers
>
> Uretprobe handlers are invoked when the trampoline is hit; on completion,
> the trampoline is replaced with the saved return address and the uretprobe
> instance is deleted.
>
> TODO: handle_trampoline() assumes that ->return_instances is always valid.
> We should teach it to handle longjmp(), which can invalidate the pending
> return_instance's. This is nontrivial; we will try to do this in a separate
> series.
>
> Signed-off-by: Anton Arapov <[email protected]>
> Signed-off-by: Oleg Nesterov <[email protected]>

Acked-by: Srikar Dronamraju <[email protected]>

> ---
> kernel/events/uprobes.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 64 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index e352e18..b9c4325 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1633,6 +1633,62 @@ static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> up_read(&uprobe->register_rwsem);
> }
>
> +static void
> +handle_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
> +{
> + struct uprobe *uprobe = ri->uprobe;
> + struct uprobe_consumer *uc;
> +
> + down_read(&uprobe->register_rwsem);
> + for (uc = uprobe->consumers; uc; uc = uc->next) {
> + if (uc->ret_handler)
> + uc->ret_handler(uc, ri->func, regs);
> + }
> + up_read(&uprobe->register_rwsem);
> +}
> +
> +static bool handle_trampoline(struct pt_regs *regs)
> +{
> + struct uprobe_task *utask;
> + struct return_instance *ri, *tmp;
> + bool chained;
> +
> + utask = current->utask;
> + if (!utask)
> + return false;
> +
> + ri = utask->return_instances;
> + if (!ri)
> + return false;
> +
> + /*
> + * TODO: we should throw out return_instance's invalidated by
> + * longjmp(), currently we assume that the probed function always
> + * returns.
> + */
> + instruction_pointer_set(regs, ri->orig_ret_vaddr);
> +
> + for (;;) {
> + handle_uretprobe_chain(ri, regs);
> +
> + chained = ri->chained;
> + put_uprobe(ri->uprobe);
> +
> + tmp = ri;
> + ri = ri->next;
> + kfree(tmp);
> +
> + if (!chained)
> + break;
> +
> + BUG_ON(!ri);
> + }
> +
> + utask->return_instances = ri;
> +
> + return true;
> +}
> +
> /*
> * Run handler and ask thread to singlestep.
> * Ensure all non-fatal signals cannot interrupt thread while it singlesteps.
> @@ -1644,8 +1700,15 @@ static void handle_swbp(struct pt_regs *regs)
> int uninitialized_var(is_swbp);
>
> bp_vaddr = uprobe_get_swbp_addr(regs);
> - uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> + if (bp_vaddr == get_trampoline_vaddr()) {
> + if (handle_trampoline(regs))
> + return;
>
> + pr_warn("uprobe: unable to handle uretprobe pid/tgid=%d/%d\n",
> + current->pid, current->tgid);
> + }
> +
> + uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
> if (!uprobe) {
> if (is_swbp > 0) {
> /* No matching uprobe; signal SIGTRAP. */
> --
> 1.5.5.1
>
>

--
Thanks and Regards
Srikar Dronamraju

2013-04-13 16:13:47

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v1 6/9] uretprobes: Return probe exit, invoke handlers

On 04/13, Srikar Dronamraju wrote:
>
> > > Oh yes, this should be documented more explicitly in the changelog of
> > > this patch or 7/9 (which tries to document the limitations but should
> > > be more clear).
> > >
> > > Currently we do not support longjmp() and we assume that the probed
> > > function should do the regular return. We should certainly try to improve
> > > this, but I really think that this should go into the next series.
> > >
> > > Because this is nontrivial, needs more discussion, and I'm afraid should
> > > be per-arch. Even on x86 (which can check the stack) this is not simple,
> > > in general we can't know how to check that (to simplify) the first frame
> > > is already invalid. Just for example, we could check regs->sp and detect
> > > that longjmp() was called but sigaltstack() can easily fool this logic.
> > >
>
> Yes, it's perfectly fine to keep this logic for the next patchset.

OK, great.

> Can you tell me why sigaltstack() can fool us if we rely on regs->sp?

Because we can't simply compare regs->sp and ret_instance->sp and decide
whether we should ignore this ri or not: the task can hit a retprobe, then
take a signal, switch to the altstack and hit another retprobe. I'll write
another email (hopefully with patches) later.
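
A minimal user-space sketch of that scenario (hypothetical, assuming
uretprobes on probed1() and probed2()): the first pending instance
corresponds to a frame on the main stack, the second to a frame on the
alternate stack, so comparing regs->sp values across the two stacks
tells us nothing about which pending instances are stale.

        /* hypothetical example; assumes uretprobes on probed1()/probed2() */
        #include <signal.h>
        #include <stdlib.h>

        static void probed2(void) { }           /* second hit, on the altstack */

        static void handler(int sig)
        {
                probed2();
        }

        static void probed1(void)
        {
                raise(SIGUSR1);                 /* taken while probed1() is still pending */
        }

        int main(void)
        {
                stack_t ss = { .ss_sp = malloc(SIGSTKSZ), .ss_size = SIGSTKSZ };
                struct sigaction sa = { .sa_handler = handler, .sa_flags = SA_ONSTACK };

                sigaltstack(&ss, NULL);
                sigaction(SIGUSR1, &sa, NULL);
                probed1();                      /* first hit, on the main stack */
                return 0;
        }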

> Acked-by: Srikar Dronamraju <[email protected]>

Thanks Srikar.

OK. Everything is acked, I'll send git-pull-request.

Oleg.