From: Masami Hiramatsu
To: Steven Rostedt, Linus Torvalds
Cc: mhiramat@kernel.org, linux-kernel@vger.kernel.org, Andy Lutomirski, Ingo Molnar, Andrew Morton, Changbin Du, Jann Horn, Kees Cook, Andy Lutomirski, Alexei Starovoitov, Nadav Amit, Peter Zijlstra
Subject: [RFC PATCH 4/4] tracing/probe: Support user-space dereference
Date: Mon,
 25 Feb 2019 23:06:39 +0900
Message-Id: <155110359954.21156.13056070760380754286.stgit@devbox>
In-Reply-To: <155110348217.21156.3874419272673328527.stgit@devbox>
References: <155110348217.21156.3874419272673328527.stgit@devbox>
X-Mailing-List: linux-kernel@vger.kernel.org

Support user-space dereference syntax for probe event arguments, so that
a probe can dereference a data structure or an array in user space.

The syntax just adds a 'u' before the offset value:

 +|-uOFFS(FETCHARG)

e.g. +u8(%ax), +u0(+0(%si))

For example, if you probe do_sched_setscheduler(pid, policy, param) and
want to record param->sched_priority, you can add a new probe as below:

 p do_sched_setscheduler priority=+u0($arg3)

Note that kprobe events provide this syntax but do not switch the
dereference method automatically, because on some architectures we
cannot tell whether a given address is in user space or kernel space.
So, as with "ustring", this is an option for the user, who has to
choose the dereference method carefully.

Signed-off-by: Masami Hiramatsu
---
 Documentation/trace/kprobetrace.rst  |    4 +++-
 Documentation/trace/uprobetracer.rst |    9 +++++----
 kernel/trace/trace.c                 |    5 +++--
 kernel/trace/trace_kprobe.c          |    6 ++++++
 kernel/trace/trace_probe.c           |   27 +++++++++++++++++++++------
 kernel/trace/trace_probe.h           |    2 ++
 kernel/trace/trace_probe_tmpl.h      |   22 +++++++++++++++++-----
 kernel/trace/trace_uprobe.c          |    7 +++++++
 8 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
index a3ac7c9ac242..036d8c5ba18c 100644
--- a/Documentation/trace/kprobetrace.rst
+++ b/Documentation/trace/kprobetrace.rst
@@ -51,7 +51,7 @@ Synopsis of kprobe_events
   $argN		: Fetch the Nth function argument. (N >= 1) (\*1)
   $retval	: Fetch return value.(\*2)
   $comm		: Fetch current task comm.
-  +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(\*3)
+  +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4)
   NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
   FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
 		  (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
@@ -61,6 +61,8 @@ Synopsis of kprobe_events
   (\*1) only for the probe on function entry (offs == 0).
   (\*2) only for return probe.
   (\*3) this is useful for fetching a field of data structures.
+  (\*4) "u" means user-space dereference. To access a data structure in
+        user space, you have to use this "u" prefix. (e.g. +u0(%si))
 
 Types
 -----
diff --git a/Documentation/trace/uprobetracer.rst b/Documentation/trace/uprobetracer.rst
index 4c3bfde2ba47..6144423b2368 100644
--- a/Documentation/trace/uprobetracer.rst
+++ b/Documentation/trace/uprobetracer.rst
@@ -42,16 +42,17 @@ Synopsis of uprobe_tracer
    @+OFFSET      : Fetch memory at OFFSET (OFFSET from same file as PATH)
    $stackN       : Fetch Nth entry of stack (N >= 0)
    $stack        : Fetch stack address.
-   $retval       : Fetch return value.(*)
+   $retval       : Fetch return value.(\*1)
    $comm         : Fetch current task comm.
-   +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
+   +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*2)(\*3)
    NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
    FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
 		       (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
 		       (x8/x16/x32/x64), "string" and bitfield are supported.
 
-  (*) only for return probe.
-  (**) this is useful for fetching a field of data structures.
+  (\*1) only for return probe.
+  (\*2) this is useful for fetching a field of data structures.
+  (\*3) Unlike kprobe events, the "u" prefix is simply ignored.
 
 Types
 -----
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4cacbb0e1538..5408a82a015d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4638,10 +4638,11 @@ static const char readme_msg[] =
 	"\t     args: =fetcharg[:type]\n"
 	"\t fetcharg: %, @, @[+|-],\n"
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
-	"\t           $stack, $stack, $retval, $comm, $arg\n"
+	"\t           $stack, $stack, $retval, $comm, $arg,\n"
 #else
-	"\t           $stack, $stack, $retval, $comm\n"
+	"\t           $stack, $stack, $retval, $comm,\n"
 #endif
+	"\t           +|-[u]()\n"
 	"\t     type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string, symbol,\n"
 	"\t           b@/, ustring,\n"
 	"\t           \\[\\]\n"
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d50d937b6933..e650b9cc5fbd 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -961,6 +961,12 @@ probe_mem_read(void *dest, void *src, size_t size)
 	return probe_kernel_read(dest, src, size);
 }
 
+static nokprobe_inline int
+probe_mem_read_user(void *dest, void *src, size_t size)
+{
+	return probe_user_read(dest, src, size);
+}
+
 /* Note that we don't verify it, since the code does not come from user space */
 static int
 process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index a7012de37a00..0efef172db17 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -239,6 +239,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
 {
 	struct fetch_insn *code = *pcode;
 	unsigned long param;
+	int deref = FETCH_OP_DEREF;
 	long offset = 0;
 	char *tmp;
 	int ret = 0;
@@ -301,8 +302,17 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
 		break;
 
 	case '+':	/* deref memory */
-		arg++;	/* Skip '+', because kstrtol() rejects it. */
 	case '-':
+		if (arg[0] == '+') {
+			arg++;	/* Skip '+', because kstrtol() rejects it. */
+			if (arg[0] == 'u') {
+				deref = FETCH_OP_UDEREF;
+				arg++;
+			}
+		} else if (arg[1] == 'u') {	/* Start with "-u" */
+			deref = FETCH_OP_UDEREF;
+			*(++arg) = '-';
+		}
 		tmp = strchr(arg, '(');
 		if (!tmp)
 			return -EINVAL;
@@ -328,7 +338,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
 				return -E2BIG;
 			*pcode = code;
 
-			code->op = FETCH_OP_DEREF;
+			code->op = deref;
 			code->offset = offset;
 		}
 		break;
@@ -444,13 +454,14 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 	/* Store operation */
 	if (!strcmp(parg->type->name, "string") ||
 	    !strcmp(parg->type->name, "ustring")) {
-		if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_IMM &&
-		    code->op != FETCH_OP_COMM) {
+		if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF
+		    && code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM) {
 			pr_info("string only accepts memory or address.\n");
 			ret = -EINVAL;
 			goto fail;
 		}
-		if (code->op != FETCH_OP_DEREF || parg->count) {
+		if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM)
+		    || parg->count) {
 			/*
 			 * IMM and COMM is pointing actual address, those must
 			 * be kept, and if parg->count != 0, this is an array
@@ -463,7 +474,8 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 			}
 		}
 		/* If op == DEREF, replace it with STRING */
-		if (!strcmp(parg->type->name, "ustring"))
+		if (!strcmp(parg->type->name, "ustring") ||
+		    code->op == FETCH_OP_UDEREF)
 			code->op = FETCH_OP_ST_USTRING;
 		else
 			code->op = FETCH_OP_ST_STRING;
@@ -472,6 +484,9 @@ static int traceprobe_parse_probe_arg_body(char *arg, ssize_t *size,
 	} else if (code->op == FETCH_OP_DEREF) {
 		code->op = FETCH_OP_ST_MEM;
 		code->size = parg->type->size;
+	} else if (code->op == FETCH_OP_UDEREF) {
+		code->op = FETCH_OP_ST_UMEM;
+		code->size = parg->type->size;
 	} else {
 		code++;
 		if (code->op != FETCH_OP_NOP) {
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index cf4ba8bbb841..a5e8b2ac2c97 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -91,9 +91,11 @@ enum fetch_op {
 	FETCH_OP_FOFFS,		/* File offset: .immediate */
 	// Stage 2 (dereference) op
 	FETCH_OP_DEREF,		/* Dereference: .offset */
+	FETCH_OP_UDEREF,	/* User-space Dereference: .offset */
 	// Stage 3 (store) ops
 	FETCH_OP_ST_RAW,	/* Raw: .size */
 	FETCH_OP_ST_MEM,	/* Mem: .offset, .size */
+	FETCH_OP_ST_UMEM,	/* User Mem: .offset, .size */
 	FETCH_OP_ST_STRING,	/* String: .offset, .size */
 	FETCH_OP_ST_USTRING,	/* User String: .offset, .size */
 	// Stage 4 (modify) op
diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h
index 7526f6f8d7b0..a1b58ccdba9a 100644
--- a/kernel/trace/trace_probe_tmpl.h
+++ b/kernel/trace/trace_probe_tmpl.h
@@ -64,6 +64,8 @@ static nokprobe_inline int
 fetch_store_string_user(unsigned long addr, void *dest, void *base);
 static nokprobe_inline int
 probe_mem_read(void *dest, void *src, size_t size);
+static nokprobe_inline int
+probe_mem_read_user(void *dest, void *src, size_t size);
 
 /* From the 2nd stage, routine is same */
 static nokprobe_inline int
@@ -77,14 +79,21 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
 
 stage2:
 	/* 2nd stage: dereference memory if needed */
-	while (code->op == FETCH_OP_DEREF) {
-		lval = val;
-		ret = probe_mem_read(&val, (void *)val + code->offset,
-					sizeof(val));
+	do {
+		if (code->op == FETCH_OP_DEREF) {
+			lval = val;
+			ret = probe_mem_read(&val, (void *)val + code->offset,
+					     sizeof(val));
+		} else if (code->op == FETCH_OP_UDEREF) {
+			lval = val;
+			ret = probe_mem_read_user(&val,
+				 (void *)val + code->offset, sizeof(val));
+		} else
+			break;
 		if (ret)
 			return ret;
 		code++;
-	}
+	} while (1);
 
 	s3 = code;
 stage3:
@@ -109,6 +118,9 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
 	case FETCH_OP_ST_MEM:
 		probe_mem_read(dest, (void *)val + code->offset, code->size);
 		break;
+	case FETCH_OP_ST_UMEM:
+		probe_mem_read_user(dest, (void *)val + code->offset, code->size);
+		break;
 	case FETCH_OP_ST_STRING:
 		loc = *(u32 *)dest;
 		ret = fetch_store_string(val + code->offset, dest, base);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 92facae8c3d8..a86afc9e2a6a 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -140,6 +140,13 @@ probe_mem_read(void *dest, void *src, size_t size)
 
 	return copy_from_user(dest, vaddr, size) ? -EFAULT : 0;
 }
+
+static nokprobe_inline int
+probe_mem_read_user(void *dest, void *src, size_t size)
+{
+	return probe_mem_read(dest, src, size);
+}
+
 /*
  * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
  * length and relative data location.
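
For readers who want to experiment with the "+|-[u]OFFS(FETCHARG)" prefix
handling outside the kernel, here is a minimal userspace sketch. It is an
illustration only and not part of the patch: struct deref_spec and
parse_deref() are hypothetical names, strtol() stands in for the kernel's
kstrtol(), and unlike parse_probe_arg() this version does not rewrite the
input buffer to handle the "-u" case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the patch keeps this state in struct fetch_insn. */
struct deref_spec {
	bool user;		/* true if the 'u' prefix was given */
	long offset;		/* signed offset added to the fetched address */
	const char *inner;	/* nested FETCHARG, starting right after '(' */
};

/*
 * Parse "+|-[u]OFFS(FETCHARG)" without modifying the input string.
 * Returns 0 on success or -EINVAL on malformed input.
 */
static int parse_deref(const char *arg, struct deref_spec *out)
{
	bool negative;
	char *end;
	long val;

	assert(arg != NULL && out != NULL);

	/* A dereference always starts with an explicit sign. */
	if (arg[0] != '+' && arg[0] != '-')
		return -EINVAL;
	negative = (arg[0] == '-');
	arg++;

	/* Optional 'u' selects the user-space read path. */
	out->user = (arg[0] == 'u');
	if (out->user)
		arg++;

	/* Require a digit here: strtol() would accept a second sign or spaces. */
	if (arg[0] < '0' || arg[0] > '9')
		return -EINVAL;

	errno = 0;
	val = strtol(arg, &end, 0);
	if (errno != 0 || *end != '(')
		return -EINVAL;

	out->offset = negative ? -val : val;
	out->inner = end + 1;
	return 0;
}
```

With this sketch, parse_deref("+u0(+0(%si))", &spec) reports a user-space
dereference at offset 0 and leaves spec.inner pointing at "+0(%si))", which
a caller could parse recursively in the same way the patch recurses into
the nested FETCHARG.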