From: Yonghong Song
Subject: [PATCH x86 v2] uprobe: emulate push insns for uprobe on x86
Date: Wed, 8 Nov 2017 16:54:33 -0800
Message-ID: <20171109005433.2289587-1-yhs@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Uprobe is a tracing mechanism for userspace programs.
A typical uprobe incurs the overhead of two traps.
The first trap is caused by the trap insn that replaces
the original insn, and the second trap is needed to execute
the original displaced insn in user space.

To reduce the overhead, the kernel provides hooks
for architectures to emulate the original insn
and skip the second trap. On x86, emulation
is currently done for certain branch insns.

This patch extends the emulation to "push "
insns. These insns are typical at the beginning
of a function. For example, bcc
in the https://github.com/iovisor/bcc repo provides
tools to measure function latency (funclatency),
detect memory leaks (memleak), etc.
The tools place uprobes at the beginning of a
function and possibly uretprobes at the end of the function.
This patch reduces the number of traps per
uprobe hit from 2 to 1.

Without this patch, a uretprobe typically incurs
three traps. With this patch, if the function starts
with a "push" insn, the number of traps is
reduced from 3 to 2.

An experiment was conducted on two local VMs,
a Fedora 26 64-bit VM and a 32-bit VM, both with 4 processors
and 4GB of memory, booted with the latest net-next (and this patch).
The host is a MacBook with an Intel i7 processor.

The test program looks like
  #include
  #include
  #include
  #include

  static void test() __attribute__((noinline));
  void test() {}
  int main() {
    struct timeval start, end;

    gettimeofday(&start, NULL);
    for (int i = 0; i < 1000000; i++) {
      test();
    }
    gettimeofday(&end, NULL);

    printf("%ld\n", ((end.tv_sec * 1000000 + end.tv_usec)
                     - (start.tv_sec * 1000000 + start.tv_usec)));
    return 0;
  }

The program is compiled without optimization, and
the first insn for function "test" is "push %rbp".
The host is relatively idle.

Before the test run, the uprobe is inserted as below for uprobe:
  echo 'p :' > /sys/kernel/debug/tracing/uprobe_events
  echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
and for uretprobe:
  echo 'r :' > /sys/kernel/debug/tracing/uprobe_events
  echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable

Unit: microsecond(usec) per loop iteration

x86_64          W/ this patch   W/O this patch
uprobe              1.55            3.1
uretprobe           2.0             3.6

x86_32          W/ this patch   W/O this patch
uprobe              1.41            3.5
uretprobe           1.75            4.0

You can see that this patch significantly reduced the overhead,
50% for uprobe and 44% for uretprobe on x86_64, and even more
on x86_32.

Signed-off-by: Yonghong Song
---
 arch/x86/include/asm/uprobes.h |  10 ++++
 arch/x86/kernel/uprobes.c      | 115 +++++++++++++++++++++++++++++++++++------
 2 files changed, 109 insertions(+), 16 deletions(-)

Changelog:
  v1 -> v2:
    . Make commit subject more appropriate

diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
index 74f4c2f..f9d2b43 100644
--- a/arch/x86/include/asm/uprobes.h
+++ b/arch/x86/include/asm/uprobes.h
@@ -33,6 +33,11 @@ typedef u8 uprobe_opcode_t;
 #define UPROBE_SWBP_INSN		0xcc
 #define UPROBE_SWBP_INSN_SIZE		   1

+enum uprobe_insn_t {
+	UPROBE_BRANCH_INSN	= 0,
+	UPROBE_PUSH_INSN	= 1,
+};
+
 struct uprobe_xol_ops;

 struct arch_uprobe {
@@ -42,6 +47,7 @@ struct arch_uprobe {
 	};

 	const struct uprobe_xol_ops	*ops;
+	enum uprobe_insn_t		insn_class;

 	union {
 		struct {
@@ -53,6 +59,10 @@ struct arch_uprobe {
 			u8	fixups;
 			u8	ilen;
 		}			defparam;
+		struct {
+			u8	rex_prefix;
+			u8	opc1;
+		}			push;
 	};
 };

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index a3755d2..5ace65c 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -640,11 +640,71 @@ static bool check_jmp_cond(struct arch_uprobe *auprobe, struct pt_regs *regs)
 #undef	COND
 #undef	CASE_COND

-static bool branch_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
+static unsigned long *get_push_reg_ptr(struct arch_uprobe *auprobe,
+				       struct pt_regs *regs)
 {
-	unsigned long new_ip = regs->ip += auprobe->branch.ilen;
-	unsigned long offs = (long)auprobe->branch.offs;
+#if defined(CONFIG_X86_64)
+	switch (auprobe->push.opc1) {
+	case 0x50:
+		return auprobe->push.rex_prefix ? &regs->r8 : &regs->ax;
+	case 0x51:
+		return auprobe->push.rex_prefix ? &regs->r9 : &regs->cx;
+	case 0x52:
+		return auprobe->push.rex_prefix ? &regs->r10 : &regs->dx;
+	case 0x53:
+		return auprobe->push.rex_prefix ? &regs->r11 : &regs->bx;
+	case 0x54:
+		return auprobe->push.rex_prefix ? &regs->r12 : &regs->sp;
+	case 0x55:
+		return auprobe->push.rex_prefix ? &regs->r13 : &regs->bp;
+	case 0x56:
+		return auprobe->push.rex_prefix ? &regs->r14 : &regs->si;
+	}
+
+	/* opc1 0x57 */
+	return auprobe->push.rex_prefix ? &regs->r15 : &regs->di;
+#else
+	switch (auprobe->push.opc1) {
+	case 0x50:
+		return &regs->ax;
+	case 0x51:
+		return &regs->cx;
+	case 0x52:
+		return &regs->dx;
+	case 0x53:
+		return &regs->bx;
+	case 0x54:
+		return &regs->sp;
+	case 0x55:
+		return &regs->bp;
+	case 0x56:
+		return &regs->si;
+	}
+	/* opc1 0x57 */
+	return &regs->di;
+#endif
+}
+
+static bool sstep_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	int reg_width, insn_class = auprobe->insn_class;
+	unsigned long *src_ptr, new_ip, offs, sp;
+
+	if (insn_class == UPROBE_PUSH_INSN) {
+		src_ptr = get_push_reg_ptr(auprobe, regs);
+		reg_width = sizeof_long();
+		sp = regs->sp;
+		if (copy_to_user((void __user *)(sp - reg_width), src_ptr, reg_width))
+			return false;
+
+		regs->sp = sp - reg_width;
+		regs->ip += 1 + (auprobe->push.rex_prefix != 0);
+		return true;
+	}
+
+	new_ip = regs->ip += auprobe->branch.ilen;
+	offs = (long)auprobe->branch.offs;

 	if (branch_is_call(auprobe)) {
 		/*
 		 * If it fails we execute this (mangled, see the comment in
@@ -665,14 +725,18 @@ static bool branch_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	return true;
 }

-static int branch_post_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
+static int sstep_post_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 {
-	BUG_ON(!branch_is_call(auprobe));
+	BUG_ON(auprobe->insn_class != UPROBE_PUSH_INSN &&
+		!branch_is_call(auprobe));
 	/*
-	 * We can only get here if branch_emulate_op() failed to push the ret
-	 * address _and_ another thread expanded our stack before the (mangled)
-	 * "call" insn was executed out-of-line. Just restore ->sp and restart.
-	 * We could also restore ->ip and try to call branch_emulate_op() again.
+	 * We can only get here if
+	 * - for push operation, sstep_emulate_op() failed to push the stack, or
+	 * - for branch operation, sstep_emulate_op() failed to push the ret address
+	 *   _and_ another thread expanded our stack before the (mangled)
+	 *   "call" insn was executed out-of-line.
+	 * Just restore ->sp and restart. We could also restore ->ip and try to
+	 * call sstep_emulate_op() again.
 	 */
 	regs->sp += sizeof_long();
 	return -ERESTART;
@@ -698,17 +762,18 @@ static void branch_clear_offset(struct arch_uprobe *auprobe, struct insn *insn)
 		0, insn->immediate.nbytes);
 }

-static const struct uprobe_xol_ops branch_xol_ops = {
-	.emulate  = branch_emulate_op,
-	.post_xol = branch_post_xol_op,
+static const struct uprobe_xol_ops sstep_xol_ops = {
+	.emulate  = sstep_emulate_op,
+	.post_xol = sstep_post_xol_op,
 };

-/* Returns -ENOSYS if branch_xol_ops doesn't handle this insn */
-static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
+/* Returns -ENOSYS if sstep_xol_ops doesn't handle this insn */
+static int sstep_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 {
 	u8 opc1 = OPCODE1(insn);
 	int i;

+	auprobe->insn_class = UPROBE_BRANCH_INSN;
 	switch (opc1) {
 	case 0xeb:	/* jmp 8 */
 	case 0xe9:	/* jmp 32 */
@@ -719,6 +784,23 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 		branch_clear_offset(auprobe, insn);
 		break;

+	case 0x50 ... 0x57:
+		if (insn->length > 2)
+			return -ENOSYS;
+		if (insn->length == 2) {
+			/* only support rex_prefix 0x41 (x64 only) */
+			if (insn->rex_prefix.nbytes != 1 ||
+			    insn->rex_prefix.bytes[0] != 0x41)
+				return -ENOSYS;
+			auprobe->push.rex_prefix = 0x41;
+		} else {
+			auprobe->push.rex_prefix = 0;
+		}
+
+		auprobe->insn_class = UPROBE_PUSH_INSN;
+		auprobe->push.opc1 = opc1;
+		goto set_ops;
+
 	case 0x0f:
 		if (insn->opcode.nbytes != 2)
 			return -ENOSYS;
@@ -746,7 +828,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 	auprobe->branch.ilen = insn->length;
 	auprobe->branch.offs = insn->immediate.value;

-	auprobe->ops = &branch_xol_ops;
+set_ops:
+	auprobe->ops = &sstep_xol_ops;
 	return 0;
 }

@@ -767,7 +850,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	if (ret)
 		return ret;

-	ret = branch_setup_xol_ops(auprobe, &insn);
+	ret = sstep_setup_xol_ops(auprobe, &insn);
 	if (ret != -ENOSYS)
 		return ret;

-- 
2.9.5
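
The push emulation in the patch can be tried in isolation with the
user-space sketch below. It is not kernel code and only mirrors the
logic under simplifying assumptions: a 64-bit register file indexed by
the x86-64 register number, a small byte array standing in for the user
stack, and a bounds check standing in for a failing copy_to_user().
All names (toy_regs, toy_stack, toy_decode_push, toy_emulate_push) are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86-64 register numbers: 0=ax 1=cx 2=dx 3=bx 4=sp 5=bp 6=si 7=di, 8..15=r8..r15 */
enum { TOY_SP = 4, TOY_STACK_SIZE = 64 };

/* Hypothetical stand-in for pt_regs; the real structure is laid out differently. */
struct toy_regs {
	uint64_t gpr[16];
	uint64_t ip;
};

/* Toy "user stack"; ->gpr[TOY_SP] is a byte offset into this array. */
static uint8_t toy_stack[TOY_STACK_SIZE];

/*
 * Accept only opcodes 0x50..0x57, optionally preceded by the single
 * REX prefix 0x41, as the patch does. Returns the register number
 * (0..15) and stores the insn length in *ilen, or returns -1.
 */
static int toy_decode_push(const uint8_t *insn, size_t len, size_t *ilen)
{
	size_t pos = 0;
	int rex_b = 0;

	assert(insn != NULL && ilen != NULL);
	if (len >= 2 && insn[0] == 0x41) {
		rex_b = 1;
		pos = 1;
	}
	if (pos >= len || insn[pos] < 0x50 || insn[pos] > 0x57)
		return -1;

	*ilen = pos + 1;
	return (insn[pos] - 0x50) + (rex_b ? 8 : 0);
}

/* Returns 0 if the push was emulated, -1 if it must be single-stepped instead. */
static int toy_emulate_push(struct toy_regs *regs, const uint8_t *insn, size_t len)
{
	size_t ilen;
	uint64_t sp, val;
	int reg = toy_decode_push(insn, len, &ilen);

	if (reg < 0)
		return -1;

	sp = regs->gpr[TOY_SP];
	/* Stand-in for copy_to_user() failing on an unmapped stack page. */
	if (sp < sizeof(val) || sp > (uint64_t)TOY_STACK_SIZE)
		return -1;

	/* Read the source before updating sp, so "push %rsp" stores the old sp. */
	val = regs->gpr[reg];
	memcpy(&toy_stack[sp - sizeof(val)], &val, sizeof(val));

	regs->gpr[TOY_SP] = sp - sizeof(val);
	regs->ip += ilen;	/* 1 byte, or 2 with the REX prefix */
	return 0;
}
```

With sp starting at the top of the 64-byte toy stack, emulating the byte
0x55 ("push %rbp") followed by 0x41 0x55 ("push %r13") should leave sp
16 bytes lower and ip advanced by 3, with the value of r13 on top of the
stack. Any other opcode is rejected and leaves the registers untouched,
which corresponds to the -ENOSYS path in the patch.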