From: Ravi Bangoria
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    alexander.shishkin@linux.intel.com, jolsa@redhat.com,
    namhyung@kernel.org, linux-kernel@vger.kernel.org,
    rostedt@goodmis.org, mhiramat@kernel.org,
    ananth@linux.vnet.ibm.com, naveen.n.rao@linux.vnet.ibm.com,
    srikar@linux.vnet.ibm.com, oleg@redhat.com
Cc: Ravi Bangoria
Subject: [RFC 3/4] trace_uprobe: Support SDT markers having semaphore
Date: Wed, 28 Feb 2018 13:23:44 +0530
Message-Id: <20180228075345.674-4-ravi.bangoria@linux.vnet.ibm.com>
In-Reply-To: <20180228075345.674-1-ravi.bangoria@linux.vnet.ibm.com>
References: <20180228075345.674-1-ravi.bangoria@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.14.3
X-Mailing-List: linux-kernel@vger.kernel.org

Userspace Statically Defined Tracepoints[1] are dtrace-style markers
inside userspace applications. Developers add these markers at
important places in the code. Each marker expands to a single nop
instruction in the compiled code, but computing the marker arguments
may add the overhead of a couple of instructions. If this computation
is expensive, it can be skipped at runtime with an if() condition when
no one is tracing the marker:

    if (semaphore > 0) {
        Execute marker instructions;
    }

The default value of the semaphore is 0. A tracer has to increment the
semaphore before recording on a marker and decrement it at the end of
tracing.

Implement the semaphore flip logic in trace_uprobe, leaving the core
uprobe infrastructure as is, except for one new callback from
uprobe_mmap() to trace_uprobe.

There are two major scenarios when enabling the marker:

 1. Tracing an already running process. In this case, find all
    suitable processes and increment the semaphore value in them.
 2. Tracing is already enabled when the target binary is executed. In
    this case, every mmap is notified to trace_uprobe, and trace_uprobe
    increments the semaphore if the corresponding uprobe is enabled.

When disabling probes, decrement the semaphore in all existing target
processes.

Signed-off-by: Ravi Bangoria
---
 include/linux/uprobes.h     |   2 +
 kernel/events/uprobes.c     |   5 ++
 kernel/trace/trace_uprobe.c | 145 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 152 insertions(+)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 06c169e..04e9d57 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -121,6 +121,8 @@ struct uprobe_map_info {
 	unsigned long vaddr;
 };
 
+extern void (*uprobe_mmap_callback)(struct vm_area_struct *vma);
+
 extern int set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr);
 extern int set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr);
 extern bool is_swbp_insn(uprobe_opcode_t *insn);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 56dd7af..81d8aaf 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1051,6 +1051,8 @@ static void build_probe_list(struct inode *inode,
 	spin_unlock(&uprobes_treelock);
 }
 
+void (*uprobe_mmap_callback)(struct vm_area_struct *vma) = NULL;
+
 /*
  * Called from mmap_region/vma_adjust with mm->mmap_sem acquired.
  *
@@ -1063,6 +1065,9 @@ int uprobe_mmap(struct vm_area_struct *vma)
 	struct uprobe *uprobe, *u;
 	struct inode *inode;
 
+	if (vma->vm_flags & VM_WRITE && uprobe_mmap_callback)
+		uprobe_mmap_callback(vma);
+
 	if (no_uprobe_events() || !valid_vma(vma, true))
 		return 0;
 
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 40592e7b..d14aafc 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 
 #include "trace_probe.h"
 
@@ -58,6 +59,7 @@ struct trace_uprobe {
 	struct inode			*inode;
 	char				*filename;
 	unsigned long			offset;
+	unsigned long			sdt_offset; /* sdt semaphore offset */
 	unsigned long			nhit;
 	struct trace_probe		tp;
 };
@@ -502,6 +504,16 @@ static int create_trace_uprobe(int argc, char **argv)
 	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
 		struct probe_arg *parg = &tu->tp.args[i];
 
+		/* This is not really an argument. */
+		if (argv[i][0] == '*') {
+			ret = kstrtoul(&(argv[i][1]), 0, &tu->sdt_offset);
+			if (ret) {
+				pr_info("Invalid semaphore address.\n");
+				goto error;
+			}
+			continue;
+		}
+
 		/* Increment count for freeing args in error case */
 		tu->tp.nr_args++;
 
@@ -894,6 +906,131 @@ static void uretprobe_trace_func(struct trace_uprobe *tu, unsigned long func,
 	return trace_handle_return(s);
 }
 
+static bool sdt_valid_vma(struct trace_uprobe *tu, struct vm_area_struct *vma)
+{
+	unsigned long vaddr = offset_to_vaddr(vma, tu->sdt_offset);
+
+	return tu->sdt_offset &&
+		vma->vm_file &&
+		file_inode(vma->vm_file) == tu->inode &&
+		vma->vm_flags & VM_WRITE &&
+		vma->vm_start <= vaddr &&
+		vma->vm_end > vaddr;
+}
+
+static struct vm_area_struct *
+sdt_find_vma(struct mm_struct *mm, struct trace_uprobe *tu)
+{
+	struct vm_area_struct *tmp;
+
+	for (tmp = mm->mmap; tmp != NULL; tmp = tmp->vm_next)
+		if (sdt_valid_vma(tu, tmp))
+			return tmp;
+
+	return NULL;
+}
+
+static int
+sdt_update_sem(struct mm_struct *mm, unsigned long vaddr, short val)
+{
+	struct page *page;
+	struct vm_area_struct *vma;
+	int ret = 0;
+	unsigned short orig = 0;
+
+	if (vaddr == 0)
+		return -EINVAL;
+
+	ret = get_user_pages_remote(NULL, mm, vaddr, 1,
+		FOLL_FORCE | FOLL_WRITE, &page, &vma, NULL);
+	if (ret <= 0)
+		return ret;
+
+	copy_from_page(page, vaddr, &orig, sizeof(orig));
+	orig += val;
+	copy_to_page(page, vaddr, &orig, sizeof(orig));
+	put_page(page);
+	return 0;
+}
+
+/*
+ * TODO: Adding this definition in include/linux/uprobes.h throws
+ * warnings about address_space. Adding it here for the time being.
+ */
+struct uprobe_map_info *build_uprobe_map_info(struct address_space *mapping, loff_t offset, bool is_register);
+
+static void sdt_increment_sem(struct trace_uprobe *tu)
+{
+	struct uprobe_map_info *info;
+	struct vm_area_struct *vma;
+	unsigned long vaddr;
+
+	uprobe_start_dup_mmap();
+	info = build_uprobe_map_info(tu->inode->i_mapping, tu->sdt_offset, false);
+	if (IS_ERR(info))
+		goto out;
+
+	while (info) {
+		down_write(&info->mm->mmap_sem);
+		vma = sdt_find_vma(info->mm, tu);
+		if (!vma)
+			goto cont;
+
+		vaddr = offset_to_vaddr(vma, tu->sdt_offset);
+		sdt_update_sem(info->mm, vaddr, 1);
+
+cont:
+		up_write(&info->mm->mmap_sem);
+		mmput(info->mm);
+		info = free_uprobe_map_info(info);
+	}
+
+out:
+	uprobe_end_dup_mmap();
+}
+
+/* Called with down_write(&vma->vm_mm->mmap_sem) */
+void trace_uprobe_mmap_callback(struct vm_area_struct *vma)
+{
+	struct trace_uprobe *tu;
+	unsigned long vaddr;
+
+	mutex_lock(&uprobe_lock);
+	list_for_each_entry(tu, &uprobe_list, list) {
+		if (!sdt_valid_vma(tu, vma) ||
+		    !trace_probe_is_enabled(&tu->tp))
+			continue;
+
+		vaddr = offset_to_vaddr(vma, tu->sdt_offset);
+		sdt_update_sem(vma->vm_mm, vaddr, 1);
+	}
+	mutex_unlock(&uprobe_lock);
+}
+
+static void sdt_decrement_sem(struct trace_uprobe *tu)
+{
+	struct vm_area_struct *vma;
+	unsigned long vaddr;
+	struct uprobe_map_info *info;
+
+	info = build_uprobe_map_info(tu->inode->i_mapping, tu->sdt_offset, false);
+	if (IS_ERR(info))
+		return;
+
+	while (info) {
+		down_write(&info->mm->mmap_sem);
+		vma = sdt_find_vma(info->mm, tu);
+		if (vma) {
+			vaddr = offset_to_vaddr(vma, tu->sdt_offset);
+			sdt_update_sem(info->mm, vaddr, -1);
+		}
+		up_write(&info->mm->mmap_sem);
+
+		mmput(info->mm);
+		info = free_uprobe_map_info(info);
+	}
+}
+
 typedef bool (*filter_func_t)(struct uprobe_consumer *self,
 				enum uprobe_filter_ctx ctx,
 				struct mm_struct *mm);
@@ -939,6 +1076,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer *self,
 	if (ret)
 		goto err_buffer;
 
+	if (tu->sdt_offset)
+		sdt_increment_sem(tu);
+
 	return 0;
 
 err_buffer:
@@ -979,6 +1119,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer *self,
 
 	WARN_ON(!uprobe_filter_is_empty(&tu->filter));
 
+	if (tu->sdt_offset)
+		sdt_decrement_sem(tu);
+
 	uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
 	tu->tp.flags &= file ? ~TP_FLAG_TRACE : ~TP_FLAG_PROFILE;
 
@@ -1353,6 +1496,8 @@ static __init int init_uprobe_trace(void)
 	/* Profile interface */
 	trace_create_file("uprobe_profile", 0444, d_tracer,
 				    NULL, &uprobe_profile_ops);
+
+	uprobe_mmap_callback = trace_uprobe_mmap_callback;
 	return 0;
 }
 
-- 
1.8.3.1
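
The semaphore handling described in the commit message can be modelled
in plain userspace C. This is only a rough sketch, not the kernel code
from this patch. Every model_* name and the probe_enabled flag are
hypothetical and exist only for illustration. It shows the marker being
skipped while the semaphore is 0, the increment for already running
processes, the increment on mmap while the probe is enabled, and the
decrement on disable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; no model_* name exists in the kernel. */
struct model_proc {
	unsigned short sem;	/* models the 16-bit SDT semaphore, default 0 */
	int marker_hits;	/* number of times the marker body ran */
};

/* Models "the corresponding uprobe is enabled". */
static int probe_enabled;

/* Read-modify-write of the counter, as sdt_update_sem() does on a page. */
static void model_update_sem(struct model_proc *p, short val)
{
	p->sem = (unsigned short)(p->sem + val);
}

/* Marker site: the body is skipped while the semaphore is 0. */
static void model_marker(struct model_proc *p)
{
	if (p->sem > 0)
		p->marker_hits++;
}

/* Scenario 1: enable the probe for processes that are already running. */
static void model_enable(struct model_proc *procs, size_t n)
{
	size_t i;

	probe_enabled = 1;
	for (i = 0; i < n; i++)
		model_update_sem(&procs[i], 1);
}

/* Scenario 2: a new process maps the binary; bump only if enabled. */
static void model_mmap(struct model_proc *p)
{
	p->sem = 0;
	p->marker_hits = 0;
	if (probe_enabled)
		model_update_sem(p, 1);
}

/* Disable: decrement the semaphore in every tracked process. */
static void model_disable(struct model_proc *procs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		model_update_sem(&procs[i], -1);
	probe_enabled = 0;
}

/* Walks through both scenarios; returns 0 when every check holds. */
static int model_demo(void)
{
	struct model_proc procs[2] = { { 0, 0 }, { 0, 0 } };
	struct model_proc late;

	model_marker(&procs[0]);		/* not traced: body skipped */
	assert(procs[0].marker_hits == 0);

	model_enable(procs, 2);
	model_marker(&procs[0]);		/* traced: body runs */
	assert(procs[0].sem == 1 && procs[0].marker_hits == 1);

	model_mmap(&late);			/* started after enabling */
	assert(late.sem == 1);

	model_disable(procs, 2);
	assert(procs[0].sem == 0 && procs[1].sem == 0);
	return 0;
}
```

The model leaves out everything that makes the kernel side hard:
locating the writable vma that backs the semaphore, pinning the page
with get_user_pages_remote(), and holding mmap_sem and uprobe_lock
around the update. It only illustrates the counting protocol.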