Date: Tue, 27 Jan 2009 23:43:05 +0100
From: Frederic Weisbecker
To: "Kok, Auke"
Cc: Linux Kernel Mailing List, powertop ml, Arjan van de Ven, Ingo Molnar, srostedt@redhat.com, Arnaldo Carvalho de Melo, "Frank Ch. Eigler", Neil Horman
Subject: Re: [PATCH] tracer for sys_open() - sreadahead
Message-ID: <20090127224303.GB5850@nowhere>
References: <497F69A4.2070007@intel.com>
In-Reply-To: <497F69A4.2070007@intel.com>

On Tue, Jan 27, 2009 at 12:08:04PM -0800, Kok, Auke wrote:
>
> This tracer monitors regular file open() syscalls. This is a fast
> and low-overhead alternative to strace, and does not allow or
> require to be attached to every process.
>
> The tracer only logs succesfull calls, as those are the only ones we
> are currently interested in, and we can determine the absolute path
> of these files as we log.
>
> Signed-off-by: Auke Kok

Hi Auke,

Speaking of a global syscall tracer, I made a patch that traces only the
syscalls with the function-graph-tracer:

http://lkml.org/lkml/2008/12/30/267

Its approach and purpose are different from those of a tracer dedicated only
to syscalls. The function graph tracer traces the execution graph of the
functions and is more about execution time spent and code flow, whereas a
syscall tracer can provide more specific information about syscalls.
So the two do not overlap.

But the low-level part of my patch creates a thread flag, _TIF_SYSCALL_TRACE,
which triggers a ptrace hook when set. This low-level part can easily be used
by all tracers that would like to inspect syscalls.

Just one change is needed: Steven requested that the part inside
syscall_trace_enter become a tracepoint, making it fully shareable between
tracers and easy to turn on and off. And perhaps the parts that set/clear the
flag on all tasks can be shared too.

So we can start with this low-level syscall tracing facility. If you want, I
can adapt this low-level part and submit a patch this week or next week to
give you this base infrastructure.

Once we have it, I think a syscall tracer can be fed with new syscall events
through several patch iterations, starting with the open and close ones :-)

Are you ok with that?
Steven, Ingo, do you agree?

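To make the idea above concrete, here is a minimal userspace sketch of a
per-task flag gating one shared syscall-entry hook that several tracers can
attach probes to. It is only a model under stated assumptions, not the code
from the patch linked above and not kernel code: every name in it
(model_task, MODEL_TIF_SYSCALL_TRACE, MODEL_NR_OPEN, model_register_probe,
model_set_trace_all, model_syscall_enter, count_open_probe) is hypothetical,
and a real implementation would use the kernel tracepoint machinery rather
than a plain array of function pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_TIF_SYSCALL_TRACE	(1u << 0)	/* assumed flag bit */
#define MODEL_NR_OPEN		2		/* assumed syscall number */
#define MODEL_MAX_PROBES	4

struct model_task {
	const char *comm;	/* task name */
	unsigned int flags;	/* per-task flags */
};

typedef void (*model_probe_t)(const struct model_task *t, long nr);

static model_probe_t model_probes[MODEL_MAX_PROBES];
static size_t model_nr_probes;

/* Attach a probe to the shared hook; returns 0 on success, -1 if full. */
static int model_register_probe(model_probe_t probe)
{
	assert(probe != NULL);
	if (model_nr_probes == MODEL_MAX_PROBES)
		return -1;
	model_probes[model_nr_probes++] = probe;
	return 0;
}

/* Set or clear the trace flag on every task in the array. */
static void model_set_trace_all(struct model_task *tasks, size_t n, bool on)
{
	for (size_t i = 0; i < n; i++) {
		if (on)
			tasks[i].flags |= MODEL_TIF_SYSCALL_TRACE;
		else
			tasks[i].flags &= ~MODEL_TIF_SYSCALL_TRACE;
	}
}

/*
 * Stand-in for the syscall entry path: probes run only when the
 * task's flag is set. Returns the number of probes that were called.
 */
static int model_syscall_enter(const struct model_task *t, long nr)
{
	if (!(t->flags & MODEL_TIF_SYSCALL_TRACE))
		return 0;
	for (size_t i = 0; i < model_nr_probes; i++)
		model_probes[i](t, nr);
	return (int)model_nr_probes;
}

/* Example probe: counts only open() events and ignores the rest. */
static int open_events;

static void count_open_probe(const struct model_task *t, long nr)
{
	(void)t;
	if (nr == MODEL_NR_OPEN)
		open_events++;
}
```

The point of the split is that the hook itself knows nothing about open() or
any other syscall: a tracer registers a probe and filters on the syscall
number, and the flag decides per task whether the hook runs at all.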
>
> diff --git a/fs/open.c b/fs/open.c
> index a3a78ce..8cf2a6b 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -30,6 +30,10 @@
>  #include
>  #include
>
> +#include
> +
> +DEFINE_TRACE(do_sys_open);
> +
>  int vfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>  {
>  	int retval = -ENODEV;
> @@ -1040,6 +1044,7 @@ long do_sys_open(int dfd, const char __user *filename, int
> flags, int mode)
>  				fsnotify_open(f->f_path.dentry);
>  				fd_install(fd, f);
>  			}
> +			trace_do_sys_open(f, flags, mode, fd);
>  		}
>  		putname(tmp);
>  	}
> diff --git a/include/trace/fs.h b/include/trace/fs.h
> new file mode 100644
> index 0000000..870eec2
> --- /dev/null
> +++ b/include/trace/fs.h
> @@ -0,0 +1,11 @@
> +#ifndef _TRACE_FS_H
> +#define _TRACE_FS_H
> +
> +#include
> +#include
> +
> +DECLARE_TRACE(do_sys_open,
> +	TPPROTO(struct file *filp, int flags, int mode, long fd),
> +	TPARGS(filp, flags, mode, fd));
> +
> +#endif
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index e2a4ff6..0400815 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -149,6 +149,15 @@ config CONTEXT_SWITCH_TRACER
>  	  This tracer gets called from the context switch and records
>  	  all switching of tasks.
>
> +config OPEN_CLOSE_TRACER
> +	bool "Trace open() calls"
> +	depends on DEBUG_KERNEL
> +	select TRACING
> +	select MARKERS
> +	help
> +	  This tracer records open() syscalls. These calls are made when
> +	  files are accessed on disk.
> +
>  config BOOT_TRACER
>  	bool "Trace boot initcalls"
>  	depends on DEBUG_KERNEL
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index 349d5a9..25cec6c 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -20,6 +20,7 @@ obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
>
>  obj-$(CONFIG_TRACING) += trace.o
>  obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
> +obj-$(CONFIG_OPEN_CLOSE_TRACER) += trace_open_close.o
>  obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
>  obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
>  obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 4d3d381..24c17d2 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -30,6 +30,7 @@ enum trace_type {
>  	TRACE_USER_STACK,
>  	TRACE_HW_BRANCHES,
>  	TRACE_POWER,
> +	TRACE_OPEN,
>
>  	__TRACE_LAST_TYPE
>  };
> diff --git a/kernel/trace/trace_open_close.c b/kernel/trace/trace_open_close.c
> new file mode 100644
> index 0000000..4250efc
> --- /dev/null
> +++ b/kernel/trace/trace_open_close.c
> @@ -0,0 +1,148 @@
> +/*
> + * trace open calls
> + * Copyright (C) 2009 Intel Corporation
> + *
> + * Based extensively on trace_sched_switch.c
> + * Copyright (C) 2007 Steven Rostedt
> + *
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "trace.h"
> +
> +
> +static struct trace_array *ctx_trace;
> +static int __read_mostly open_trace_enabled;
> +static atomic_t open_ref;
> +
> +static void probe_do_sys_open(struct file *filp, int flags, int mode, long fd)
> +{
> +	char *buf;
> +	char *fname;
> +
> +	if (!atomic_read(&open_ref))
> +		return;
> +
> +	if (!open_trace_enabled)
> +		return;
> +
> +	buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +	if (!buf)
> +		return;
> +	fname = d_path(&filp->f_path, buf, PAGE_SIZE);
> +
> +	if (IS_ERR(fname))
> +		goto out;
> +
> +	ftrace_printk("%s: open(\"%s\", %d, %d) = %ld\n",
> +		current->comm, fname, flags, mode, fd);
> +out:
> +	kfree(buf);
> +}
> +
> +static void open_trace_reset(struct trace_array *tr)
> +{
> +	tr->time_start = ftrace_now(tr->cpu);
> +	tracing_reset_online_cpus(tr);
> +}
> +
> +static int open_trace_register(void)
> +{
> +	int ret;
> +
> +	ret = register_trace_do_sys_open(probe_do_sys_open);
> +	if (ret) {
> +		pr_info("open trace: Could not activate tracepoint"
> +			" probe to do_open\n");
> +	}
> +
> +	return ret;
> +}
> +
> +static void open_trace_unregister(void)
> +{
> +	unregister_trace_do_sys_open(probe_do_sys_open);
> +}
> +
> +static void open_trace_start(void)
> +{
> +	long ref;
> +
> +	ref = atomic_inc_return(&open_ref);
> +	if (ref == 1)
> +		open_trace_register();
> +}
> +
> +static void open_trace_stop(void)
> +{
> +	long ref;
> +
> +	ref = atomic_dec_and_test(&open_ref);
> +	if (ref)
> +		open_trace_unregister();
> +}
> +
> +void open_trace_start_cmdline_record(void)
> +{
> +	open_trace_start();
> +}
> +
> +void open_trace_stop_cmdline_record(void)
> +{
> +	open_trace_stop();
> +}
> +
> +static void open_start_trace(struct trace_array *tr)
> +{
> +	open_trace_reset(tr);
> +	open_trace_start_cmdline_record();
> +	open_trace_enabled = 1;
> +}
> +
> +static void open_stop_trace(struct trace_array *tr)
> +{
> +	open_trace_enabled = 0;
> +	open_trace_stop_cmdline_record();
> +}
> +
> +static int open_trace_init(struct trace_array *tr)
> +{
> +	ctx_trace = tr;
> +
> +	open_start_trace(tr);
> +	return 0;
> +}
> +
> +static void reset_open_trace(struct trace_array *tr)
> +{
> +	open_stop_trace(tr);
> +}
> +
> +static struct tracer open_trace __read_mostly =
> +{
> +	.name = "open",
> +	.init = open_trace_init,
> +	.reset = reset_open_trace,
> +};
> +
> +__init static int init_open_trace(void)
> +{
> +	int ret = 0;
> +
> +	if (atomic_read(&open_ref))
> +		ret = open_trace_register();
> +	if (ret) {
> +		pr_info("error registering open trace\n");
> +		return ret;
> +	}
> +	return register_tracer(&open_trace);
> +}
> +device_initcall(init_open_trace);
> +
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/