2003-09-18 06:20:28

by Villacis, Juan

Subject: [PATCH 2.6.x] additional kernel event notifications

Hi,

We request that the following patch for additional kernel event
notifications be included in the upcoming 2.6.x kernel.

The current profiling hooks provide notifications at the end of a task's
lifetime (i.e., task exit, mmap exit, and exec unmap). We would like to
have additional notifications on the start of a task (i.e., fork,
execve, kernel image loads, and user image loads).

We believe that profiling tools such as Oprofile, Perfmon, and VTune
would benefit from the additional hooks by improving the accuracy and
completeness of the performance data, especially when working in
environments that can dynamically create and destroy executable code
(such as Java). Furthermore, these hooks could be used to measure
different types of performance data (e.g., "forks per second") which are
currently not available any other way.

Our patch follows the conventions used by the current profiling hooks,
and is relatively small.

We would appreciate comments/feedback on our proposal.

Regards,

Juan Villacis
Intel Corporation
Santa Clara, CA 95052


Attachments:
notification-2.6.0-test5.patch (9.02 kB)
notification-2.6.0-test5.diffstat (471.00 B)

2003-09-19 18:19:02

by Jesse Barnes

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

On Wed, Sep 17, 2003 at 11:20:12PM -0700, Villacis, Juan wrote:
> ...
>
> We believe that profiling tools such as Oprofile, Perfmon, and VTune
> would benefit from the additional hooks by improving the accuracy and
> completeness of the performance data, especially when working in
> environments that can dynamically create and destroy executable code
> (such as Java). Furthermore, these hooks could be used to measure
> different types of performance data (e.g., "forks per second") which are
> currently not available any other way.
>
> ...

Any chance of this getting into 2.6? I for one would like to see it so
that the performance monitoring tools can work properly without having
to resort to syscall table patching. I've inlined the diffstat and
patch for easy reading.

Thanks,
Jesse

 MAINTAINERS                |    7 +++++
 Makefile                   |    2 -
 arch/i386/oprofile/Kconfig |    8 +++---
 fs/exec.c                  |    3 ++
 include/linux/profile.h    |   25 ++++++++++++++++++-
 kernel/fork.c              |    3 ++
 kernel/module.c            |   16 ++++++++++++
 kernel/profile.c           |   59 +++++++++++++++++++++++++++++++++++++++++++++
 mm/mmap.c                  |    5 +++
 9 files changed, 122 insertions(+), 6 deletions(-)

diff -urN linux-2.6.0-test5/arch/i386/oprofile/Kconfig linux-2.6.0-test5-intel-vtune/arch/i386/oprofile/Kconfig
--- linux-2.6.0-test5/arch/i386/oprofile/Kconfig Mon Sep 8 12:50:06 2003
+++ linux-2.6.0-test5-intel-vtune/arch/i386/oprofile/Kconfig Wed Sep 17 21:36:18 2003
@@ -1,16 +1,16 @@

menu "Profiling support"
- depends on EXPERIMENTAL

config PROFILING
- bool "Profiling support (EXPERIMENTAL)"
+ bool "Profiling support"
+ default y
help
Say Y here to enable the extended profiling support mechanisms used
- by profilers such as OProfile.
+ by profilers such as OProfile and VTune.


config OPROFILE
- tristate "OProfile system profiling (EXPERIMENTAL)"
+ tristate "OProfile system profiling"
depends on PROFILING
help
OProfile is a profiling system capable of profiling the
diff -urN linux-2.6.0-test5/Makefile linux-2.6.0-test5-intel-vtune/Makefile
--- linux-2.6.0-test5/Makefile Mon Sep 8 12:50:12 2003
+++ linux-2.6.0-test5-intel-vtune/Makefile Wed Sep 17 21:32:58 2003
@@ -1,7 +1,7 @@
VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 0
-EXTRAVERSION = -test5
+EXTRAVERSION = -test5-intel-vtune

# *DOCUMENTATION*
# To see a list of typical targets execute "make help"
diff -urN linux-2.6.0-test5/MAINTAINERS linux-2.6.0-test5-intel-vtune/MAINTAINERS
--- linux-2.6.0-test5/MAINTAINERS Mon Sep 8 12:50:07 2003
+++ linux-2.6.0-test5-intel-vtune/MAINTAINERS Wed Sep 17 21:31:55 2003
@@ -2193,6 +2193,13 @@
M: [email protected]
S: Maintained

+VTUNE
+P: Juan Villacis
+M: [email protected]
+W: http://www.intel.com/software/products/opensource/vdk/
+W: http://www.intel.com/software/products/vtune/
+S: Maintained
+
WAN ROUTER & SANGOMA WANPIPE DRIVERS & API (X.25, FRAME RELAY, PPP, CISCO HDLC)
P: Nenad Corbic
M: [email protected]
diff -urN linux-2.6.0-test5/include/linux/profile.h linux-2.6.0-test5-intel-vtune/include/linux/profile.h
--- linux-2.6.0-test5/include/linux/profile.h Mon Sep 8 12:50:03 2003
+++ linux-2.6.0-test5-intel-vtune/include/linux/profile.h Wed Sep 17 21:31:03 2003
@@ -6,6 +6,7 @@
#include <linux/kernel.h>
#include <linux/config.h>
#include <linux/init.h>
+#include <linux/module.h>
#include <asm/errno.h>

/* parse command line */
@@ -23,7 +24,11 @@
enum profile_type {
EXIT_TASK,
EXIT_MMAP,
- EXEC_UNMAP
+ EXEC_UNMAP,
+ DO_FORK,
+ DO_EXECVE,
+ LOAD_KERNEL_IMAGE,
+ LOAD_USER_IMAGE
};

#ifdef CONFIG_PROFILING
@@ -41,6 +46,20 @@
/* exit of all vmas for a task */
void profile_exit_mmap(struct mm_struct * mm);

+/* handler for DO_FORK event */
+void profile_do_fork(struct task_struct * task);
+
+/* handler for DO_EXECVE event */
+void profile_do_execve(struct task_struct * task);
+
+/* handler for LOAD_KERNEL_IMAGE event */
+void profile_load_kernel_image(struct module * mod, unsigned int sechdr_index,
+ unsigned long addr, unsigned long size);
+
+/* handler for LOAD_USER_IMAGE */
+void profile_load_user_image(struct task_struct * task,
+ struct vm_area_struct * vma);
+
int profile_event_register(enum profile_type, struct notifier_block * n);

int profile_event_unregister(enum profile_type, struct notifier_block * n);
@@ -66,6 +85,10 @@
#define profile_exit_task(a) do { } while (0)
#define profile_exec_unmap(a) do { } while (0)
#define profile_exit_mmap(a) do { } while (0)
+#define profile_do_fork(a) do { } while (0)
+#define profile_do_execve(a) do { } while (0)
+#define profile_load_kernel_image(a, b, c, d) do { } while (0)
+#define profile_load_user_image(a, b) do { } while (0)

static inline int register_profile_notifier(struct notifier_block * nb)
{
diff -urN linux-2.6.0-test5/kernel/profile.c linux-2.6.0-test5-intel-vtune/kernel/profile.c
--- linux-2.6.0-test5/kernel/profile.c Mon Sep 8 12:50:31 2003
+++ linux-2.6.0-test5-intel-vtune/kernel/profile.c Tue Sep 16 22:32:33 2003
@@ -50,6 +50,10 @@
static struct notifier_block * exit_task_notifier;
static struct notifier_block * exit_mmap_notifier;
static struct notifier_block * exec_unmap_notifier;
+static struct notifier_block * do_fork_notifier;
+static struct notifier_block * do_execve_notifier;
+static struct notifier_block * load_kernel_image_notifier;
+static struct notifier_block * load_user_image_notifier;

void profile_exit_task(struct task_struct * task)
{
@@ -72,6 +76,37 @@
up_read(&profile_rwsem);
}

+void profile_do_fork(struct task_struct * task)
+{
+ down_read(&profile_rwsem);
+ notifier_call_chain(&do_fork_notifier, 0, task);
+ up_read(&profile_rwsem);
+}
+
+void profile_do_execve(struct task_struct * task)
+{
+ down_read(&profile_rwsem);
+ notifier_call_chain(&do_execve_notifier, 0, task);
+ up_read(&profile_rwsem);
+}
+
+void profile_load_kernel_image(struct module * mod, unsigned int sechdr_index,
+ unsigned long addr, unsigned long size)
+{
+ down_read(&profile_rwsem);
+ notifier_call_chain(&load_kernel_image_notifier, sechdr_index, mod);
+ up_read(&profile_rwsem);
+}
+
+void profile_load_user_image(struct task_struct * task,
+ struct vm_area_struct * vma)
+{
+ down_read(&profile_rwsem);
+ notifier_call_chain(&load_user_image_notifier,(unsigned long) task,
+ vma);
+ up_read(&profile_rwsem);
+}
+
int profile_event_register(enum profile_type type, struct notifier_block * n)
{
int err = -EINVAL;
@@ -88,6 +123,18 @@
case EXEC_UNMAP:
err = notifier_chain_register(&exec_unmap_notifier, n);
break;
+ case DO_FORK:
+ err = notifier_chain_register(&do_fork_notifier, n);
+ break;
+ case DO_EXECVE:
+ err = notifier_chain_register(&do_execve_notifier, n);
+ break;
+ case LOAD_KERNEL_IMAGE:
+ err = notifier_chain_register(&load_kernel_image_notifier, n);
+ break;
+ case LOAD_USER_IMAGE:
+ err = notifier_chain_register(&load_user_image_notifier, n);
+ break;
}

up_write(&profile_rwsem);
@@ -112,6 +159,18 @@
case EXEC_UNMAP:
err = notifier_chain_unregister(&exec_unmap_notifier, n);
break;
+ case DO_FORK:
+ err = notifier_chain_unregister(&do_fork_notifier, n);
+ break;
+ case DO_EXECVE:
+ err = notifier_chain_unregister(&do_execve_notifier, n);
+ break;
+ case LOAD_KERNEL_IMAGE:
+ err = notifier_chain_unregister(&load_kernel_image_notifier, n);
+ break;
+ case LOAD_USER_IMAGE:
+ err = notifier_chain_unregister(&load_user_image_notifier, n);
+ break;
}

up_write(&profile_rwsem);
diff -urN linux-2.6.0-test5/kernel/fork.c linux-2.6.0-test5-intel-vtune/kernel/fork.c
--- linux-2.6.0-test5/kernel/fork.c Mon Sep 8 12:49:53 2003
+++ linux-2.6.0-test5-intel-vtune/kernel/fork.c Tue Sep 16 22:00:16 2003
@@ -30,6 +30,7 @@
#include <linux/futex.h>
#include <linux/ptrace.h>
#include <linux/mount.h>
+#include <linux/profile.h>

#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1113,6 +1114,8 @@
set_tsk_thread_flag(p, TIF_SIGPENDING);
}

+ profile_do_fork(p);
+
p->state = TASK_STOPPED;
if (!(clone_flags & CLONE_STOPPED))
wake_up_forked_process(p); /* do this last */
diff -urN linux-2.6.0-test5/kernel/module.c linux-2.6.0-test5-intel-vtune/kernel/module.c
--- linux-2.6.0-test5/kernel/module.c Mon Sep 8 12:50:18 2003
+++ linux-2.6.0-test5-intel-vtune/kernel/module.c Tue Sep 16 22:03:00 2003
@@ -27,6 +27,7 @@
#include <linux/fcntl.h>
#include <linux/rcupdate.h>
#include <linux/cpu.h>
+#include <linux/profile.h>
#include <linux/moduleparam.h>
#include <linux/errno.h>
#include <linux/err.h>
@@ -1655,6 +1656,21 @@
if (err < 0)
goto cleanup;

+ /* track address of each loaded section in kernel module;
+ * only profile those which lie within the module's address
+ * range; we do it here since by this time all the sections
+ * have been properly laid out
+ */
+ for (i = 0; i < hdr->e_shnum; i++) {
+ if ( (sechdrs[i].sh_addr >= (unsigned long) mod->module_core) &&
+ (sechdrs[i].sh_addr < (unsigned long) (mod->module_core +
+ mod->core_size))) {
+ profile_load_kernel_image(mod, i,
+ (unsigned long)(sechdrs[i].sh_addr),
+ (unsigned long)(sechdrs[i].sh_size));
+ }
+ }
+
/* Get rid of temporary copy */
vfree(hdr);

diff -urN linux-2.6.0-test5/fs/exec.c linux-2.6.0-test5-intel-vtune/fs/exec.c
--- linux-2.6.0-test5/fs/exec.c Mon Sep 8 12:50:02 2003
+++ linux-2.6.0-test5-intel-vtune/fs/exec.c Tue Sep 16 22:28:33 2003
@@ -45,6 +45,7 @@
#include <linux/mount.h>
#include <linux/security.h>
#include <linux/rmap-locking.h>
+#include <linux/profile.h>

#include <asm/uaccess.h>
#include <asm/pgalloc.h>
@@ -1103,6 +1104,8 @@
if (retval >= 0) {
free_arg_pages(&bprm);

+ profile_do_execve(current);
+
/* execve success */
security_bprm_free(&bprm);
return retval;
diff -urN linux-2.6.0-test5/mm/mmap.c linux-2.6.0-test5-intel-vtune/mm/mmap.c
--- linux-2.6.0-test5/mm/mmap.c Mon Sep 8 12:50:08 2003
+++ linux-2.6.0-test5-intel-vtune/mm/mmap.c Tue Sep 16 22:04:42 2003
@@ -651,6 +651,11 @@
}
kmem_cache_free(vm_area_cachep, vma);
}
+
+ if (vma->vm_flags & VM_EXEC) {
+ profile_load_user_image(current, vma);
+ }
+
out:
mm->total_vm += len >> PAGE_SHIFT;
if (vm_flags & VM_LOCKED) {
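
For readers who have not used the profiling hooks, the following is a minimal userspace sketch of the notifier-chain pattern the patch extends. It is not kernel code: `struct notifier_block`, `profile_event_register()` and `profile_event_unregister()` are simplified stand-ins that only mimic the names in the patch, there is no locking (the real code takes profile_rwsem), and `notify()`, `count_fork()` and `fork_count` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
    int (*notifier_call)(struct notifier_block *self,
                         unsigned long val, void *data);
    struct notifier_block *next;
};

/* Two of the event types proposed in the patch, plus a count. */
enum profile_type { DO_FORK, DO_EXECVE, PROFILE_TYPE_COUNT };

/* One chain head per event, like the static pointers in kernel/profile.c. */
static struct notifier_block *chains[PROFILE_TYPE_COUNT];

/* Simplified registration: push the block onto the front of the chain. */
static int profile_event_register(enum profile_type type,
                                  struct notifier_block *n)
{
    if ((int)type < 0 || (int)type >= PROFILE_TYPE_COUNT || n == NULL)
        return -1;
    n->next = chains[type];
    chains[type] = n;
    return 0;
}

/* Simplified unregistration: unlink the block if it is on the chain. */
static int profile_event_unregister(enum profile_type type,
                                    struct notifier_block *n)
{
    struct notifier_block **p;

    if ((int)type < 0 || (int)type >= PROFILE_TYPE_COUNT)
        return -1;
    for (p = &chains[type]; *p != NULL; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            return 0;
        }
    }
    return -1;
}

/* Hypothetical dispatcher: call every handler registered for an event. */
static void notify(enum profile_type type, unsigned long val, void *data)
{
    struct notifier_block *nb;

    for (nb = chains[type]; nb != NULL; nb = nb->next)
        nb->notifier_call(nb, val, data);
}

/* Example consumer: count fork events, e.g. for a "forks per second" figure. */
static unsigned long fork_count;

static int count_fork(struct notifier_block *self,
                      unsigned long val, void *data)
{
    (void)self; (void)val; (void)data;
    fork_count++;
    return 0;
}

static struct notifier_block fork_nb = { count_fork, NULL };
```

A consumer would call profile_event_register(DO_FORK, &fork_nb) once, after which every notify(DO_FORK, ...) increments fork_count; sampling that counter at a fixed interval gives a rate.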

2003-09-19 18:47:26

by Andrew Morton

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

[email protected] (Jesse Barnes) wrote:
>
> Any chance of this getting into 2.6? I for one would like to see it so
> that the performance monitoring tools can work properly without having
> to resort to syscall table patching.

If the code which uses these hooks is included in the kernel.org tree, yes.

If the code which needs the hooks is not in the kernel.org tree then people
can patch the core kernel at the same time as adding the performance
analysis patch.

If the code which needs these hooks is not appropriately licensed then
these hooks basically constitute a GPL bypass and that is not a direction
we wish to be heading in.

2003-09-19 19:32:14

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

Our sampling driver kernel module which uses these hooks is GPL
and could be included in the kernel.org tree.

The current version of the driver (also GPL, but which hooks the
sys_call_table for 2.4.x-based kernels) is posted at,

http://www.intel.com/software/products/opensource/vdk/

We plan to post our new driver for kernel 2.6.0-test5 (with the
event notification patch applied) on both IA-32 and IA-64 to the
above site early next week.

-juan



2003-09-19 19:59:18

by Andrew Morton

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

"Villacis, Juan" <[email protected]> wrote:
>
> Our sampling driver kernel module which uses these hooks is GPL
> and could be included in the kernel.org tree.

Ah, OK.

That code seems to have a lot of infrastructure for buffering samples,
transferring it to userspace, etc.

Have you looked into using the infrastructure in drivers/oprofile/ for
this? In other words: is it possible to augment the kernel's existing
oprofile capabilities so they meet VTune requirements?


2003-09-19 20:00:46

by Nakajima, Jun

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

I think Juan simply wants to remove the ugly hack that hooks
sys_call_table. Creating a GPL bypass is not his purpose in adding the
hooks. Several similar hooks are already there.

Probably what's missing is the value of adding those hooks in the
kernel.org tree. I guess there is a user-level command available that
runs on Linux (other than GUI-based), to analyze user/kernel performance
in a detailed fashion.

Thanks,
Jun


2003-09-19 21:18:48

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

"Andrew Morton" <[email protected]> wrote:
>
> That code seems to have a lot of infrastructure for buffering samples,
> transferring it to userspace, etc.

We are not trying to change the current profiling infrastructure. We
are trying to enhance the existing event notification scheme to handle
more events.

> Have you looked into using the infrastructure in drivers/oprofile/ for
> this? In other words: is it possible to augment the kernel's existing
> oprofile capabilities so they meet VTune requirements?

The current event notifications used by tools like Oprofile, while quite
useful, are not sufficient. The additional event notifications we
propose can provide a more complete picture for performance tuning on
Linux, particularly for dynamically generated code (such as found in
Java). In addition to allowing for the enhancement of current
performance tools, they also enable the creation of new tools to gather
measurements that were previously difficult to obtain (e.g., "image
loads per second").

-juan

2003-09-19 21:47:28

by Andrew Morton

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

"Villacis, Juan" <[email protected]> wrote:
>
> > Have you looked into using the infrastructure in drivers/oprofile/ for
> > this? In other words: is it possible to augment the kernel's existing
> > oprofile capabilities so they meet VTune requirements?
>
> The current event notifications used by tools like Oprofile, while quite
> useful, are not sufficient. The additional event notifications we
> propose can provide a more complete picture for performance tuning on
> Linux, particularly for dynamically generated code (such as found in
> Java).

You are answering a question I did not ask. Let me rephrase.

Have you considered interfacing vtune userspace to oprofilefs and enhancing
oprofilefs to meet vtune requirements, thus removing the need for the vtune
kernel module, and its device node and ioctl interface?

2003-09-20 00:57:45

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

"Andrew Morton" <[email protected]> wrote:
> Have you considered interfacing vtune userspace
> to oprofilefs and enhancing oprofilefs to meet
> vtune requirements, thus removing the need for
> the vtune kernel module, and its device node
> and ioctl interface?

We have considered the option of using Oprofile's mechanisms for VTune,
but Oprofile and VTune do different things in different ways. For
example, both tools capture performance data, but Oprofile was designed
with aggregation in mind whereas VTune was designed to collect all the
data and then post-process it.

We are open to putting the VTune driver into the kernel source tree.
However, is consolidation of the performance tools a requirement for
getting the four additional event notifications into the kernel?

-juan

2003-09-20 02:02:29

by Andi Kleen

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

"Villacis, Juan" <[email protected]> writes:

> The current event notifications used by tools like Oprofile, while quite
> useful, are not sufficient. The additional event notifications we
> propose can provide a more complete picture for performance tuning on
> Linux, particularly for dynamically generated code (such as found in
> Java).

Can you explain why profiling dynamically generated code needs kernel
support? The kernel should not know anything about this.

The original oprofile patch also added similar hooks, but they were
not merged. Instead the "dcookies" mechanism was added to assign samples to
specific executables. Why can't you use the same mechanism?

There is not more information in the kernel than what dcookies
already provide.

-Andi

2003-09-20 02:23:56

by John Levon

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

On Fri, Sep 19, 2003 at 05:57:40PM -0700, Villacis, Juan wrote:

> both tools capture performance data, but Oprofile was designed with
> aggregation in mind whereas VTune was designed to collect all the data
> and then post-process it.

It would help a huge amount if you explained how you do :

EIP -> java source line / symbol

This is the exact transformation that oprofile *doesn't* do, and I never
managed to get a clear explanation of what you need and why for that to
happen.

In particular, your userspace must be doing some sort of communication
with the running Java VM, and the question remains open as to whether
it's possible to do that in an oprofile manner instead of a VTune 2.4 /
OProfile 2.4 manner.

I still suspect we have significant amounts of code that can be merged
between us. This would be a significant benefit to the poor saps such as
akpm who have to care about the kernel as a whole.

You also mentioned performance issues with the current OProfile code -
have we discussed the new design at all (basically: keep task structs
hanging around, remove the horrific buffer_sem)

regards
john

--
Khendon's Law:
If the same point is made twice by the same person, the thread is over.

2003-09-20 17:28:36

by Anton Blanchard

Subject: Re: [PATCH 2.6.x] additional kernel event notifications


> The current event notifications used by tools like Oprofile, while quite
> useful, are not sufficient. The additional event notifications we
> propose can provide a more complete picture for performance tuning on
> Linux, particularly for dynamically generated code (such as found in
> Java).

Could you please explain why you can't build on top of oprofile? If arch
specific profilers start going in, we are going to have 5 different ways
of doing the same thing.

We really need to work together on this. We for example have a bunch of
ppc64 profiling stuff but throwing that into the kernel to create yet
another profiler is not the answer.

Anton

2003-09-22 05:59:26

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

"Andi Kleen" <[email protected]> writes:
> Can you explain why profiling dynamically generated code needs kernel
> support? The kernel should not know anything about this.

In some cases, a profiler can figure out information regarding
Dynamically Generated Code (DGC) with help from the generator of the
code, but in other cases it cannot.

In the case of Java jitted code, our userspace tools obtain sufficient
information through JVMPI, when it is implemented by the JVM. However,
for DGC which does not have such userspace support, it is important to
be able to spot and accurately attribute samples to DGC. The 4
additional profiling hooks we proposed can be used for such purposes.

> The original oprofile patch also added similar hooks, but they were
> not merged. Instead the "dcookies" mechanism was added to assign
> samples to specific executables. Why can't you use the same mechanism?

If the generator of DGC frees memory used for DGC that subsequently gets
a loaded image (or reuses memory that may have once had an executable
image), you can mis-attribute samples so that instead of attributing the
samples to the DGC, you will attribute the samples to an image. The
dcookie mechanism will indicate information about an image, but doesn't
help prevent mis-attribution of samples if DGC is intermixed with images
that are loaded/unloaded in the same memory region.

-juan
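
The mis-attribution described above can be illustrated with a small, self-contained sketch. It assumes a profiler that records, for each executable region, the interval during which the region was mapped; the struct, the function names and the sample data (`region_event`, `resolve_naive`, `resolve_at`, `history`) are hypothetical and exist only for this illustration. They are not part of the patch, OProfile or VTune.

```c
#include <stddef.h>

/* One executable region and the time interval during which it was mapped. */
struct region_event {
    unsigned long start, end;        /* address range [start, end) */
    unsigned long t_load, t_unload;  /* mapped while t_load <= t < t_unload */
    const char *owner;               /* what lived there: DGC or an image */
};

/* Made-up history: JIT code occupies a range that an image later reuses. */
static const struct region_event history[] = {
    { 0x1000, 0x2000, 10, 20, "jit-code" },
    { 0x1000, 0x3000, 30, (unsigned long)-1, "libfoo.so" },
};

/* Time-blind lookup: the most recent region covering the address wins. */
static const char *resolve_naive(const struct region_event *ev, size_t n,
                                 unsigned long addr)
{
    const char *owner = NULL;

    for (size_t i = 0; i < n; i++)
        if (addr >= ev[i].start && addr < ev[i].end)
            owner = ev[i].owner;
    return owner;
}

/* Time-aware lookup: the region must also have been mapped at time t. */
static const char *resolve_at(const struct region_event *ev, size_t n,
                              unsigned long addr, unsigned long t)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= ev[i].start && addr < ev[i].end &&
            t >= ev[i].t_load && t < ev[i].t_unload)
            return ev[i].owner;
    return NULL;
}
```

With this history, a sample taken at address 0x1800 at time 15 belongs to the JIT code, yet the time-blind lookup charges it to libfoo.so. The time-aware lookup needs load and unload notifications to build the intervals, which is the kind of information the proposed hooks are meant to supply.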

2003-09-22 06:23:18

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

"Anton Blanchard" <[email protected]> writes:
> Could you please explain why you can't build on top of oprofile? If
> arch specific profilers start going in, we are going to have 5
> different ways of doing the same thing.

The patch we submitted adds 4 generic hooks to the existing set of
profiling hooks. These additional hooks can be used to help performance
tools such as Oprofile and VTune to not mis-attribute performance data.

> We really need to work together on this. We for example have a bunch
> of ppc64 profiling stuff but throwing that into the kernel to create
> yet another profiler is not the answer.

We are open to the possibility of including the VTune driver into the
base kernel, perhaps in an architecture-dependent area. It could
complement existing profilers.

-juan

2003-09-22 11:07:33

by John Levon

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

On Sun, Sep 21, 2003 at 10:59:21PM -0700, Villacis, Juan wrote:

> In some cases, a profiler can figure out information regarding
> Dynamically Generated Code (DGC) with help from the generator of the
> code, but in other cases it cannot.
>
> In the case of Java jitted code, our userspace tools obtain sufficient
> information through JVMPI, when it is implemented by the JVM.

I would argue that any JVM that doesn't yet implement
JVMPI_EVENT_COMPILED_METHOD_LOAD is broken - JVMPI has been around for a
long time now.

I don't see why the kernel is the correct place to fix such lacking
functionality.

> for DGC which does not have such userspace support, it is important to
> be able to spot and accurately attribute samples to DGC. The 4
> additional profiling hooks we proposed can be used for such purposes.

Please be specific about which *actual* cases you're worried about, and
why they shouldn't be fixed in userspace.

> If the generator of DGC frees memory used for DGC that subsequently gets
> a loaded image (or reuses memory that may have once had an executable
> image), you can mis-attribute samples so that instead of attributing the
> samples to the DGC, you will attribute the samples to an image. The
> dcookie mechanism will indicate information about an image, but doesn't
> help prevent mis-attribution of samples if DGC is intermixed with images
> that are loaded/unloaded in the same memory region.

Simply flush the sample buffer by echo 1 >/dev/oprofile/dump when you
receive a COMPILED_METHOD_LOAD/UNLOAD that conflicts with a previous
mapping.

regards
john
--
Khendon's Law:
If the same point is made twice by the same person, the thread is over.
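
The flush-on-conflict approach suggested in the message above could look roughly like the sketch below. It assumes a userspace agent that is told the address range of each newly compiled method (for example from a JVMPI compiled-method-load event); the table, `note_method_load()` and `flush_oprofile()` are hypothetical names for this illustration. Only the /dev/oprofile/dump path and the value written to it come from the thread.

```c
#include <stdio.h>
#include <stddef.h>

struct code_range { unsigned long start, end; };  /* [start, end) */

#define MAX_RANGES 64
static struct code_range known[MAX_RANGES];
static size_t known_count;

/* Two half-open ranges overlap if each starts before the other ends. */
static int ranges_overlap(struct code_range a, struct code_range b)
{
    return a.start < b.end && b.start < a.end;
}

/*
 * Equivalent of "echo 1 >/dev/oprofile/dump".
 * Returns 0 on success, -1 if the file cannot be written
 * (for instance when oprofile is not loaded).
 */
static int flush_oprofile(void)
{
    FILE *f = fopen("/dev/oprofile/dump", "w");

    if (f == NULL)
        return -1;
    fputs("1\n", f);
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Record a newly loaded code range. Stale ranges that overlap it are
 * dropped from the table. Returns 1 if such a conflict was found (the
 * caller should flush the sample buffer), 0 if not, -1 if the table
 * is full.
 */
static int note_method_load(struct code_range r)
{
    size_t kept = 0;
    int conflict = 0;

    for (size_t i = 0; i < known_count; i++) {
        if (ranges_overlap(known[i], r))
            conflict = 1;               /* old mapping is stale */
        else
            known[kept++] = known[i];   /* keep unrelated mapping */
    }
    known_count = kept;

    if (known_count == MAX_RANGES)
        return -1;
    known[known_count++] = r;
    return conflict;
}
```

A caller would invoke flush_oprofile() whenever note_method_load() returns 1, so that samples already buffered are attributed against the old mapping before the new one takes effect.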

2003-09-23 01:16:33

by Villacis, Juan

Subject: RE: [PATCH 2.6.x] additional kernel event notifications

Hi,

"John Levon" <[email protected]> writes:
> > In some cases, a profiler can figure out information regarding
> > Dynamically Generated Code (DGC) with help from the generator of the
> > code, but in other cases it cannot.
> >
> > In the case of Java jitted code, our userspace tools obtain
> > sufficient information through JVMPI, when it is implemented by the
> > JVM.
>
> I would argue that any JVM that doesn't yet implement
> JVMPI_EVENT_COMPILED_METHOD_LOAD is broken - JVMPI has been around for
> a long time now.
>
> I don't see why the kernel is the correct place to fix such lacking
> functionality.

There are several languages and tools besides Java which can jit code.
Perl6 (using interpreters like Parrot) and Python (using interpreters
like Psyco) are examples of this and it is not yet clear what
interfaces they will provide to allow for JVMPI like functionality. We
would like to be able to profile applications using these interpreters
(as well as others). If userspace interpreters, for whatever reason, do
not provide functionality for tracking jitted code, would it not be
useful to provide help in the kernel to allow profilers to be able to
track this?

Although we understand your belief that a DGC generator that doesn't
provide hooks for determining the DGC information may not be optimal, we
would still like to be able to provide something useful to the users of
those environments. Perhaps this philosophical discussion is better
handled offline per an earlier suggestion.

> > for DGC which does not have such userspace support, it is important
> > to be able to spot and accurately attribute samples to DGC. The 4
> > additional profiling hooks we proposed can be used for such
> > purposes.
>
> Please be specific about which *actual* cases you're worried about,
> and why they shouldn't be fixed in userspace.

From what we can see in the Oprofile driver code, if there is no dcookie
for a corresponding EIP, the EIP information is discarded and the
"oprofile_stats.sample_lost_no_mapping" count is incremented (see
add_us_sample() in drivers/oprofile/buffer_sync.c). Did we
misunderstand something? Is a DENTRY (and a corresponding dcookie)
being created somewhere for DGC?

> > If the generator of DGC frees memory used for DGC that subsequently
> > gets a loaded image (or reuses memory that may have once had an
> > executable image), you can mis-attribute samples so that instead of
> > attributing the samples to the DGC, you will attribute the samples
> > to an image. The dcookie mechanism will indicate information about
> > an image, but doesn't help prevent mis-attribution of samples if DGC
> > is intermixed with images that are loaded/unloaded in the same
> > memory region.
>
> Simply flush the sample buffer by echo 1 >/dev/oprofile/dump when you
> receive a COMPILED_METHOD_LOAD/UNLOAD that conflicts with a previous
> mapping.

As long as there exists a userspace API similar to the JVMPI for a
particular DGC generator, and as long as the profiling tool is using
that API, then yes, it seems like this method would work. But this
isn't the case we are worried about.

-juan

2003-09-23 10:16:35

by John Levon

Subject: Re: [PATCH 2.6.x] additional kernel event notifications

On Mon, Sep 22, 2003 at 06:16:14PM -0700, Villacis, Juan wrote:

> There are several languages and tools besides Java which can jit code.
> Perl6 (using interpreters like Parrot) and Python (using interpreters
> like Psyco) are examples of this and it is not yet clear what
> interfaces they will provide to allow for JVMPI like functionality. We

First, "we might need it" is not generally considered a good enough
rationale for extra code by the kernel people.

Second, if there is no help from a runtime for tracking JITted code
memory -> source lines, I do not see how your hooks could possibly help.
Either way you need help from the VM of the target program (be it
Parrot, Python or whatever).

> would still like to be able to provide something useful to the users of

What would "something useful" be in particular ?

> From what we can see in the Oprofile driver code, if there is no dcookie
> for a corresponding EIP, the EIP information is discarded and the

Correct.

But we're talking about making oprofile useful for your purposes - that
can and will involve changes. In particular it's trivial to add an
option to oprofile to output the raw EIP if no dentry could be matched.

> As long as there exists a userspace API similar to the JVMPI for a
> particular DGC generator, and as long as the profiling tool is using
> that API, then yes, it seems like this method would work. But this
> isn't the case we are worried about.

I hate to be a pain, but I'm still missing a concrete explanation of an
*actual* case you are worried about. It's rather difficult to discuss
this stuff in the abstract.

regards
john
--
Khendon's Law:
If the same point is made twice by the same person, the thread is over.
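
The option mentioned in the message above, emitting the raw EIP when no dentry can be matched instead of dropping the sample, might behave like the following sketch. This is an assumption about the intended behaviour, not OProfile code: `struct mapping`, `format_sample()` and the `emit_raw` flag are hypothetical names, and the real driver works with dcookies rather than image-name strings.

```c
#include <stdio.h>
#include <stddef.h>

/* A file-backed executable mapping known to the profiler. */
struct mapping {
    unsigned long start, end;   /* [start, end) */
    const char *image;          /* stands in for the dcookie lookup result */
};

/*
 * Format one sample into buf.
 *   - If eip falls inside a known mapping: "image+0xoffset".
 *   - Otherwise, if emit_raw is set: the raw address "0x...".
 *   - Otherwise the sample is dropped, which is the behaviour the
 *     thread describes (sample_lost_no_mapping is incremented).
 * Returns 1 if something was written, 0 if the sample was dropped.
 */
static int format_sample(const struct mapping *maps, size_t n,
                         unsigned long eip, int emit_raw,
                         char *buf, size_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (eip >= maps[i].start && eip < maps[i].end) {
            snprintf(buf, len, "%s+0x%lx",
                     maps[i].image, eip - maps[i].start);
            return 1;
        }
    }
    if (emit_raw) {
        snprintf(buf, len, "0x%lx", eip);
        return 1;
    }
    return 0;
}
```

With emit_raw set, samples that land in anonymous memory survive as bare addresses, and a userspace tool that knows where the dynamically generated code lived can resolve them later.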