2002-09-23 22:24:32

by John Levon

[permalink] [raw]
Subject: [PATCH][RFC] oprofile 2.5.38 patch


At :

http://oprofile.sourceforge.net/oprofile-2.5.html

You can find a very early patch against 2.5.38 (87k) and userspace
tools for the new oprofile stuff. I've split it up into smaller chunks
at http://oprofile.sourceforge.net/oprofile-2.5/ (only for easier
reading).

I would appreciate people's comments. I'm not exactly happy with
buffer_sync.c in particular, so I'd love to find a better way.

The code has been lightly tested on a 2-way PII-400 box, which showed
some quirks, but it seems to work.

Note that you must use the new userspace tools at the URL (and they will
conflict with any existing oprofile stuff, so please don't mix the two).

A short summary of how it works (mostly thought out by Linus ;) :

Each cpu has a linear buffer for capturing EIP/event values. Events such
as munmaps trigger a buffer sync, where the EIP values are converted
into cookie/offset pairs and stored in the global buffer. The cookies
are accessible by the userspace daemon which uses the binary path to
create the sample files. Context switches also need to go in the cpu
buffer. release_task() needs similar tracking in order to prevent
task_struct * accesses after it has been freed.

regards
john


2002-09-24 11:29:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH][RFC] oprofile 2.5.38 patch

On Mon, Sep 23, 2002 at 11:29:33PM +0100, John Levon wrote:
>
> At :
>
> http://oprofile.sourceforge.net/oprofile-2.5.html

Comments:

+if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
+ mainmenu_option next_comment
+ comment 'Profiling support'
+ bool 'Profiling support' CONFIG_PROFILING
+ source drivers/oprofile/Config.in
+ endmenu
+fi

The check and the menu should go into drivers/oprofile/Config.in

+OProfile system profiling
+CONFIG_OPROFILE
+ OProfile is a profiling system capable of profiling the
+ whole system, include the kernel, kernel modules, libraries,
+ and applications.
+
+ If unsure, say N.

For 2.5 we don't need/want the 'OProfile system profiling' line.

if [ "$CONFIG_PREEMPT" != "y" ]; then
+ dep_tristate 'OProfile system profiling' CONFIG_OPROFILE
$CONFIG_EXPERIMENTAL $CONFIG_PROFILING
+fi

You already tested for CONFIG_EXPERIMENTAL above.. Also you want
an (EXPERIMENTAL) mark.

Together with the above suggestion drivers/oprofile/Config.in should
look roughly like:

if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
mainmenu_option next_comment
comment 'Profiling support'
bool 'Profiling support (EXPERIMENTAL)' CONFIG_PROFILING
if [ "$CONFIG_PREEMPT" != "y" -a "$CONFIG_PROFILING" = "y" ]; then
dep_tristate ' OProfile system profiling (EXPERIMENTAL)' CONFIG_OPROFILE
fi
endmenu
fi

OTOH I wonder whether you really want an submenu. It could as well just
be part of the general settings one.

+obj-$(CONFIG_OPROFILE) += oprofile.o
+
+oprofile-objs := oprof.o cpu_buffer.o event_buffer.o \
+ buffer_sync.o oprofilefs.o \
+ oprofile_files.o
+
+include $(ARCH)/Makefile
+
+include $(TOPDIR)/Rules.make

Usually <foo>-objs is below obj-*, but that's just cosmetic. The Makefile
inclusion seems very wrong to me. Why do you need it?

Shouldn't drivers/oprofile/i386/ be arch/i386/oprofile??


/**
+ * \file oprofilefs.c
+ * Copyright 2002 the LyX Team
+ * Read the file COPYING
+ *
+ * \author John Levon <[email protected]>
+ *
+ * Based on fs/binfmt_misc.c
+ *
+ * A simple filesystem for configuration and
+ * access of oprofile.
+ */

Why is this copyright different from the others?

ssize_t fail_write(struct file * file, const char * buf, size_t count, loff_t *
offset)
+{
+ return -EPERM;
+}

Why don't you just set no write method at all?

Maybe oprofilefs wants to become a generic version, it has nothing
oprofile-specific..


#ifndef CONFIG_PROFILING
+
+asmlinkage int sys_lookup_dcookie(unsigned long cookie, char * buf, size_t
len)+{
+ return -ENOSYS;
+}
+
+#else

Please use cond_syscall() and compile that file only for CONFIG_PROFILING set.

+ struct dcookie_struct * d_cookie; /* cookie, if any */

Make that conditional on CONFIG_PROFILING?

/* Profile hooks */
+#ifdef CONFIG_PROFILING
+EXPORT_SYMBOL(profile_event_register);
+EXPORT_SYMBOL(profile_event_unregister);
+EXPORT_SYMBOL(dcookie_register);
+EXPORT_SYMBOL(dcookie_unregister);
+EXPORT_SYMBOL(get_dcookie);
+#endif

Put that exports into profile.c?


2002-09-24 12:19:02

by John Levon

[permalink] [raw]
Subject: Re: [PATCH][RFC] oprofile 2.5.38 patch

On Tue, Sep 24, 2002 at 02:48:38PM -0400, Christoph Hellwig wrote:

> [snip config bits]

Thanks.

> OTOH I wonder whether you really want an submenu. It could as well just
> be part of the general settings one.

I was a bit dubious it might be hard to find. It's not particularly
general ...

> Usually <foo>-objs is below obj-*, but that's just cosmetic. The Makefile
> inclusion seems very wrong to me. Why do you need it?

It's by far the simplest way to build the single module target...

> Shouldn't drivers/oprofile/i386/ be arch/i386/oprofile??

Perhaps. I kind of liked the code to be together. Also I wasn't sure how
I could convince kbuild to make a single target from the disparate dirs.
Kai mentioned that XFS does something similar, I'll look into that ...

> Why is this copyright different from the others?

Perils of very similar vim iab's ... fixed

> Why don't you just set no write method at all?

OK.

> Maybe oprofilefs wants to become a generic version, it has nothing
> oprofile-specific..

I don't know what you mean here. Do you mean hooking into driverfs ?
Last I checked it wasn't suitable for such oddball minifs's.

> Please use cond_syscall() and compile that file only for CONFIG_PROFILING set.

Ah, thanks.

> + struct dcookie_struct * d_cookie; /* cookie, if any */
>
> Make that conditional on CONFIG_PROFILING?

And same for the task_struct field. Will do.

> Put that exports into profile.c?

Sure.

thanks
john

2002-09-24 12:38:35

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH][RFC] oprofile 2.5.38 patch

John Levon <[email protected]> writes:

> At :
>
> http://oprofile.sourceforge.net/oprofile-2.5.html

+ page = (unsigned char *)__get_free_page(GFP_KERNEL);
+ if (!page)
+ return -ENOMEM;
+
+ spin_lock(&oprofilefs_lock);
+ len = sprintf(page, "%lu\n", *value);
+ spin_unlock(&oprofilefs_lock);

wouldn't an on stack buffer do nicely to format a single number ?

ulong_write_file:

it doesn't length limit count before passing to kmalloc - hole.
Also has overflow bugs (consider someone passing 0xffffffff-1).

The sys_lookup_dcookie call looks like a security hole to me. After
all it could allow everybody to lookup random paths by trying all
dcookies, even though the directories may be unreadable for him. It should
be probably made root only

Adding a list_head to task_struct looks quite ugly to me. Is there
surely no better way ? e.g. you could just put it in a file private
structure and the daemon keeps the file open.

-Andi

2002-09-24 12:50:17

by John Levon

[permalink] [raw]
Subject: Re: [PATCH][RFC] oprofile 2.5.38 patch

On Tue, Sep 24, 2002 at 02:43:47PM +0200, Andi Kleen wrote:

> wouldn't an on stack buffer do nicely to format a single number ?

Yes, this was a lazy adaption of other code. Will fix.

> it doesn't length limit count before passing to kmalloc - hole.
> Also has overflow bugs (consider someone passing 0xffffffff-1).

Thanks.

> The sys_lookup_dcookie call looks like a security hole to me. After

I'll make it CAP_SYS_ADMIN ?

> Adding a list_head to task_struct looks quite ugly to me. Is there
> surely no better way ? e.g. you could just put it in a file private
> structure and the daemon keeps the file open.

Well, if I'm going to "hard code" the unregister I can just remove the
registration and let the oprofile code call dcookie_init/exit on event
buffer open/release.

thanks
john

2002-09-26 16:00:24

by Alan

[permalink] [raw]
Subject: Re: [PATCH][RFC] oprofile 2.5.38 patch

Ok comments


Security:
enable_write with a count of 0xFFFFFFFF again repeated everywhere

Major
The buffer_read function doesnt seem to be SMP safe
Doesnt seem to know about Intel pmc errata
Assumes it will get PM notifiers reliably (it wont)

Minor
Massive duplication of code in each read/write handler - build some
helpers which actually do the thing right and you'd also have less bugs
cpu_type_read doesnt handle partial read
cpu_type_read scribbles on more data than the user requested
(fun to feed as stdin to a setuid app)
similar errors permeate the rest of that code

Trivial
In the event of an -EFAULT data is lost (nothing illegal about it)


Basically its a nice prototype, but with the prototype working could do
with some auditing and a cleanup. I think if you replace all the
read/write functions with some clean helpers and fix the messes in the
helpers it'll clean up really nicely