2009-12-16 03:28:38

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 1/2] perf: Add util/include/linux/hash.h to include hash.h of kernel

linux/hash.h, the hash header of the kernel, is also useful for perf.
util/include/linux/hash.h includes linux/hash.h, so we can now use
the kernel's hash facilities (e.g. hash_long()) in perf.
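
For illustration, a minimal sketch of the kind of use this enables on the
perf side (the table and helper names below are made up for the example;
only hash_long() comes from the included header):

#include <linux/hash.h>	/* now resolved via tools/perf/util/include/ */

#define LOCKHASH_BITS	12
#define LOCKHASH_SIZE	(1UL << LOCKHASH_BITS)

/* hypothetical per-lock-instance stats, keyed by the lock address */
struct lock_stat {
	unsigned long		addr;
	unsigned long long	nr_acquired;
	struct lock_stat	*next;		/* hash-chain link */
};

static struct lock_stat *lockhash_table[LOCKHASH_SIZE];

static struct lock_stat **lockhash_bucket(unsigned long addr)
{
	/* hash_long() comes from the kernel's linux/hash.h */
	return &lockhash_table[hash_long(addr, LOCKHASH_BITS)];
}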

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/util/include/linux/hash.h | 5 +++++
2 files changed, 6 insertions(+), 0 deletions(-)
create mode 100644 tools/perf/util/include/linux/hash.h

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 28b0258..aad2693 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -328,6 +328,7 @@ LIB_FILE=libperf.a
LIB_H += ../../include/linux/perf_event.h
LIB_H += ../../include/linux/rbtree.h
LIB_H += ../../include/linux/list.h
+LIB_H += ../../include/linux/hash.h
LIB_H += ../../include/linux/stringify.h
LIB_H += util/include/linux/bitmap.h
LIB_H += util/include/linux/bitops.h
diff --git a/tools/perf/util/include/linux/hash.h b/tools/perf/util/include/linux/hash.h
new file mode 100644
index 0000000..201f573
--- /dev/null
+++ b/tools/perf/util/include/linux/hash.h
@@ -0,0 +1,5 @@
+#include "../../../../include/linux/hash.h"
+
+#ifndef PERF_HASH_H
+#define PERF_HASH_H
+#endif
--
1.6.5.2


2009-12-16 03:28:50

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 2/2] perf lock: Fix output of tracing lock events

This patch adds the address of the lockdep_map to each lock event.
perf lock uses these addresses as IDs of lock instances.

It also removes waittime from the output of the lock_acquired event.
The value will be calculated in userspace based on timestamps.
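
For illustration, a minimal userspace sketch of that calculation, assuming
perf lock keeps per-lock state keyed by the lockdep_map address (struct and
helper names here are invented for the example):

#include <stdint.h>

struct lock_seq {
	uint64_t contended_ts;	/* timestamp of the pending lock_contended */
	uint64_t wait_total;	/* accumulated wait time, in ns */
};

static void on_lock_contended(struct lock_seq *seq, uint64_t timestamp)
{
	seq->contended_ts = timestamp;
}

static void on_lock_acquired(struct lock_seq *seq, uint64_t timestamp)
{
	if (seq->contended_ts) {
		seq->wait_total += timestamp - seq->contended_ts;
		seq->contended_ts = 0;
	}
}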

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
include/trace/events/lock.h | 27 ++++++++++++++++++---------
kernel/lockdep.c | 2 +-
2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index a870ba1..2f94e25 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -18,16 +18,19 @@ TRACE_EVENT(lock_acquire,
TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__field(unsigned int, flags)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__entry->flags = (trylock ? 1 : 0) | (read ? 2 : 0);
__assign_str(name, lock->name);
),

- TP_printk("%s%s%s", (__entry->flags & 1) ? "try " : "",
+ TP_printk("%p %s%s%s", __entry->lockdep_addr,
+ (__entry->flags & 1) ? "try " : "",
(__entry->flags & 2) ? "read " : "",
__get_str(name))
);
@@ -39,14 +42,16 @@ TRACE_EVENT(lock_release,
TP_ARGS(lock, nested, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
),

- TP_printk("%s", __get_str(name))
+ TP_printk("%p %s", __entry->lockdep_addr, __get_str(name))
);

#ifdef CONFIG_LOCK_STAT
@@ -58,33 +63,37 @@ TRACE_EVENT(lock_contended,
TP_ARGS(lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
),

- TP_printk("%s", __get_str(name))
+ TP_printk("%p %s", __entry->lockdep_addr, __get_str(name))
);

TRACE_EVENT(lock_acquired,
- TP_PROTO(struct lockdep_map *lock, unsigned long ip, s64 waittime),
+ TP_PROTO(struct lockdep_map *lock, unsigned long ip),

- TP_ARGS(lock, ip, waittime),
+ TP_ARGS(lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
__field(unsigned long, wait_usec)
__field(unsigned long, wait_nsec_rem)
),
+
TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
- __entry->wait_nsec_rem = do_div(waittime, NSEC_PER_USEC);
- __entry->wait_usec = (unsigned long) waittime;
),
- TP_printk("%s (%lu.%03lu us)", __get_str(name), __entry->wait_usec,
- __entry->wait_nsec_rem)
+
+ TP_printk("%p %s", __entry->lockdep_addr,
+ __get_str(name))
);

#endif
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index f5dcd36..d6c30a9 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3383,7 +3383,7 @@ found_it:
hlock->holdtime_stamp = now;
}

- trace_lock_acquired(lock, ip, waittime);
+ trace_lock_acquired(lock, ip);

stats = get_lock_stats(hlock_class(hlock));
if (waittime) {
--
1.6.5.2

2009-12-16 08:19:54

by Hitoshi Mitake

[permalink] [raw]
Subject: [tip:perf/lock] perf: Add util/include/linux/hash.h to include hash.h of kernel

Commit-ID: 7485634cae6a70a07f14e9f8a05407363a3cb44b
Gitweb: http://git.kernel.org/tip/7485634cae6a70a07f14e9f8a05407363a3cb44b
Author: Hitoshi Mitake <[email protected]>
AuthorDate: Wed, 16 Dec 2009 12:28:24 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 16 Dec 2009 08:48:57 +0100

perf: Add util/include/linux/hash.h to include hash.h of kernel

linux/hash.h, the hash header of the kernel, is also useful for perf.

util/include/linux/hash.h includes linux/hash.h, so we can now use
the kernel's hash facilities (e.g. hash_long()) in perf.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/util/include/linux/hash.h | 5 +++++
2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 28b0258..aad2693 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -328,6 +328,7 @@ LIB_FILE=libperf.a
LIB_H += ../../include/linux/perf_event.h
LIB_H += ../../include/linux/rbtree.h
LIB_H += ../../include/linux/list.h
+LIB_H += ../../include/linux/hash.h
LIB_H += ../../include/linux/stringify.h
LIB_H += util/include/linux/bitmap.h
LIB_H += util/include/linux/bitops.h
diff --git a/tools/perf/util/include/linux/hash.h b/tools/perf/util/include/linux/hash.h
new file mode 100644
index 0000000..201f573
--- /dev/null
+++ b/tools/perf/util/include/linux/hash.h
@@ -0,0 +1,5 @@
+#include "../../../../include/linux/hash.h"
+
+#ifndef PERF_HASH_H
+#define PERF_HASH_H
+#endif

2009-12-16 08:20:19

by Hitoshi Mitake

[permalink] [raw]
Subject: [tip:perf/lock] perf lock: Fix output of tracing lock events

Commit-ID: e87379ce0da06a63dc5bf7f1a9ed5c38b56e9726
Gitweb: http://git.kernel.org/tip/e87379ce0da06a63dc5bf7f1a9ed5c38b56e9726
Author: Hitoshi Mitake <[email protected]>
AuthorDate: Wed, 16 Dec 2009 12:28:25 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 16 Dec 2009 08:48:58 +0100

perf lock: Fix output of tracing lock events

This patch adds the address of the lockdep_map to each lock event.
perf lock uses these addresses as IDs of lock instances.

It also removes waittime from the output of the lock_acquired event.
The value will be calculated in userspace based on timestamps.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
include/trace/events/lock.h | 27 ++++++++++++++++++---------
kernel/lockdep.c | 2 +-
2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index a870ba1..2f94e25 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -18,16 +18,19 @@ TRACE_EVENT(lock_acquire,
TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__field(unsigned int, flags)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__entry->flags = (trylock ? 1 : 0) | (read ? 2 : 0);
__assign_str(name, lock->name);
),

- TP_printk("%s%s%s", (__entry->flags & 1) ? "try " : "",
+ TP_printk("%p %s%s%s", __entry->lockdep_addr,
+ (__entry->flags & 1) ? "try " : "",
(__entry->flags & 2) ? "read " : "",
__get_str(name))
);
@@ -39,14 +42,16 @@ TRACE_EVENT(lock_release,
TP_ARGS(lock, nested, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
),

- TP_printk("%s", __get_str(name))
+ TP_printk("%p %s", __entry->lockdep_addr, __get_str(name))
);

#ifdef CONFIG_LOCK_STAT
@@ -58,33 +63,37 @@ TRACE_EVENT(lock_contended,
TP_ARGS(lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
),

TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
),

- TP_printk("%s", __get_str(name))
+ TP_printk("%p %s", __entry->lockdep_addr, __get_str(name))
);

TRACE_EVENT(lock_acquired,
- TP_PROTO(struct lockdep_map *lock, unsigned long ip, s64 waittime),
+ TP_PROTO(struct lockdep_map *lock, unsigned long ip),

- TP_ARGS(lock, ip, waittime),
+ TP_ARGS(lock, ip),

TP_STRUCT__entry(
+ __field(struct lockdep_map *, lockdep_addr)
__string(name, lock->name)
__field(unsigned long, wait_usec)
__field(unsigned long, wait_nsec_rem)
),
+
TP_fast_assign(
+ __entry->lockdep_addr = lock;
__assign_str(name, lock->name);
- __entry->wait_nsec_rem = do_div(waittime, NSEC_PER_USEC);
- __entry->wait_usec = (unsigned long) waittime;
),
- TP_printk("%s (%lu.%03lu us)", __get_str(name), __entry->wait_usec,
- __entry->wait_nsec_rem)
+
+ TP_printk("%p %s", __entry->lockdep_addr,
+ __get_str(name))
);

#endif
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index f5dcd36..d6c30a9 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3383,7 +3383,7 @@ found_it:
hlock->holdtime_stamp = now;
}

- trace_lock_acquired(lock, ip, waittime);
+ trace_lock_acquired(lock, ip);

stats = get_lock_stats(hlock_class(hlock));
if (waittime) {

2009-12-17 08:50:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events


* Hitoshi Mitake <[email protected]> wrote:

> This patch adds the address of the lockdep_map to each lock event.
> perf lock uses these addresses as IDs of lock instances.
>
> It also removes waittime from the output of the lock_acquired event.
> The value will be calculated in userspace based on timestamps.
>
> Signed-off-by: Hitoshi Mitake <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> ---
> include/trace/events/lock.h | 27 ++++++++++++++++++---------
> kernel/lockdep.c | 2 +-
> 2 files changed, 19 insertions(+), 10 deletions(-)

Frederic, Peter, do these two patches look good to you? We can apply them to
make subsequent perf lock work easier.

Thanks,

Ingo

2009-12-17 09:25:30

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

On Wed, 2009-12-16 at 12:28 +0900, Hitoshi Mitake wrote:
> This patch adds the address of the lockdep_map to each lock event.
> perf lock uses these addresses as IDs of lock instances.
>
> It also removes waittime from the output of the lock_acquired event.
> The value will be calculated in userspace based on timestamps.
>
> Signed-off-by: Hitoshi Mitake <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> ---
> include/trace/events/lock.h | 27 ++++++++++++++++++---------
> kernel/lockdep.c | 2 +-
> 2 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
> index a870ba1..2f94e25 100644
> --- a/include/trace/events/lock.h
> +++ b/include/trace/events/lock.h
> @@ -18,16 +18,19 @@ TRACE_EVENT(lock_acquire,
> TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),
>
> TP_STRUCT__entry(
> + __field(struct lockdep_map *, lockdep_addr)
> __field(unsigned int, flags)
> __string(name, lock->name)
> ),

I feel a bit awkward explicitly leaking kernel pointers like that. All
this is accessible by root only (for now) so it's not too harmful, but
still.

Also, I don't think we want to expose the struct lockdep_map thing, a
regular void * would be better.

As to removing the waittime, I'm not sure, in this case, yes, but if you
want some other processing that hooks straight into the tracepoints
instead of using a logging structure, it might be useful.

Removing that do_div() from there and exposing waittime as u64 in nsec,
for sure, that do_div() is just silly.
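
Something like the following would match that suggestion -- keep waittime
in the event but record it as raw nanoseconds, with no do_div() in the
fast path (just a sketch of the idea, not an applied patch):

TRACE_EVENT(lock_acquired,
	TP_PROTO(struct lockdep_map *lock, unsigned long ip, s64 waittime),

	TP_ARGS(lock, ip, waittime),

	TP_STRUCT__entry(
		__string(name, lock->name)
		__field(u64, wait_nsec)
	),

	TP_fast_assign(
		__assign_str(name, lock->name);
		__entry->wait_nsec = waittime;
	),

	TP_printk("%s (%llu ns)", __get_str(name),
		  (unsigned long long)__entry->wait_nsec)
);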

2009-12-17 10:09:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events


* Peter Zijlstra <[email protected]> wrote:

> > TP_STRUCT__entry(
> > + __field(struct lockdep_map *, lockdep_addr)
> > __field(unsigned int, flags)
> > __string(name, lock->name)
> > ),
>
> I feel a bit awkward explicitly leaking kernel pointers like that. All this
> is accessible by root only (for now) so it's not too harmful, but still.

What would you suggest as a 'natural lock class key'? The name? It might not
be unique enough.

Other kernel objects like tasks, cpus, inodes, pages all have some natural key
that isn't a kernel pointer - but locks are a bit special.

Ingo

2009-12-17 10:26:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

On Thu, 2009-12-17 at 11:09 +0100, Ingo Molnar wrote:
> * Peter Zijlstra <[email protected]> wrote:
>
> > > TP_STRUCT__entry(
> > > + __field(struct lockdep_map *, lockdep_addr)
> > > __field(unsigned int, flags)
> > > __string(name, lock->name)
> > > ),
> >
> > I feel a bit awkward explicitly leaking kernel pointers like that. All this
> > is accessible by root only (for now) so it's not too harmful, but still.
>
> What would you suggest as a 'natural lock class key'? The name? It might not
> be unique enough.
>
> Other kernel objects like tasks, cpus, inodes, pages all have some natural key
> that isn't a kernel pointer - but locks are a bit special.

Well, yeah, that's the problem, and we use the pointer for exactly this
purpose inside the kernel too, it's just that it's a blatant data leak
when we expose it to userspace like that.

On the other hand, adding some ID generation just so we can expose it
seems silly too.

Why do we need to have instance resolution?

2009-12-17 10:51:15

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events


* Peter Zijlstra <[email protected]> wrote:

> On Thu, 2009-12-17 at 11:09 +0100, Ingo Molnar wrote:
> > * Peter Zijlstra <[email protected]> wrote:
> >
> > > > TP_STRUCT__entry(
> > > > + __field(struct lockdep_map *, lockdep_addr)
> > > > __field(unsigned int, flags)
> > > > __string(name, lock->name)
> > > > ),
> > >
> > > I feel a bit awkward explicitly leaking kernel pointers like that. All this
> > > is accessible by root only (for now) so it's not too harmful, but still.
> >
> > What would you suggest as a 'natural lock class key'? The name? It might not
> > be unique enough.
> >
> > Other kernel objects like tasks, cpus, inodes, pages all have some natural key
> > that isn't a kernel pointer - but locks are a bit special.
>
> Well, yeah, that's the problem, and we use the pointer for exactly this
> purpose inside the kernel too, it's just that it's a blatant data leak when we
> expose it to userspace like that.
>
> On the other hand, adding some ID generation just so we can expose it seems
> silly too.
>
> Why do we need to have instance resolution?

Does the tool make use of it to classify stats?

Ingo

2009-12-26 13:43:46

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

Sorry for my slow response...

>> diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
>> index a870ba1..2f94e25 100644
>> --- a/include/trace/events/lock.h
>> +++ b/include/trace/events/lock.h
>> @@ -18,16 +18,19 @@ TRACE_EVENT(lock_acquire,
>> TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),
>>
>> TP_STRUCT__entry(
>> + __field(struct lockdep_map *, lockdep_addr)
>> __field(unsigned int, flags)
>> __string(name, lock->name)
>> ),
>
> I feel a bit awkward explicitly leaking kernel pointers like that. All
> this is accessible by root only (for now) so it's not too harmful, but
> still.
>
> Also, I don't think we want to expose the struct lockdep_map thing, a
> regular void * would be better.

Yeah, I agree with it. void * is enough.

>
> As to removing the waittime, I'm not sure, in this case, yes, but if you
> want some other processing that hooks straight into the tracepoints
> instead of using a logging structure, it might be useful.
>
> Removing that do_div() from there and exposing waittime as u64 in nsec,
> for sure, that do_div() is just silly.
>
>
>

I was being too egoistic; perf lock is not the only user of the lock events.

And I have a suggestion: adding the source file names and lines of
lock instances may be a good thing for human readability.
What do you think?

I have some pending work for this (not yet turned into patches).
If you agree, I'll send it later.

2009-12-28 10:02:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

On Sat, 2009-12-26 at 22:43 +0900, Hitoshi Mitake wrote:

> > As to removing the waittime, I'm not sure, in this case, yes, but if you
> > want some other processing that hooks straight into the tracepoints
> > instead of using a logging structure, it might be useful.
> >
> > Removing that do_div() from there and exposing waittime as u64 in nsec,
> > for sure, that do_div() is just silly.
> >
> >
> >
>
> I was being too egoistic; perf lock is not the only user of the lock events.
>
> And I have a suggestion: adding the source file names and lines of
> lock instances may be a good thing for human readability.
> What do you think?

file:line might be interesting indeed, but I worry about the size of the
event entry. But let's see how that goes.

> > Why do we need to have instance resolution?

You forgot to answer this question.

Is it purely because the waittime computation as done by lockstat is not
good enough for you -- should we not fix that instead? That would benefit
more people.

2009-12-31 13:24:43

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

On Mon, Dec 28, 2009 at 19:01, Peter Zijlstra <[email protected]> wrote:
> On Sat, 2009-12-26 at 22:43 +0900, Hitoshi Mitake wrote:
>
>> > As to removing the waittime, I'm not sure, in this case, yes, but if you
>> > want some other processing that hooks straight into the tracepoints
>> > instead of using a logging structure, it might be useful.
>> >
>> > Removing that do_div() from there and exposing waittime as u64 in nsec,
>> > for sure, that do_div() is just silly.
>> >
>> >
>> >
>>
>> I was being too egoistic; perf lock is not the only user of the lock events.
>>
>> And I have a suggestion: adding the source file names and lines of
>> lock instances may be a good thing for human readability.
>> What do you think?
>
> file:line might be interesting indeed, but I worry about the size of the
> event entry. But let's see how that goes.
>
>> > Why do we need to have instance resolution?
>
> You forgot to answer this question.
>
> Is it purely because the waittime computation as done by lockstat is not
> good enough for you -- should we not fix that instead? That would benefit
> more people.
>
>
>

Ah, sorry, let me answer the question.

Yes, I need instance resolution. The name alone is not enough,
because there are locks that use a name which is also used by other locks.
For example, at least 5 lock instances use the name "port_lock".

This is copied from the output of cscope.

*** drivers/infiniband/core/user_mad.c:
<global>[138] static DEFINE_SPINLOCK(port_lock);

*** drivers/net/ehea/ehea.h:
<global>[473] struct mutex port_lock;

*** drivers/pcmcia/i82092.c:
<global>[202] static DEFINE_SPINLOCK(port_lock);

*** drivers/pcmcia/pd6729.c:
<global>[69] static DEFINE_SPINLOCK(port_lock);

*** drivers/usb/gadget/u_serial.c:
<global>[94] spinlock_t port_lock;

The port_lock in ehea.h and u_serial.c is a member of a struct,
so the number of instances with this name is more than 5.

So we cannot distinguish between lock instances by their names.

And the waittime calculated by lockstat is valuable information,
but we can also use timestamps.
waittime is not equal to hold time, and every event has its timestamp.
I think exploiting the information from timestamps
(which are already provided) is worthwhile.

2010-01-07 10:39:01

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf lock: Fix output of tracing lock events

On December 31, 2009 at 22:24, Hitoshi Mitake wrote:
> On Mon, Dec 28, 2009 at 19:01, Peter Zijlstra<[email protected]> wrote:
>> On Sat, 2009-12-26 at 22:43 +0900, Hitoshi Mitake wrote:
>>
>>>> As to removing the waittime, I'm not sure, in this case, yes, but if you
>>>> want some other processing that hooks straight into the tracepoints
>>>> instead of using a logging structure, it might be useful.
>>>>
>>>> Removing that do_div() from there and exposing waittime as u64 in nsec,
>>>> for sure, that do_div() is just silly.
>>>>
>>>>
>>>>
>>>
>>> I was being too egoistic; perf lock is not the only user of the lock events.
>>>
>>> And I have a suggestion: adding the source file names and lines of
>>> lock instances may be a good thing for human readability.
>>> What do you think?
>>
>> file:line might be interesting indeed, but I worry about the size of the
>> event entry. But let's see how that goes.
>>
>>>> Why do we need to have instance resolution?
>>
>> You forgot to answer this question.
>>
>> Is it purely because the waittime computation as done by lockstat is not
>> good enough for you -- should we not fix that instead? That would benefit
>> more people.
>>
>>
>>
>
> Ah, sorry, let me answer the question.
>
> Yes, I need instance resolution. The name alone is not enough,
> because there are locks that use a name which is also used by other locks.
> For example, at least 5 lock instances use the name "port_lock".
>
> This is copied from the output of cscope.
>
> *** drivers/infiniband/core/user_mad.c:
> <global>[138] static DEFINE_SPINLOCK(port_lock);
>
> *** drivers/net/ehea/ehea.h:
> <global>[473] struct mutex port_lock;
>
> *** drivers/pcmcia/i82092.c:
> <global>[202] static DEFINE_SPINLOCK(port_lock);
>
> *** drivers/pcmcia/pd6729.c:
> <global>[69] static DEFINE_SPINLOCK(port_lock);
>
> *** drivers/usb/gadget/u_serial.c:
> <global>[94] spinlock_t port_lock;
>
> The port_lock in ehea.h and u_serial.c is a member of a struct,
> so the number of instances with this name is more than 5.
>
> So we cannot distinguish between lock instances by their names.
>
> And the waittime calculated by lockstat is valuable information,
> but we can also use timestamps.
> waittime is not equal to hold time, and every event has its timestamp.
> I think exploiting the information from timestamps
> (which are already provided) is worthwhile.
> --

Sorry, I think I misunderstood the intention of your question.
If you were talking about /proc/lock_stat, yes, it is very useful information.
But it provides all the information accumulated since system boot.

perf lock can provide lock statistics between two arbitrary points in time.
This is a valuable thing for analyzing the behavior of specific applications.

2010-01-07 10:40:08

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 2/5] lockdep: Add file and line to initialize sequence of rwsem

Like the other patches of this series, this patch adds
__FILE__ and __LINE__ to the lockdep_map of rw_semaphore.
This patch only affects the architecture-independent rwsem.
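
Note that the series relies on lockdep itself carrying the file and line;
that prerequisite change is not quoted in this thread, but judging from the
calls in the diff below it would look roughly like this (field layout and
helper names are inferred, not the actual patch):

struct lockdep_map {
	struct lock_class_key	*key;
	const char		*name;
	/* ... existing members unchanged ... */
	const char		*file;	/* __FILE__ of the init site */
	unsigned int		line;	/* __LINE__ of the init site */
};

extern void __lockdep_init_map(struct lockdep_map *lock, const char *name,
			       struct lock_class_key *key, int subclass,
			       const char *file, unsigned int line);

#define lockdep_init_map(lock, name, key, sub) \
	__lockdep_init_map(lock, name, key, sub, __FILE__, __LINE__)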

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
include/linux/rwsem-spinlock.h | 11 ++++++++---
lib/rwsem-spinlock.c | 5 +++--
lib/rwsem.c | 5 +++--
3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/rwsem-spinlock.h b/include/linux/rwsem-spinlock.h
index 6c3c0f6..64755c9 100644
--- a/include/linux/rwsem-spinlock.h
+++ b/include/linux/rwsem-spinlock.h
@@ -38,7 +38,11 @@ struct rw_semaphore {
};

#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __RWSEM_DEP_MAP_INIT(lockname) , .dep_map = { .name = #lockname }
+# define __RWSEM_DEP_MAP_INIT(lockname) , .dep_map = { \
+ .file = __FILE__, \
+ .line = __LINE__, \
+ .name = #lockname, \
+ }
#else
# define __RWSEM_DEP_MAP_INIT(lockname)
#endif
@@ -51,13 +55,14 @@ struct rw_semaphore {
struct rw_semaphore name = __RWSEM_INITIALIZER(name)

extern void __init_rwsem(struct rw_semaphore *sem, const char *name,
- struct lock_class_key *key);
+ struct lock_class_key *key,
+ const char *file, unsigned int line);

#define init_rwsem(sem) \
do { \
static struct lock_class_key __key; \
\
- __init_rwsem((sem), #sem, &__key); \
+ __init_rwsem((sem), #sem, &__key, __FILE__, __LINE__); \
} while (0)

extern void __down_read(struct rw_semaphore *sem);
diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
index 9df3ca5..ab17997 100644
--- a/lib/rwsem-spinlock.c
+++ b/lib/rwsem-spinlock.c
@@ -21,14 +21,15 @@ struct rwsem_waiter {
* initialise the semaphore
*/
void __init_rwsem(struct rw_semaphore *sem, const char *name,
- struct lock_class_key *key)
+ struct lock_class_key *key,
+ const char *file, unsigned int line)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/*
* Make sure we are not reinitializing a held semaphore:
*/
debug_check_no_locks_freed((void *)sem, sizeof(*sem));
- lockdep_init_map(&sem->dep_map, name, key, 0);
+ __lockdep_init_map(&sem->dep_map, name, key, 0, file, line);
#endif
sem->activity = 0;
spin_lock_init(&sem->wait_lock);
diff --git a/lib/rwsem.c b/lib/rwsem.c
index 3e3365e..3f8d5cd 100644
--- a/lib/rwsem.c
+++ b/lib/rwsem.c
@@ -12,14 +12,15 @@
* Initialize an rwsem:
*/
void __init_rwsem(struct rw_semaphore *sem, const char *name,
- struct lock_class_key *key)
+ struct lock_class_key *key,
+ const char *file, unsigned int line)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/*
* Make sure we are not reinitializing a held semaphore:
*/
debug_check_no_locks_freed((void *)sem, sizeof(*sem));
- lockdep_init_map(&sem->dep_map, name, key, 0);
+ __lockdep_init_map(&sem->dep_map, name, key, 0, file, line);
#endif
sem->count = RWSEM_UNLOCKED_VALUE;
spin_lock_init(&sem->wait_lock);
--
1.6.5.2

2010-01-07 10:40:16

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 5/5] lockdep: Fix the way to initialize class_mutex for information of file and line

This patch adds __FILE__ and __LINE__ to the arguments passed to
__mutex_init() called in __class_register().
This affects the lockdep_map of the class_mutex of struct class_private,
and they will be used to make the lock trace events more human friendly.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
---
drivers/base/class.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/base/class.c b/drivers/base/class.c
index 161746d..cc0630f 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -162,7 +162,8 @@ int __class_register(struct class *cls, struct lock_class_key *key)
klist_init(&cp->class_devices, klist_class_dev_get, klist_class_dev_put);
INIT_LIST_HEAD(&cp->class_interfaces);
kset_init(&cp->class_dirs);
- __mutex_init(&cp->class_mutex, "struct class mutex", key);
+ __mutex_init(&cp->class_mutex, "struct class mutex", key,
+ __FILE__, __LINE__);
error = kobject_set_name(&cp->class_subsys.kobj, "%s", cls->name);
if (error) {
kfree(cp);
--
1.6.5.2

2010-01-07 10:40:42

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 4/5] lockdep: Add file and line to initialize sequence of mutex

Like the other patches of this series, this patch adds
__FILE__ and __LINE__ to the lockdep_map of mutex.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
include/linux/mutex-debug.h | 2 +-
include/linux/mutex.h | 12 +++++++++---
kernel/mutex-debug.c | 5 +++--
kernel/mutex-debug.h | 3 ++-
kernel/mutex.c | 5 +++--
kernel/mutex.h | 2 +-
6 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/mutex-debug.h b/include/linux/mutex-debug.h
index 731d77d..f86bf4e 100644
--- a/include/linux/mutex-debug.h
+++ b/include/linux/mutex-debug.h
@@ -15,7 +15,7 @@
do { \
static struct lock_class_key __key; \
\
- __mutex_init((mutex), #mutex, &__key); \
+ __mutex_init((mutex), #mutex, &__key, __FILE__, __LINE__); \
} while (0)

extern void mutex_destroy(struct mutex *lock);
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 878cab4..ce9082a 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -82,14 +82,19 @@ struct mutex_waiter {
do { \
static struct lock_class_key __key; \
\
- __mutex_init((mutex), #mutex, &__key); \
+ __mutex_init((mutex), #mutex, &__key, \
+ __FILE__, __LINE__); \
} while (0)
# define mutex_destroy(mutex) do { } while (0)
#endif

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
- , .dep_map = { .name = #lockname }
+ , .dep_map = { \
+ .file = __FILE__, \
+ .line = __LINE__, \
+ .name = #lockname, \
+ }
#else
# define __DEP_MAP_MUTEX_INITIALIZER(lockname)
#endif
@@ -105,7 +110,8 @@ do { \
struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)

extern void __mutex_init(struct mutex *lock, const char *name,
- struct lock_class_key *key);
+ struct lock_class_key *key,
+ const char *file, unsigned int line);

/**
* mutex_is_locked - is the mutex locked
diff --git a/kernel/mutex-debug.c b/kernel/mutex-debug.c
index ec815a9..4a1321b 100644
--- a/kernel/mutex-debug.c
+++ b/kernel/mutex-debug.c
@@ -81,14 +81,15 @@ void debug_mutex_unlock(struct mutex *lock)
}

void debug_mutex_init(struct mutex *lock, const char *name,
- struct lock_class_key *key)
+ struct lock_class_key *key,
+ const char *file, unsigned int line)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/*
* Make sure we are not reinitializing a held lock:
*/
debug_check_no_locks_freed((void *)lock, sizeof(*lock));
- lockdep_init_map(&lock->dep_map, name, key, 0);
+ __lockdep_init_map(&lock->dep_map, name, key, 0, file, line);
#endif
lock->magic = lock;
}
diff --git a/kernel/mutex-debug.h b/kernel/mutex-debug.h
index 6b2d735..802a2d7 100644
--- a/kernel/mutex-debug.h
+++ b/kernel/mutex-debug.h
@@ -25,7 +25,8 @@ extern void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
struct thread_info *ti);
extern void debug_mutex_unlock(struct mutex *lock);
extern void debug_mutex_init(struct mutex *lock, const char *name,
- struct lock_class_key *key);
+ struct lock_class_key *key,
+ const char *file, unsigned int line);

static inline void mutex_set_owner(struct mutex *lock)
{
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 947b3ad..7c49bab 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -46,14 +46,15 @@
* It is not allowed to initialize an already locked mutex.
*/
void
-__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
+__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key,
+ const char *file, unsigned int line)
{
atomic_set(&lock->count, 1);
spin_lock_init(&lock->wait_lock);
INIT_LIST_HEAD(&lock->wait_list);
mutex_clear_owner(lock);

- debug_mutex_init(lock, name, key);
+ debug_mutex_init(lock, name, key, file, line);
}

EXPORT_SYMBOL(__mutex_init);
diff --git a/kernel/mutex.h b/kernel/mutex.h
index 67578ca..81914e6 100644
--- a/kernel/mutex.h
+++ b/kernel/mutex.h
@@ -40,7 +40,7 @@ static inline void mutex_clear_owner(struct mutex *lock)
#define debug_mutex_free_waiter(waiter) do { } while (0)
#define debug_mutex_add_waiter(lock, waiter, ti) do { } while (0)
#define debug_mutex_unlock(lock) do { } while (0)
-#define debug_mutex_init(lock, name, key) do { } while (0)
+#define debug_mutex_init(lock, name, key, file, line) do { } while (0)

static inline void
debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
--
1.6.5.2

2010-01-07 10:40:07

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 0/5] lockdep: Add information of file and line to lockdep_map

There are a lot of lock instances with the same name (e.g. port_lock).
This patch series adds __FILE__ and __LINE__ to lockdep_map,
and these will be used in the lock trace events.

Example use from perf lock map:

| 0xffffea0004c992b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
| 0xffffea0004b112b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
| 0xffffea0004a3f2b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
| 0xffffea0004cd5228: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
| 0xffff8800b91e2b28: &sb->s_type->i_lock_key (src: fs/inode.c, line: 166)
| 0xffff8800bb9d7ae0: key (src: kernel/wait.c, line: 16)
| 0xffff8800aa07dae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
| 0xffff8800b07fbae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
| 0xffff8800b07f3ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
| 0xffff8800bf15fae0: &sighand->siglock (src: kernel/fork.c, line: 1490)
| 0xffff8800b90f7ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
| ...

(This output of perf lock map is produced by my local version,
I'll send this later.)

And sadly, as Peter Zijlstra predicted, this produces a certain amount of overhead.

Before applying this series:
| % sudo ./perf lock rec perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 3.834 [sec]
After:
| % sudo ./perf lock rec perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 5.415 [sec]
| [ perf record: Woken up 0 times to write data ]
| [ perf record: Captured and wrote 53.512 MB perf.data (~2337993 samples) ]

But a raw run of perf bench sched messaging is this:
| % perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 0.498 [sec]

Tracing lock events already produces a large amount of overhead.
I think the overhead produced by this series is not a fatal problem,
but radical optimization is required...

Could you merge this into perf/lock branch, Ingo?

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>

Hitoshi Mitake (5):
lockdep: Add file and line to initialize sequence of spin and rw lock
lockdep: Add file and line to initialize sequence of rwsem
lockdep: Add file and line to initialize sequence of rwsem
lockdep: Add file and line to initialize sequence of mutex
lockdep: Fix the way to initialize class_mutex for information of
file and line

arch/x86/include/asm/rwsem.h | 9 +++++++--
drivers/base/class.c | 3 ++-
include/linux/mutex-debug.h | 2 +-
include/linux/mutex.h | 12 +++++++++---
include/linux/rwsem-spinlock.h | 11 ++++++++---
include/linux/spinlock.h | 12 ++++++++----
include/linux/spinlock_types.h | 12 ++++++++++--
kernel/mutex-debug.c | 5 +++--
kernel/mutex-debug.h | 3 ++-
kernel/mutex.c | 5 +++--
kernel/mutex.h | 2 +-
lib/rwsem-spinlock.c | 5 +++--
lib/rwsem.c | 5 +++--
lib/spinlock_debug.c | 10 ++++++----
14 files changed, 66 insertions(+), 30 deletions(-)

2010-01-07 10:40:55

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 3/5] lockdep: Add file and line to initialize sequence of rwsem

This patch adds __FILE__ and __LINE__ to the lockdep_map of rw_semaphore.
This patch only affects the x86 architecture-dependent rwsem.

If this patch is accepted, I'll prepare and send the patch that
adds __FILE__ and __LINE__ to the rw_semaphore of the other architectures
which provide an architecture-dependent rwsem.h.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
arch/x86/include/asm/rwsem.h | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index ca7517d..24cbe3a 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -74,7 +74,12 @@ struct rw_semaphore {
};

#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __RWSEM_DEP_MAP_INIT(lockname) , .dep_map = { .name = #lockname }
+# define __RWSEM_DEP_MAP_INIT(lockname) \
+ , .dep_map = { \
+ .file = __FILE__, \
+ .line = __LINE__, \
+ .name = #lockname, \
+ }
#else
# define __RWSEM_DEP_MAP_INIT(lockname)
#endif
@@ -96,7 +101,7 @@ extern void __init_rwsem(struct rw_semaphore *sem, const char *name,
do { \
static struct lock_class_key __key; \
\
- __init_rwsem((sem), #sem, &__key); \
+ __init_rwsem((sem), #sem, &__key, __FILE__, __LINE__); \
} while (0)

/*
--
1.6.5.2

2010-01-07 10:41:16

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH 1/5] lockdep: Add file and line to initialize sequence of spin and rw lock

Each of spinlock and rwlock has two ways to be initialized:
1: the macros DEFINE_{SPIN,RW}LOCK for statically defined locks
2: the functions {spin_,rw}lock_init() for locks on dynamically allocated memory

This patch modifies these two initialization sequences
to add __FILE__ and __LINE__ to the lockdep_map.

Signed-off-by: Hitoshi Mitake <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
---
include/linux/spinlock.h | 12 ++++++++----
include/linux/spinlock_types.h | 12 ++++++++++--
lib/spinlock_debug.c | 10 ++++++----
3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index f0ca7a7..cf526d8 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -92,12 +92,14 @@ extern int __lockfunc generic__raw_read_trylock(raw_rwlock_t *lock);

#ifdef CONFIG_DEBUG_SPINLOCK
extern void __spin_lock_init(spinlock_t *lock, const char *name,
- struct lock_class_key *key);
+ struct lock_class_key *key,
+ const char *file, unsigned int line);
# define spin_lock_init(lock) \
do { \
static struct lock_class_key __key; \
\
- __spin_lock_init((lock), #lock, &__key); \
+ __spin_lock_init((lock), #lock, &__key, \
+ __FILE__, __LINE__); \
} while (0)

#else
@@ -107,12 +109,14 @@ do { \

#ifdef CONFIG_DEBUG_SPINLOCK
extern void __rwlock_init(rwlock_t *lock, const char *name,
- struct lock_class_key *key);
+ struct lock_class_key *key,
+ const char *file, unsigned int line);
# define rwlock_init(lock) \
do { \
static struct lock_class_key __key; \
\
- __rwlock_init((lock), #lock, &__key); \
+ __rwlock_init((lock), #lock, &__key, \
+ __FILE__, __LINE__); \
} while (0)
#else
# define rwlock_init(lock) \
diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index 68d88f7..13c04a0 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -52,13 +52,21 @@ typedef struct {
#define SPINLOCK_OWNER_INIT ((void *)-1L)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define SPIN_DEP_MAP_INIT(lockname) .dep_map = { .name = #lockname }
+# define SPIN_DEP_MAP_INIT(lockname) .dep_map = { \
+ .file = __FILE__, \
+ .line = __LINE__, \
+ .name = #lockname, \
+ }
#else
# define SPIN_DEP_MAP_INIT(lockname)
#endif

#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define RW_DEP_MAP_INIT(lockname) .dep_map = { .name = #lockname }
+# define RW_DEP_MAP_INIT(lockname) .dep_map = { \
+ .file = __FILE__, \
+ .line = __LINE__, \
+ .name = #lockname, \
+ }
#else
# define RW_DEP_MAP_INIT(lockname)
#endif
diff --git a/lib/spinlock_debug.c b/lib/spinlock_debug.c
index 9c4b025..fdab55d 100644
--- a/lib/spinlock_debug.c
+++ b/lib/spinlock_debug.c
@@ -14,14 +14,15 @@
#include <linux/module.h>

void __spin_lock_init(spinlock_t *lock, const char *name,
- struct lock_class_key *key)
+ struct lock_class_key *key,
+ const char *file, unsigned int line)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/*
* Make sure we are not reinitializing a held lock:
*/
debug_check_no_locks_freed((void *)lock, sizeof(*lock));
- lockdep_init_map(&lock->dep_map, name, key, 0);
+ __lockdep_init_map(&lock->dep_map, name, key, 0, file, line);
#endif
lock->raw_lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
lock->magic = SPINLOCK_MAGIC;
@@ -32,14 +33,15 @@ void __spin_lock_init(spinlock_t *lock, const char *name,
EXPORT_SYMBOL(__spin_lock_init);

void __rwlock_init(rwlock_t *lock, const char *name,
- struct lock_class_key *key)
+ struct lock_class_key *key,
+ const char *file, unsigned int line)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/*
* Make sure we are not reinitializing a held lock:
*/
debug_check_no_locks_freed((void *)lock, sizeof(*lock));
- lockdep_init_map(&lock->dep_map, name, key, 0);
+ __lockdep_init_map(&lock->dep_map, name, key, 0, file, line);
#endif
lock->raw_lock = (raw_rwlock_t) __RAW_RW_LOCK_UNLOCKED;
lock->magic = RWLOCK_MAGIC;
--
1.6.5.2

2010-01-13 09:52:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/5] lockdep: Add information of file and line to lockdep_map

On Thu, 2010-01-07 at 19:39 +0900, Hitoshi Mitake wrote:
> There are a lot of lock instances with the same name (e.g. port_lock).
> This patch series adds __FILE__ and __LINE__ to lockdep_map,
> and these will be used in the lock trace events.
>
> Example use from perf lock map:
>
> | 0xffffea0004c992b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> | 0xffffea0004b112b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> | 0xffffea0004a3f2b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> | 0xffffea0004cd5228: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> | 0xffff8800b91e2b28: &sb->s_type->i_lock_key (src: fs/inode.c, line: 166)
> | 0xffff8800bb9d7ae0: key (src: kernel/wait.c, line: 16)
> | 0xffff8800aa07dae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> | 0xffff8800b07fbae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> | 0xffff8800b07f3ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> | 0xffff8800bf15fae0: &sighand->siglock (src: kernel/fork.c, line: 1490)
> | 0xffff8800b90f7ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> | ...
>
> (This output of perf lock map is produced by my local version,
> I'll send this later.)
>
> And sadly, as Peter Zijlstra predicted, this produces a certain amount of overhead.
>
> Before applying this series:
> | % sudo ./perf lock rec perf bench sched messaging
> | # Running sched/messaging benchmark...
> | # 20 sender and receiver processes per group
> | # 10 groups == 400 processes run
> |
> | Total time: 3.834 [sec]
> After:
> | % sudo ./perf lock rec perf bench sched messaging
> | # Running sched/messaging benchmark...
> | # 20 sender and receiver processes per group
> | # 10 groups == 400 processes run
> |
> | Total time: 5.415 [sec]
> | [ perf record: Woken up 0 times to write data ]
> | [ perf record: Captured and wrote 53.512 MB perf.data (~2337993 samples) ]
>
> But a raw run of perf bench sched messaging is this:
> | % perf bench sched messaging
> | # Running sched/messaging benchmark...
> | # 20 sender and receiver processes per group
> | # 10 groups == 400 processes run
> |
> | Total time: 0.498 [sec]
>
> Tracing lock events already produces a large amount of overhead.
> I think the overhead produced by this series is not a fatal problem,
> but radical optimization is required...

Right, these patches look OK. For the tracing overhead, you could
possibly hash the file:line into a u64 and reduce the tracepoint size;
that should improve the situation I think, because I seem to remember
the only thing that really matters for speed is the size of things.
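
A rough sketch of that idea -- fold file:line into a single u64 at
lock-initialization time so the tracepoint only carries 8 bytes (the helper
name and the hash are illustrative, not an existing kernel API):

static u64 lock_srcid(const char *file, unsigned int line)
{
	u64 hash = 5381;			/* djb2-style string hash */

	while (*file)
		hash = hash * 33 + (unsigned char)*file++;

	return (hash << 16) ^ line;		/* mix the line into the low bits */
}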

2010-01-13 10:02:33

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/5] lockdep: Fix the way to initialize class_mutex for information of file and line


* Hitoshi Mitake <[email protected]> wrote:

> This patch adds __FILE__ and __LINE__ to the arguments passed to
> __mutex_init() called in __class_register().
> This affects the lockdep_map of the class_mutex of struct class_private,
> and they will be used to make the lock trace events more human friendly.
>
> Signed-off-by: Hitoshi Mitake <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/base/class.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/base/class.c b/drivers/base/class.c
> index 161746d..cc0630f 100644
> --- a/drivers/base/class.c
> +++ b/drivers/base/class.c
> @@ -162,7 +162,8 @@ int __class_register(struct class *cls, struct lock_class_key *key)
> klist_init(&cp->class_devices, klist_class_dev_get, klist_class_dev_put);
> INIT_LIST_HEAD(&cp->class_interfaces);
> kset_init(&cp->class_dirs);
> - __mutex_init(&cp->class_mutex, "struct class mutex", key);
> + __mutex_init(&cp->class_mutex, "struct class mutex", key,
> + __FILE__, __LINE__);
> error = kobject_set_name(&cp->class_subsys.kobj, "%s", cls->name);
> if (error) {
> kfree(cp);

Greg, any objection to this type of embellishment of lock data that comes from
the driver core?

Thanks,

Ingo

2010-01-13 10:09:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] lockdep: Add information of file and line to lockdep_map


* Peter Zijlstra <[email protected]> wrote:

> On Thu, 2010-01-07 at 19:39 +0900, Hitoshi Mitake wrote:
> > There are a lot of lock instances with the same name (e.g. port_lock).
> > This patch series adds __FILE__ and __LINE__ to lockdep_map,
> > and these will be used in the lock trace events.
> >
> > Example use from perf lock map:
> >
> > | 0xffffea0004c992b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> > | 0xffffea0004b112b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> > | 0xffffea0004a3f2b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> > | 0xffffea0004cd5228: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
> > | 0xffff8800b91e2b28: &sb->s_type->i_lock_key (src: fs/inode.c, line: 166)
> > | 0xffff8800bb9d7ae0: key (src: kernel/wait.c, line: 16)
> > | 0xffff8800aa07dae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> > | 0xffff8800b07fbae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> > | 0xffff8800b07f3ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> > | 0xffff8800bf15fae0: &sighand->siglock (src: kernel/fork.c, line: 1490)
> > | 0xffff8800b90f7ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
> > | ...
> >
> > (This output of perf lock map is produced by my local version,
> > I'll send this later.)
> >
> > And sadly, as Peter Zijlstra predicted, this produces a certain amount of overhead.
> >
> > Before applying this series:
> > | % sudo ./perf lock rec perf bench sched messaging
> > | # Running sched/messaging benchmark...
> > | # 20 sender and receiver processes per group
> > | # 10 groups == 400 processes run
> > |
> > | Total time: 3.834 [sec]
> > After:
> > | % sudo ./perf lock rec perf bench sched messaging
> > | # Running sched/messaging benchmark...
> > | # 20 sender and receiver processes per group
> > | # 10 groups == 400 processes run
> > |
> > | Total time: 5.415 [sec]
> > | [ perf record: Woken up 0 times to write data ]
> > | [ perf record: Captured and wrote 53.512 MB perf.data (~2337993 samples) ]
> >
> > But a raw run of perf bench sched messaging is this:
> > | % perf bench sched messaging
> > | # Running sched/messaging benchmark...
> > | # 20 sender and receiver processes per group
> > | # 10 groups == 400 processes run
> > |
> > | Total time: 0.498 [sec]
> >
> > Tracing lock events already produces a large amount of overhead.
> > I think the overhead produced by this series is not a fatal problem,
> > but radical optimization is required...
>
> Right, these patches look OK. For the tracing overhead, you could possibly
> hash the file:line into a u64 and reduce the tracepoint size; that should
> improve the situation I think, because I seem to remember the only thing
> that really matters for speed is the size of things.

ok, great. I looked into merging these bits into perf/lock and perf/lock into
tip:master - but the recent upstream raw-spinlock changes interact with the
new patches.

I also merged the latest perf into perf/lock and there are some new build failures:

builtin-lock.c:14:27: error: util/data_map.h: No such file or directory
cc1: warnings being treated as errors
builtin-lock.c: In function 'process_sample_event':
builtin-lock.c:279: error: implicit declaration of function 'threads__findnew'
builtin-lock.c:279: error: nested extern declaration of 'threads__findnew'
builtin-lock.c:279: error: assignment makes pointer from integer without a cast
builtin-lock.c: At top level:
builtin-lock.c:357: error: variable 'file_handler' has initializer but incomplete type
builtin-lock.c:358: error: unknown field 'process_sample_event' specified in initializer
builtin-lock.c:358: error: excess elements in struct initializer
builtin-lock.c:358: error: (near initialization for 'file_handler')
builtin-lock.c:359: error: unknown field 'sample_type_check' specified in initializer
builtin-lock.c:359: error: excess elements in struct initializer
builtin-lock.c:359: error: (near initialization for 'file_handler')
builtin-lock.c: In function 'read_events':
builtin-lock.c:364: error: implicit declaration of function 'register_idle_thread'
builtin-lock.c:364: error: nested extern declaration of 'register_idle_thread'
builtin-lock.c:365: error: implicit declaration of function 'register_perf_file_handler'
builtin-lock.c:365: error: nested extern declaration of 'register_perf_file_handler'
builtin-lock.c:367: error: implicit declaration of function 'mmap_dispatch_perf_file'
builtin-lock.c:367: error: nested extern declaration of 'mmap_dispatch_perf_file'
builtin-lock.c:368: error: 'event__cwdlen' undeclared (first use in this function)
builtin-lock.c:368: error: (Each undeclared identifier is reported only once
builtin-lock.c:368: error: for each function it appears in.)
builtin-lock.c:368: error: 'event__cwd' undeclared (first use in this function)
builtin-lock.c: In function 'cmd_lock':
builtin-lock.c:429: error: too many arguments to function 'symbol__init'
make: *** [builtin-lock.o] Error 1
make: *** Waiting for unfinished jobs....

Once those are resolved and we have the merged-in patches, we can graduate this
topic into tip:master.

Thanks,

Ingo

2010-01-13 23:20:20

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5/5] lockdep: Fix the way to initialize class_mutex for information of file and line

On Wed, Jan 13, 2010 at 11:00:22AM +0100, Ingo Molnar wrote:
>
> * Hitoshi Mitake <[email protected]> wrote:
>
> > This patch adds __FILE__ and __LINE__ to the arguments passed to
> > __mutex_init() called in __class_register().
> > This affects the lockdep_map of the class_mutex of struct class_private,
> > and they will be used to make the lock trace events more human friendly.
> >
> > Signed-off-by: Hitoshi Mitake <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Paul Mackerras <[email protected]>
> > Cc: Frederic Weisbecker <[email protected]>
> > Cc: Greg Kroah-Hartman <[email protected]>
> > ---
> > drivers/base/class.c | 3 ++-
> > 1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/base/class.c b/drivers/base/class.c
> > index 161746d..cc0630f 100644
> > --- a/drivers/base/class.c
> > +++ b/drivers/base/class.c
> > @@ -162,7 +162,8 @@ int __class_register(struct class *cls, struct lock_class_key *key)
> > klist_init(&cp->class_devices, klist_class_dev_get, klist_class_dev_put);
> > INIT_LIST_HEAD(&cp->class_interfaces);
> > kset_init(&cp->class_dirs);
> > - __mutex_init(&cp->class_mutex, "struct class mutex", key);
> > + __mutex_init(&cp->class_mutex, "struct class mutex", key,
> > + __FILE__, __LINE__);
> > error = kobject_set_name(&cp->class_subsys.kobj, "%s", cls->name);
> > if (error) {
> > kfree(cp);
>
> Greg, any objection to this type of embellishment of lock data that comes from
> the driver core?

No, I'll queue it up in my tree.

thanks,

greg k-h

2010-01-13 23:20:39

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5/5] lockdep: Fix the way to initialize class_mutex for information of file and line

On Wed, Jan 13, 2010 at 03:17:55PM -0800, Greg KH wrote:
> On Wed, Jan 13, 2010 at 11:00:22AM +0100, Ingo Molnar wrote:
> >
> > * Hitoshi Mitake <[email protected]> wrote:
> >
> > > This patch adds __FILE__ and __LINE__ to the arguments passed to
> > > __mutex_init() called in __class_register().
> > > This affects the lockdep_map of the class_mutex of struct class_private,
> > > and they will be used to make the lock trace events more human friendly.
> > >
> > > Signed-off-by: Hitoshi Mitake <[email protected]>
> > > Cc: Peter Zijlstra <[email protected]>
> > > Cc: Paul Mackerras <[email protected]>
> > > Cc: Frederic Weisbecker <[email protected]>
> > > Cc: Greg Kroah-Hartman <[email protected]>
> > > ---
> > > drivers/base/class.c | 3 ++-
> > > 1 files changed, 2 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/drivers/base/class.c b/drivers/base/class.c
> > > index 161746d..cc0630f 100644
> > > --- a/drivers/base/class.c
> > > +++ b/drivers/base/class.c
> > > @@ -162,7 +162,8 @@ int __class_register(struct class *cls, struct lock_class_key *key)
> > > klist_init(&cp->class_devices, klist_class_dev_get, klist_class_dev_put);
> > > INIT_LIST_HEAD(&cp->class_interfaces);
> > > kset_init(&cp->class_dirs);
> > > - __mutex_init(&cp->class_mutex, "struct class mutex", key);
> > > + __mutex_init(&cp->class_mutex, "struct class mutex", key,
> > > + __FILE__, __LINE__);
> > > error = kobject_set_name(&cp->class_subsys.kobj, "%s", cls->name);
> > > if (error) {
> > > kfree(cp);
> >
> > Greg, any objection to this type of embellishment of lock data that comes from
> > the driver core?
>
> No, I'll queue it up in my tree.

Er, oops, sorry, that would go through your tree, not mine :)

Feel free to add:
Acked-by: Greg Kroah-Hartman <[email protected]>

In looking at it, this might solve the lockdep issue with the device
tree as well, I'll have to play with it once this makes it into Linus's
tree.

thanks,

greg k-h

2010-01-18 07:20:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/5] lockdep: Add information of file and line to lockdep_map

On Sat, 2010-01-16 at 22:01 +0900, Hitoshi Mitake wrote:
>
> And I want lockdep_map to have another thing: the type of the lock.
> For example, mutex and spinlock have completely different acquisition
> times and attributes,
> so I want to separate these things. If lockdep_map has a member
> to express the type, things will be easy.

I'm worried about growing things too much, esp. when the gain is
dubious. People will see the type soon enough when they go look at the
code.

2010-01-26 05:56:59

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH] Add type of locks to lock trace events

# Sorry, I wrote a wrong Cc address and the previous mail was rejected by the mailer-daemon.
# This is the second sending; if you already received this, please discard it...

There's no need to add any member to lockdep_map
in order to add lock type information to the lock trace events.

Example of perf trace:
| init-0 [001] 335.078670: lock_acquired: 0xffff8800059d6bd8 &rq->lock kernel/lockdep.c:2973 (0 ns)
| rb_consumer-424 [001] 335.078673: lock_acquire: 0xffff8800059d6bd8 1 &rq->lock kernel/lockdep.c:2973
| # ^ &rq->lock is spin lock!
| rb_consumer-424 [001] 335.078677: lock_acquire: 0xffff8800bba5e8e8 1 buffer->reader_lock_key kernel/trace/ring_
| rb_consumer-424 [001] 335.078679: lock_acquired: 0xffff8800bba5e8e8 buffer->reader_lock_key kernel/trace/ring_b
| rb_consumer-424 [001] 335.078684: lock_acquire: 0xffff8800059d12e8 1 &q->lock kernel/smp.c:83

Of course, as you said, the type of the lock being dealt with is clear to a human.
But it is not clear to programs like perf lock.

What I want to do is limit the types of locks to focus on,
e.g. perf lock prof --type spin,rwlock

What do you think, Peter?
(This is a test edition, don't apply it!)
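
For what it's worth, the perf-lock side of that could be as simple as
mapping the comma-separated names onto the LOCK_TYPE_* values added below
and keeping a bitmask of the types to report (a hypothetical userspace
sketch; the helper names are invented):

#include <string.h>

static unsigned int type_filter;	/* bit n set => report locks of type n */

static int parse_type_filter(char *arg)
{
	static const char * const names[] = {
		"other", "spin", "rwlock", "rwsem", "mutex",
	};
	char *tok;
	unsigned int i;

	for (tok = strtok(arg, ","); tok; tok = strtok(NULL, ","))
		for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
			if (!strcmp(tok, names[i]))
				type_filter |= 1U << i;

	return type_filter ? 0 : -1;
}

/* later, when processing one lock event: */
static int type_is_selected(unsigned int type)
{
	return !type_filter || (type_filter & (1U << type));
}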

---
include/linux/lockdep.h | 67 ++++++++++++++++++++++++++++++-------------
include/linux/rcupdate.h | 2 +-
include/trace/events/lock.h | 12 +++++--
kernel/lockdep.c | 6 ++-
4 files changed, 60 insertions(+), 27 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index a631afa..e267823 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -19,6 +19,14 @@ struct lockdep_map;
#include <linux/debug_locks.h>
#include <linux/stacktrace.h>

+enum {
+ LOCK_TYPE_OTHER = 0,
+ LOCK_TYPE_SPIN,
+ LOCK_TYPE_RWLOCK,
+ LOCK_TYPE_RWSEM,
+ LOCK_TYPE_MUTEX,
+};
+
/*
* We'd rather not expose kernel/lockdep_states.h this wide, but we do need
* the total number of states... :-(
@@ -306,7 +314,8 @@ static inline int lockdep_match_key(struct lockdep_map *lock,
*/
extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
- struct lockdep_map *nest_lock, unsigned long ip);
+ struct lockdep_map *nest_lock, unsigned long ip,
+ int type);

extern void lock_release(struct lockdep_map *lock, int nested,
unsigned long ip);
@@ -345,7 +354,7 @@ static inline void lockdep_on(void)
{
}

-# define lock_acquire(l, s, t, r, c, n, i) do { } while (0)
+# define lock_acquire(l, s, t, r, c, n, i, ty) do { } while (0)
# define lock_release(l, n, i) do { } while (0)
# define lock_set_class(l, n, k, s, i) do { } while (0)
# define lock_set_subclass(l, s, i) do { } while (0)
@@ -462,11 +471,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 2, n, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_SPIN)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, 0, 2, n, i, LOCK_TYPE_SPIN)
# else
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_SPIN)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_SPIN)
# endif
# define spin_release(l, n, i) lock_release(l, n, i)
#else
@@ -476,11 +489,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 2, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_RWLOCK)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 2, 2, NULL, i, LOCK_TYPE_RWLOCK)
# else
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 1, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_RWLOCK)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 2, 1, NULL, i, LOCK_TYPE_RWLOCK)
# endif
# define rwlock_release(l, n, i) lock_release(l, n, i)
#else
@@ -491,9 +508,11 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_MUTEX)
# else
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_MUTEX)
# endif
# define mutex_release(l, n, i) lock_release(l, n, i)
#else
@@ -503,11 +522,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 2, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_RWSEM)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 1, 2, NULL, i, LOCK_TYPE_RWSEM)
# else
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 1, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_RWSEM)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 1, 1, NULL, i, LOCK_TYPE_RWSEM)
# endif
# define rwsem_release(l, n, i) lock_release(l, n, i)
#else
@@ -518,9 +541,11 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
# else
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
# endif
# define lock_map_release(l) lock_release(l, 1, _THIS_IP_)
#else
@@ -532,13 +557,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
# define might_lock(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, \
+ NULL, _THIS_IP_, LOCK_TYPE_OTHER); \
lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
} while (0)
# define might_lock_read(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, \
+ NULL, _THIS_IP_, LOCK_TYPE_OTHER); \
lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
} while (0)
#else
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 24440f4..9c5d1b9 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -80,7 +80,7 @@ extern void rcu_init(void);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
extern struct lockdep_map rcu_lock_map;
# define rcu_read_acquire() \
- lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
+ lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_)
#else
# define rcu_read_acquire() do { } while (0)
diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index 3eef226..89107de 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -12,10 +12,11 @@
TRACE_EVENT(lock_acquire,

TP_PROTO(struct lockdep_map *lock, unsigned int subclass,
- int trylock, int read, int check,
- struct lockdep_map *next_lock, unsigned long ip),
+ int trylock, int read, int check,
+ struct lockdep_map *next_lock, unsigned long ip,
+ int type),

- TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),
+ TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip, type),

TP_STRUCT__entry(
__field(unsigned int, flags)
@@ -23,6 +24,7 @@ TRACE_EVENT(lock_acquire,
__field(void *, lockdep_addr)
__string(file, lock->file)
__field(unsigned int, line)
+ __field(int, type)
),

TP_fast_assign(
@@ -31,9 +33,11 @@ TRACE_EVENT(lock_acquire,
__entry->lockdep_addr = lock;
__assign_str(file, lock->file);
__entry->line = lock->line;
+ __entry->type = type;
),

- TP_printk("%p %s%s%s %s:%u", __entry->lockdep_addr,
+ TP_printk("%p %d %s%s%s %s:%u", __entry->lockdep_addr,
+ __entry->type,
(__entry->flags & 1) ? "try " : "",
(__entry->flags & 2) ? "read " : "",
__get_str(name), __get_str(file), __entry->line)
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index f0f6dfd..12e5d73 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3211,11 +3211,13 @@ EXPORT_SYMBOL_GPL(lock_set_class);
*/
void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
- struct lockdep_map *nest_lock, unsigned long ip)
+ struct lockdep_map *nest_lock, unsigned long ip,
+ int type)
{
unsigned long flags;

- trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
+ trace_lock_acquire(lock, subclass, trylock,
+ read, check, nest_lock, ip, type);

if (unlikely(current->lockdep_recursion))
return;
--
1.6.5.2

2010-01-28 10:22:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Add type of locks to lock trace events

On Tue, 2010-01-26 at 14:56 +0900, Hitoshi Mitake wrote:
> # Sorry, I wrote wrong Cc address. Previous mail was rejected by mailer-daemon.
> # This is second time sending, if you already received this, please discard it...
>
> There's no need to add any member to lockdep_map
> in order to add lock type information to the lock trace events.
>
> Example of perf trace:
> | init-0 [001] 335.078670: lock_acquired: 0xffff8800059d6bd8 &rq->lock kernel/lockdep.c:2973 (0 ns)
> | rb_consumer-424 [001] 335.078673: lock_acquire: 0xffff8800059d6bd8 1 &rq->lock kernel/lockdep.c:2973
> | # ^ &rq->lock is spin lock!
> | rb_consumer-424 [001] 335.078677: lock_acquire: 0xffff8800bba5e8e8 1 buffer->reader_lock_key kernel/trace/ring_
> | rb_consumer-424 [001] 335.078679: lock_acquired: 0xffff8800bba5e8e8 buffer->reader_lock_key kernel/trace/ring_b
> | rb_consumer-424 [001] 335.078684: lock_acquire: 0xffff8800059d12e8 1 &q->lock kernel/smp.c:83
>
> Of course, as you said, the type of lock being dealt with is clear to a human.
> But it is not clear to programs like perf lock.
>
> What I want to do is limit the types of lock to focus on,
> e.g. perf lock prof --type spin,rwlock
>
> What do you think, Peter?

I still don't see the use for it, surely you're going to be familiar
with the code if you're looking at lock statistics?

2010-01-29 17:36:05

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] Add type of locks to lock trace events

On Thu, Jan 28, 2010 at 11:21:53AM +0100, Peter Zijlstra wrote:
> On Tue, 2010-01-26 at 14:56 +0900, Hitoshi Mitake wrote:
> > # Sorry, I wrote wrong Cc address. Previous mail was rejected by mailer-daemon.
> > # This is second time sending, if you already received this, please discard it...
> >
> > There's no need to add any member to lockdep_map
> > in order to add lock type information to the lock trace events.
> >
> > Example of perf trace:
> > | init-0 [001] 335.078670: lock_acquired: 0xffff8800059d6bd8 &rq->lock kernel/lockdep.c:2973 (0 ns)
> > | rb_consumer-424 [001] 335.078673: lock_acquire: 0xffff8800059d6bd8 1 &rq->lock kernel/lockdep.c:2973
> > | # ^ &rq->lock is spin lock!
> > | rb_consumer-424 [001] 335.078677: lock_acquire: 0xffff8800bba5e8e8 1 buffer->reader_lock_key kernel/trace/ring_
> > | rb_consumer-424 [001] 335.078679: lock_acquired: 0xffff8800bba5e8e8 buffer->reader_lock_key kernel/trace/ring_b
> > | rb_consumer-424 [001] 335.078684: lock_acquire: 0xffff8800059d12e8 1 &q->lock kernel/smp.c:83
> >
> > Of course, as you said, the type of lock being dealt with is clear to a human.
> > But it is not clear to programs like perf lock.
> >
> > What I want to do is limit the types of lock to focus on,
> > e.g. perf lock prof --type spin,rwlock
> >
> > What do you think, Peter?
>
> I still don't see the use for it, surely you're going to be familiar
> with the code if you're looking at lock statistics?



Comparing the avg/max times locks are acquired/waited for between the same
types of locks is interesting, but doing so between spinlocks and mutexes
is definitely pointless.

Spinlocks will always be at the bottom of the report, while their measures
should be read on an utterly different order of magnitude.

Each lock nature should be reported separately. An rwlock won't have
the same behaviour pattern as a spinlock. It may spin less on average
but more in the maximum snapshot, and that still depends on whether it is
a write_lock or a read_lock operation.

That said, I don't think this is the right approach. We only need this
information when the lock is initialized. The same goes for the
file and line things, especially since these add 1/5 more overhead,
and even for the name of the lock.

Actually we need a lock_init event that brings:

- lock addr
- lock name
- lock type
- file/line

This can be triggered on spinlock_init(), mutex_init(), etc.
for the dynamic cases, and use a synthetize_lock_events()
from the perf tools, working through event injection via a
perf ioctl, for the static cases.
And/or we could use a debugfs file for that, otherwise only
perf will be able to use the lock events. Or this could be
an event injection from ftrace.

And all the other lock events should then stop carrying the
name, type, or file/line. We should deduce those from the
lock_init events and the lock addr present in each lock event.

I think this becomes critical, given the high-overhead nature
of lock events and their impact when profiling them.
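
As a rough illustration (a sketch of the proposal above, not code from any
posted patch; the field names and the reuse of the LOCK_TYPE_* values from
the earlier RFC are assumptions), such a lock_init event could look like:

TRACE_EVENT(lock_init,

	TP_PROTO(struct lockdep_map *lock, int type,
		 const char *file, unsigned int line),

	TP_ARGS(lock, type, file, line),

	TP_STRUCT__entry(
		__field(void *,		lockdep_addr)
		__field(int,		type)
		__string(name,		lock->name)
		__string(file,		file)
		__field(unsigned int,	line)
	),

	TP_fast_assign(
		__entry->lockdep_addr	= lock;
		__entry->type		= type;
		__assign_str(name, lock->name);
		__assign_str(file, file);
		__entry->line		= line;
	),

	TP_printk("%p %d %s %s:%u", __entry->lockdep_addr, __entry->type,
		  __get_str(name), __get_str(file), __entry->line)
);

With this emitted once per lock, the hot lock_acquire/lock_release events
only need to carry the lockdep_map address, and userspace can rebuild the
name, type and file:line from the init events.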

2010-01-29 17:41:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Add type of locks to lock trace events

On Fri, 2010-01-29 at 18:29 +0100, Frederic Weisbecker wrote:

> > I still don't see the use for it, surely you're going to be familiar
> > with the code if you're looking at lock statistics?

> Comparing the avg/max times locks are acquired/waited for between the same
> types of locks is interesting, but doing so between spinlocks and mutexes
> is definitely pointless.

Sure, never claimed otherwise, but:
1) if you're looking at lock behaviour you should also look at the code,
otherwise you have no context to place the contention behaviour in.

2) exactly because these hold/acquire times differ so much, it's hard to
mistake them for anything else.

The concern I have is adding the storage (avoided by not doing it at
init like you suggested) and the event size overhead. Jens' report
confirms that we need to be very careful here, because lock sites are
hit very frequently.

So again, _why_? If you're looking at these things it's easy (and I'd say
rather crucial) to look up the code.

There simply is no point in looking at these stats if you're not also
going to look at the code. If you're looking at them in a quick
diagnostic way, then the difference against the baseline is important.

2010-01-29 22:13:04

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] Add type of locks to lock trace events

On Fri, Jan 29, 2010 at 06:41:05PM +0100, Peter Zijlstra wrote:
> Sure, never claimed otherwise, but:
> 1) if you're looking at lock behaviour you should also look at the code,
> otherwise you have no context to place the contention behaviour in.
>
> 2) exactly because these hold/acquire times differ so much, it's hard to
> mistake them for anything else.
>
> The concern I have is adding the storage (avoided by not doing it at
> init like you suggested) and the event size overhead. Jens' report
> confirms that we need to be very careful here, because lock sites are
> hit very frequently.
>
> So again, _why_? If you're looking at these things it's easy (and I'd say
> rather crucial) to look up the code.
>
> There simply is no point in looking at these stats if you're not also
> going to look at the code. If you're looking at them in a quick
> diagnostic way, then the difference against the baseline is important.



Ok, looking at the code is mandatory if you want to go into any
detail. But before that step you need a first glance, an overview,
unless you are targeting one very precise lock, in which case you can
use filters and trace only that one.

Now say you want to find the locks in the kernel that are not fine
grained, that are taken for too long on average or occasionally; then
mixing up mutexes and spinlocks is not going to be helpful at all,
quite the opposite: you are going to miss very interesting things.

In this latter case you really want to separate by family of locks
so that you can immediately point out which lock is badly set up.

See? This is really a matter of overview; IMO this is required if
you want perf lock to be useful. And this is not going to bring any
overhead if we do it from lock init events.

So we can have one report for mutexes, one for spinlocks, one for
rwlocks (distinguishing read/write locks), etc...

We can even ask to report only spinlocks or only mutexes, and do that
with filters.
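
On the tooling side that only needs a small lookup table. A minimal
userspace sketch (purely illustrative; it assumes a lock_init-style event
carrying an address and a type, and struct lock_seen, record_lock_init()
and lock_event_wanted() are made-up names, not perf code):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

enum lock_type {
	LOCK_TYPE_OTHER = 0,
	LOCK_TYPE_SPIN,
	LOCK_TYPE_RWLOCK,
	LOCK_TYPE_RWSEM,
	LOCK_TYPE_MUTEX,
	LOCK_TYPE_MAX,
};

struct lock_seen {
	uint64_t	  addr;		/* lockdep_map address from the init event */
	int		  type;		/* LOCK_TYPE_* from the init event */
	struct lock_seen *next;
};

#define LOCK_TABLE_SIZE 1024
static struct lock_seen *lock_table[LOCK_TABLE_SIZE];
static bool type_wanted[LOCK_TYPE_MAX];	/* filled from e.g. --type spin,rwlock */

static unsigned int hash_addr(uint64_t addr)
{
	return (unsigned int)(addr >> 4) % LOCK_TABLE_SIZE;
}

/* feed this with every lock init event */
static void record_lock_init(uint64_t addr, int type)
{
	struct lock_seen *l = malloc(sizeof(*l));

	l->addr = addr;
	l->type = type;
	l->next = lock_table[hash_addr(addr)];
	lock_table[hash_addr(addr)] = l;
}

/* ask this for every lock_acquire/lock_release/... event */
static bool lock_event_wanted(uint64_t addr)
{
	struct lock_seen *l;

	for (l = lock_table[hash_addr(addr)]; l; l = l->next)
		if (l->addr == addr)
			return type_wanted[l->type];

	return false;	/* no init event seen for this address: skip it */
}

Each family can then get its own report, and the per-acquire events stay
as small as possible.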

2010-01-16 12:55:39

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH 5/5] lockdep: Fix the way to initialize class_mutex for information of file and line

On 2010-01-14 08:19, Greg KH wrote:
> On Wed, Jan 13, 2010 at 03:17:55PM -0800, Greg KH wrote:
>> On Wed, Jan 13, 2010 at 11:00:22AM +0100, Ingo Molnar wrote:
>>>
>>> * Hitoshi Mitake<[email protected]> wrote:
>>>
>>>> This patch adds __FILE__ and __LINE__ to arguments passed for
>>>> __mutex_init() called in __class_register().
>>>> This affects to lockdep_map of class_mutex of struct class_private,
>>>> and these will be used to make trace lock events more human friendly.
>>>>
>>>> Signed-off-by: Hitoshi Mitake<[email protected]>
>>>> Cc: Peter Zijlstra<[email protected]>
>>>> Cc: Paul Mackerras<[email protected]>
>>>> Cc: Frederic Weisbecker<[email protected]>
>>>> Cc: Greg Kroah-Hartman<[email protected]>
>>>> ---
>>>> drivers/base/class.c | 3 ++-
>>>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/drivers/base/class.c b/drivers/base/class.c
>>>> index 161746d..cc0630f 100644
>>>> --- a/drivers/base/class.c
>>>> +++ b/drivers/base/class.c
>>>> @@ -162,7 +162,8 @@ int __class_register(struct class *cls, struct lock_class_key *key)
>>>> klist_init(&cp->class_devices, klist_class_dev_get, klist_class_dev_put);
>>>> INIT_LIST_HEAD(&cp->class_interfaces);
>>>> kset_init(&cp->class_dirs);
>>>> - __mutex_init(&cp->class_mutex, "struct class mutex", key);
>>>> + __mutex_init(&cp->class_mutex, "struct class mutex", key,
>>>> + __FILE__, __LINE__);
>>>> error = kobject_set_name(&cp->class_subsys.kobj, "%s", cls->name);
>>>> if (error) {
>>>> kfree(cp);
>>>
>>> Greg, any objection to this type of embellishment of lock data that comes from
>>> the driver core?
>>
>> No, I'll queue it up in my tree.
>
> Er, oops, sorry, that would go through your tree, not mine :)
>
> Feel free to add:
> Acked-by: Greg Kroah-Hartman<[email protected]>
>
> In looking at it, this might solve the lockdep issue with the device
> tree as well, I'll have to play with it once this makes it into Linus's
> tree.
>
> thanks,
>
> greg k-h
>


Greg,

I'll add your ack in a future version of this patch, thanks!
(perf lock has some build errors)

Hitoshi

2010-01-16 13:01:25

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH 0/5] lockdep: Add information of file and line to lockdep_map

On 2010-01-13 18:52, Peter Zijlstra wrote:
> On Thu, 2010-01-07 at 19:39 +0900, Hitoshi Mitake wrote:
>> There are a lot of lock instances with same names (e.g. port_lock).
>> This patch series add __FILE__ and __LINE__ to lockdep_map,
>> and these will be used for trace lock events.
>>
>> Example use from perf lock map:
>>
>> | 0xffffea0004c992b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004b112b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004a3f2b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004cd5228: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffff8800b91e2b28:&sb->s_type->i_lock_key (src: fs/inode.c, line: 166)
>> | 0xffff8800bb9d7ae0: key (src: kernel/wait.c, line: 16)
>> | 0xffff8800aa07dae0:&dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800b07fbae0:&dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800b07f3ae0:&dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800bf15fae0:&sighand->siglock (src: kernel/fork.c, line: 1490)
>> | 0xffff8800b90f7ae0:&dentry->d_lock (src: fs/dcache.c, line: 944)
>> | ...
>>
>> (This output of perf lock map is produced by my local version,
>> I'll send this later.)
>>
>> And sadly, as Peter Zijlstra predicted, this produces certain overhead.
>>
>> Before applying this series:
>> | % sudo ./perf lock rec perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 3.834 [sec]
>> After:
>> sudo ./perf lock rec perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 5.415 [sec]
>> | [ perf record: Woken up 0 times to write data ]
>> | [ perf record: Captured and wrote 53.512 MB perf.data (~2337993 samples) ]
>>
>> But raw exec of perf bench sched messaging is this:
>> | % perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 0.498 [sec]
>>
>> Tracing lock events already produces a certain amount of overhead.
>> I think the overhead produced by this series is not a fatal problem,
>> but radical optimization is required...
>
> Right, these patches look OK, for the tracing overhead, you could
> possibly hash the file:line into a u64 and reduce the tracepoint size,
> that should improve the situation I think, because I seem to remember
> the only thing that really matters for speed is the size of things.
>
>

Thanks for your opinion, Peter.
I'll work on reducing the size of the events later. Hashing is a good idea.
I think indexing is also a way to reduce the size.
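
As one possible sketch of that idea (nothing that has been posted;
lock_src_hash() is a made-up name and the mixing function is arbitrary):
fold file:line into a single u64 at lock-init time, record only the hash
in the events, and keep the hash -> file:line table in userspace.

#include <linux/types.h>

/* any cheap, stable 64-bit mix of path and line would do here */
static inline u64 lock_src_hash(const char *file, unsigned int line)
{
	u64 h = line;

	while (*file)
		h = h * 31 + (unsigned char)*file++;

	return h;
}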

And I want lockdep_map to have another thing: the type of the lock.
For example, a mutex and a spinlock have completely different acquire
times and attributes, so I want to separate these things. If
lockdep_map had a member expressing the type, things would be easy.

What do you think?

2010-02-24 09:03:15

by Hitoshi Mitake

[permalink] [raw]
Subject: [PATCH] Separate lock events with types

Sorry for my long silence...

Thanks to Frederic's great work like trace_lock_class_init(),
the overhead of perf lock was reduced a lot.
But there is still overhead which cannot be disregarded.

So I'd like to suggest separating the lock trace events by type of lock,
e.g.
trace_lock_acquire() -> spin_trace_lock_acquire(), rwlock_trace_lock_acquire()
I think that mutexes and spinlocks are completely different things.
And as I describe below, filtering at the recording phase can reduce the overhead of tracing.
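
As a rough sketch of what that separation could look like (illustrative
only, not the patch below; the event names here are invented), the
existing DECLARE_EVENT_CLASS/DEFINE_EVENT machinery would let each lock
family share one format while being enabled independently:

DECLARE_EVENT_CLASS(lock_acquire_class,

	TP_PROTO(struct lockdep_map *lock, unsigned long ip),

	TP_ARGS(lock, ip),

	TP_STRUCT__entry(
		__field(void *,	lockdep_addr)
		__string(name,	lock->name)
	),

	TP_fast_assign(
		__entry->lockdep_addr = lock;
		__assign_str(name, lock->name);
	),

	TP_printk("%p %s", __entry->lockdep_addr, __get_str(name))
);

DEFINE_EVENT(lock_acquire_class, spin_lock_acquire,
	TP_PROTO(struct lockdep_map *lock, unsigned long ip),
	TP_ARGS(lock, ip));

DEFINE_EVENT(lock_acquire_class, rwlock_lock_acquire,
	TP_PROTO(struct lockdep_map *lock, unsigned long ip),
	TP_ARGS(lock, ip));

DEFINE_EVENT(lock_acquire_class, mutex_lock_acquire,
	TP_PROTO(struct lockdep_map *lock, unsigned long ip),
	TP_ARGS(lock, ip));

perf lock record could then enable only, say, the spin_* and rwlock_*
events, so the unwanted families never hit the buffer at all.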

CAUTION:
This patch is a proof of concept. The approach this patch employs
is different from the one I described above: this patch adds an if statement
before each trace_lock_*(). Implementing separate events per type will be
a big job, so this is only a trial edition to demonstrate the performance improvement.

Below are typical scores for each situation:

| raw score of perf bench sched messaging:
|
| % ./perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 0.573 [sec]



| before (Frederic's perf/core):
|
| % sudo ./perf lock record ./perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 3.265 [sec]
| [ perf record: Woken up 0 times to write data ]
| [ perf record: Captured and wrote 143.952 MB perf.data (~6289344 samples) ]



| after (tracing only spinlocks):
|
| % sudo ./perf lock record ./perf bench sched messaging
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
| Total time: 1.580 [sec] <-- x2 faster!!
| [ perf record: Woken up 0 times to write data ]
| [ perf record: Captured and wrote 100.274 MB perf.data (~4381037 samples) ]

What do you think, Peter and Frederic?

---
fs/sysfs/dir.c | 4 +-
include/linux/lockdep.h | 103 ++++++++++++++++++++++++--------------
include/linux/rcupdate.h | 4 +-
include/linux/rwlock_api_smp.h | 16 +++---
include/linux/spinlock_api_smp.h | 8 ++--
include/trace/events/lock.h | 4 +-
kernel/lockdep.c | 26 ++++++----
kernel/mutex.c | 4 +-
kernel/rwsem.c | 8 ++--
kernel/spinlock.c | 6 +-
10 files changed, 108 insertions(+), 75 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 699f371..8782409 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -204,13 +204,13 @@ static void sysfs_deactivate(struct sysfs_dirent *sd)
v = atomic_add_return(SD_DEACTIVATED_BIAS, &sd->s_active);

if (v != SD_DEACTIVATED_BIAS) {
- lock_contended(&sd->dep_map, _RET_IP_);
+ lock_contended(&sd->dep_map, _RET_IP_, LOCK_TYPE_OTHER);
wait_for_completion(&wait);
}

sd->s_sibling = NULL;

- lock_acquired(&sd->dep_map, _RET_IP_);
+ lock_acquired(&sd->dep_map, _RET_IP_, LOCK_TYPE_OTHER);
rwsem_release(&sd->dep_map, 1, _RET_IP_);
}

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 0b662fc..b93976a 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -19,6 +19,14 @@ struct lockdep_map;
#include <linux/debug_locks.h>
#include <linux/stacktrace.h>

+enum {
+ LOCK_TYPE_OTHER = 0,
+ LOCK_TYPE_SPIN,
+ LOCK_TYPE_RWLOCK,
+ LOCK_TYPE_RWSEM,
+ LOCK_TYPE_MUTEX,
+};
+
/*
* We'd rather not expose kernel/lockdep_states.h this wide, but we do need
* the total number of states... :-(
@@ -294,10 +302,11 @@ static inline int lockdep_match_key(struct lockdep_map *lock,
*/
extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
- struct lockdep_map *nest_lock, unsigned long ip);
+ struct lockdep_map *nest_lock, unsigned long ip,
+ int type);

extern void lock_release(struct lockdep_map *lock, int nested,
- unsigned long ip);
+ unsigned long ip, int type);

#define lockdep_is_held(lock) lock_is_held(&(lock)->dep_map)

@@ -337,7 +346,7 @@ static inline void lockdep_on(void)
{
}

-# define lock_acquire(l, s, t, r, c, n, i) do { } while (0)
+# define lock_acquire(l, s, t, r, c, n, i, ty) do { } while (0)
# define lock_release(l, n, i) do { } while (0)
# define lock_set_class(l, n, k, s, i) do { } while (0)
# define lock_set_subclass(l, s, i) do { } while (0)
@@ -377,16 +386,16 @@ struct lock_class_key { };

#ifdef CONFIG_LOCK_STAT

-extern void lock_contended(struct lockdep_map *lock, unsigned long ip);
-extern void lock_acquired(struct lockdep_map *lock, unsigned long ip);
+extern void lock_contended(struct lockdep_map *lock, unsigned long ip, int type);
+extern void lock_acquired(struct lockdep_map *lock, unsigned long ip, int type);

-#define LOCK_CONTENDED(_lock, try, lock) \
+#define LOCK_CONTENDED(_lock, try, lock, type) \
do { \
if (!try(_lock)) { \
- lock_contended(&(_lock)->dep_map, _RET_IP_); \
+ lock_contended(&(_lock)->dep_map, _RET_IP_, type); \
lock(_lock); \
} \
- lock_acquired(&(_lock)->dep_map, _RET_IP_); \
+ lock_acquired(&(_lock)->dep_map, _RET_IP_, type); \
} while (0)

#else /* CONFIG_LOCK_STAT */
@@ -394,7 +403,7 @@ do { \
#define lock_contended(lockdep_map, ip) do {} while (0)
#define lock_acquired(lockdep_map, ip) do {} while (0)

-#define LOCK_CONTENDED(_lock, try, lock) \
+#define LOCK_CONTENDED(_lock, try, lock, type) \
lock(_lock)

#endif /* CONFIG_LOCK_STAT */
@@ -406,13 +415,13 @@ do { \
* _raw_*_lock_flags() code, because lockdep assumes
* that interrupts are not re-enabled during lock-acquire:
*/
-#define LOCK_CONTENDED_FLAGS(_lock, try, lock, lockfl, flags) \
- LOCK_CONTENDED((_lock), (try), (lock))
+#define LOCK_CONTENDED_FLAGS(_lock, try, lock, lockfl, flags, type) \
+ LOCK_CONTENDED((_lock), (try), (lock), type)

#else /* CONFIG_LOCKDEP */

-#define LOCK_CONTENDED_FLAGS(_lock, try, lock, lockfl, flags) \
- lockfl((_lock), (flags))
+#define LOCK_CONTENDED_FLAGS(_lock, try, lock, lockfl, flags, type) \
+ lockfl((_lock), (flags), type)

#endif /* CONFIG_LOCKDEP */

@@ -454,13 +463,17 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 2, n, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_SPIN)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, 0, 2, n, i, LOCK_TYPE_SPIN)
# else
-# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define spin_acquire_nest(l, s, t, n, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define spin_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_SPIN)
+# define spin_acquire_nest(l, s, t, n, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_SPIN)
# endif
-# define spin_release(l, n, i) lock_release(l, n, i)
+# define spin_release(l, n, i) lock_release(l, n, i, LOCK_TYPE_SPIN)
#else
# define spin_acquire(l, s, t, i) do { } while (0)
# define spin_release(l, n, i) do { } while (0)
@@ -468,13 +481,17 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 2, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_RWLOCK)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 2, 2, NULL, i, LOCK_TYPE_RWLOCK)
# else
-# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 1, NULL, i)
+# define rwlock_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_RWLOCK)
+# define rwlock_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 2, 1, NULL, i, LOCK_TYPE_RWLOCK)
# endif
-# define rwlock_release(l, n, i) lock_release(l, n, i)
+# define rwlock_release(l, n, i) lock_release(l, n, i, LOCK_TYPE_RWLOCK)
#else
# define rwlock_acquire(l, s, t, i) do { } while (0)
# define rwlock_acquire_read(l, s, t, i) do { } while (0)
@@ -483,11 +500,13 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_MUTEX)
# else
-# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
+# define mutex_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_MUTEX)
# endif
-# define mutex_release(l, n, i) lock_release(l, n, i)
+# define mutex_release(l, n, i) lock_release(l, n, i, LOCK_TYPE_MUTEX)
#else
# define mutex_acquire(l, s, t, i) do { } while (0)
# define mutex_release(l, n, i) do { } while (0)
@@ -495,13 +514,17 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 2, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 2, NULL, i, LOCK_TYPE_RWSEM)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 1, 2, NULL, i, LOCK_TYPE_RWSEM)
# else
-# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, NULL, i)
-# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 1, NULL, i)
+# define rwsem_acquire(l, s, t, i) \
+ lock_acquire(l, s, t, 0, 1, NULL, i, LOCK_TYPE_RWSEM)
+# define rwsem_acquire_read(l, s, t, i) \
+ lock_acquire(l, s, t, 1, 1, NULL, i, LOCK_TYPE_RWSEM)
# endif
-# define rwsem_release(l, n, i) lock_release(l, n, i)
+# define rwsem_release(l, n, i) lock_release(l, n, i, LOCK_TYPE_RWSEM)
#else
# define rwsem_acquire(l, s, t, i) do { } while (0)
# define rwsem_acquire_read(l, s, t, i) do { } while (0)
@@ -510,11 +533,13 @@ static inline void print_irqtrace_events(struct task_struct *curr)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, 0, 2, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
# else
-# define lock_map_acquire(l) lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_)
+# define lock_map_acquire(l) \
+ lock_acquire(l, 0, 0, 0, 1, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
# endif
-# define lock_map_release(l) lock_release(l, 1, _THIS_IP_)
+# define lock_map_release(l) lock_release(l, 1, _THIS_IP_, LOCK_TYPE_OTHER)
#else
# define lock_map_acquire(l) do { } while (0)
# define lock_map_release(l) do { } while (0)
@@ -524,14 +549,16 @@ static inline void print_irqtrace_events(struct task_struct *curr)
# define might_lock(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_); \
- lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, \
+ NULL, _THIS_IP_, LOCK_TYPE_OTHER); \
+ lock_release(&(lock)->dep_map, 0, _THIS_IP_, LOCK_TYPE_OTHER); \
} while (0)
# define might_lock_read(lock) \
do { \
typecheck(struct lockdep_map *, &(lock)->dep_map); \
- lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_); \
- lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
+ lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, \
+ NULL, _THIS_IP_, LOCK_TYPE_OTHER); \
+ lock_release(&(lock)->dep_map, 0, _THIS_IP_, LOCK_TYPE_OTHER); \
} while (0)
#else
# define might_lock(lock) do { } while (0)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 24440f4..91a092f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -80,8 +80,8 @@ extern void rcu_init(void);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
extern struct lockdep_map rcu_lock_map;
# define rcu_read_acquire() \
- lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_)
-# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_)
+ lock_acquire(&rcu_lock_map, 0, 0, 2, 1, NULL, _THIS_IP_, LOCK_TYPE_OTHER)
+# define rcu_read_release() lock_release(&rcu_lock_map, 1, _THIS_IP_, LOCK_TYPE_OTHER)
#else
# define rcu_read_acquire() do { } while (0)
# define rcu_read_release() do { } while (0)
diff --git a/include/linux/rwlock_api_smp.h b/include/linux/rwlock_api_smp.h
index 9c9f049..63b1da7 100644
--- a/include/linux/rwlock_api_smp.h
+++ b/include/linux/rwlock_api_smp.h
@@ -147,7 +147,7 @@ static inline void __raw_read_lock(rwlock_t *lock)
{
preempt_disable();
rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
+ LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock, LOCK_TYPE_RWLOCK);
}

static inline unsigned long __raw_read_lock_irqsave(rwlock_t *lock)
@@ -158,7 +158,7 @@ static inline unsigned long __raw_read_lock_irqsave(rwlock_t *lock)
preempt_disable();
rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
LOCK_CONTENDED_FLAGS(lock, do_raw_read_trylock, do_raw_read_lock,
- do_raw_read_lock_flags, &flags);
+ do_raw_read_lock_flags, &flags, LOCK_TYPE_RWLOCK);
return flags;
}

@@ -167,7 +167,7 @@ static inline void __raw_read_lock_irq(rwlock_t *lock)
local_irq_disable();
preempt_disable();
rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
+ LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock, LOCK_TYPE_RWLOCK);
}

static inline void __raw_read_lock_bh(rwlock_t *lock)
@@ -175,7 +175,7 @@ static inline void __raw_read_lock_bh(rwlock_t *lock)
local_bh_disable();
preempt_disable();
rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
+ LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock, LOCK_TYPE_RWLOCK);
}

static inline unsigned long __raw_write_lock_irqsave(rwlock_t *lock)
@@ -186,7 +186,7 @@ static inline unsigned long __raw_write_lock_irqsave(rwlock_t *lock)
preempt_disable();
rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
LOCK_CONTENDED_FLAGS(lock, do_raw_write_trylock, do_raw_write_lock,
- do_raw_write_lock_flags, &flags);
+ do_raw_write_lock_flags, &flags, LOCK_TYPE_RWLOCK);
return flags;
}

@@ -195,7 +195,7 @@ static inline void __raw_write_lock_irq(rwlock_t *lock)
local_irq_disable();
preempt_disable();
rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
+ LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock, LOCK_TYPE_RWLOCK);
}

static inline void __raw_write_lock_bh(rwlock_t *lock)
@@ -203,14 +203,14 @@ static inline void __raw_write_lock_bh(rwlock_t *lock)
local_bh_disable();
preempt_disable();
rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
+ LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock, LOCK_TYPE_RWLOCK);
}

static inline void __raw_write_lock(rwlock_t *lock)
{
preempt_disable();
rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
+ LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock, LOCK_TYPE_RWLOCK);
}

#endif /* CONFIG_PREEMPT */
diff --git a/include/linux/spinlock_api_smp.h b/include/linux/spinlock_api_smp.h
index e253ccd..0cf97ec 100644
--- a/include/linux/spinlock_api_smp.h
+++ b/include/linux/spinlock_api_smp.h
@@ -114,7 +114,7 @@ static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
* that interrupts are not re-enabled during lock-acquire:
*/
#ifdef CONFIG_LOCKDEP
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
#else
do_raw_spin_lock_flags(lock, &flags);
#endif
@@ -126,7 +126,7 @@ static inline void __raw_spin_lock_irq(raw_spinlock_t *lock)
local_irq_disable();
preempt_disable();
spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
}

static inline void __raw_spin_lock_bh(raw_spinlock_t *lock)
@@ -134,14 +134,14 @@ static inline void __raw_spin_lock_bh(raw_spinlock_t *lock)
local_bh_disable();
preempt_disable();
spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
}

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
preempt_disable();
spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
}

#endif /* CONFIG_PREEMPT */
diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index ced4ceb..353dbd6 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -43,8 +43,8 @@ TRACE_EVENT_INJECT(lock_class_init,
TRACE_EVENT(lock_acquire,

TP_PROTO(struct lockdep_map *lock, unsigned int subclass,
- int trylock, int read, int check,
- struct lockdep_map *next_lock, unsigned long ip),
+ int trylock, int read, int check,
+ struct lockdep_map *next_lock, unsigned long ip),

TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 4d9eef8..d4ee446 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3211,11 +3211,14 @@ EXPORT_SYMBOL_GPL(lock_set_class);
*/
void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
- struct lockdep_map *nest_lock, unsigned long ip)
+ struct lockdep_map *nest_lock, unsigned long ip,
+ int type)
{
unsigned long flags;

- trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
+ if (type == LOCK_TYPE_SPIN)
+ trace_lock_acquire(lock, subclass, trylock,
+ read, check, nest_lock, ip);

if (unlikely(current->lockdep_recursion))
return;
@@ -3232,11 +3235,12 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
EXPORT_SYMBOL_GPL(lock_acquire);

void lock_release(struct lockdep_map *lock, int nested,
- unsigned long ip)
+ unsigned long ip, int type)
{
unsigned long flags;

- trace_lock_release(lock, nested, ip);
+ if (type == LOCK_TYPE_SPIN)
+ trace_lock_release(lock, nested, ip);

if (unlikely(current->lockdep_recursion))
return;
@@ -3357,7 +3361,7 @@ found_it:
}

static void
-__lock_acquired(struct lockdep_map *lock, unsigned long ip)
+__lock_acquired(struct lockdep_map *lock, unsigned long ip, int type)
{
struct task_struct *curr = current;
struct held_lock *hlock, *prev_hlock;
@@ -3396,7 +3400,8 @@ found_it:
hlock->holdtime_stamp = now;
}

- trace_lock_acquired(lock, ip, waittime);
+ if (type == LOCK_TYPE_SPIN)
+ trace_lock_acquired(lock, ip, waittime);

stats = get_lock_stats(hlock_class(hlock));
if (waittime) {
@@ -3413,11 +3418,12 @@ found_it:
lock->ip = ip;
}

-void lock_contended(struct lockdep_map *lock, unsigned long ip)
+void lock_contended(struct lockdep_map *lock, unsigned long ip, int type)
{
unsigned long flags;

- trace_lock_contended(lock, ip);
+ if (type == LOCK_TYPE_SPIN)
+ trace_lock_contended(lock, ip);

if (unlikely(!lock_stat))
return;
@@ -3434,7 +3440,7 @@ void lock_contended(struct lockdep_map *lock, unsigned long ip)
}
EXPORT_SYMBOL_GPL(lock_contended);

-void lock_acquired(struct lockdep_map *lock, unsigned long ip)
+void lock_acquired(struct lockdep_map *lock, unsigned long ip, int type)
{
unsigned long flags;

@@ -3447,7 +3453,7 @@ void lock_acquired(struct lockdep_map *lock, unsigned long ip)
raw_local_irq_save(flags);
check_flags(flags);
current->lockdep_recursion = 1;
- __lock_acquired(lock, ip);
+ __lock_acquired(lock, ip, type);
current->lockdep_recursion = 0;
raw_local_irq_restore(flags);
}
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 632f04c..b95826f 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -216,7 +216,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if (atomic_xchg(&lock->count, -1) == 1)
goto done;

- lock_contended(&lock->dep_map, ip);
+ lock_contended(&lock->dep_map, ip, LOCK_TYPE_MUTEX);

for (;;) {
/*
@@ -256,7 +256,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
}

done:
- lock_acquired(&lock->dep_map, ip);
+ lock_acquired(&lock->dep_map, ip, LOCK_TYPE_MUTEX);
/* got the lock - rejoice! */
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
diff --git a/kernel/rwsem.c b/kernel/rwsem.c
index cae050b..d02869e 100644
--- a/kernel/rwsem.c
+++ b/kernel/rwsem.c
@@ -21,7 +21,7 @@ void __sched down_read(struct rw_semaphore *sem)
might_sleep();
rwsem_acquire_read(&sem->dep_map, 0, 0, _RET_IP_);

- LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
+ LOCK_CONTENDED(sem, __down_read_trylock, __down_read, LOCK_TYPE_RWSEM);
}

EXPORT_SYMBOL(down_read);
@@ -48,7 +48,7 @@ void __sched down_write(struct rw_semaphore *sem)
might_sleep();
rwsem_acquire(&sem->dep_map, 0, 0, _RET_IP_);

- LOCK_CONTENDED(sem, __down_write_trylock, __down_write);
+ LOCK_CONTENDED(sem, __down_write_trylock, __down_write, LOCK_TYPE_RWSEM);
}

EXPORT_SYMBOL(down_write);
@@ -112,7 +112,7 @@ void down_read_nested(struct rw_semaphore *sem, int subclass)
might_sleep();
rwsem_acquire_read(&sem->dep_map, subclass, 0, _RET_IP_);

- LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
+ LOCK_CONTENDED(sem, __down_read_trylock, __down_read, LOCK_TYPE_RWSEM);
}

EXPORT_SYMBOL(down_read_nested);
@@ -131,7 +131,7 @@ void down_write_nested(struct rw_semaphore *sem, int subclass)
might_sleep();
rwsem_acquire(&sem->dep_map, subclass, 0, _RET_IP_);

- LOCK_CONTENDED(sem, __down_write_trylock, __down_write);
+ LOCK_CONTENDED(sem, __down_write_trylock, __down_write, LOCK_TYPE_RWSEM);
}

EXPORT_SYMBOL(down_write_nested);
diff --git a/kernel/spinlock.c b/kernel/spinlock.c
index be6517f..81067c2 100644
--- a/kernel/spinlock.c
+++ b/kernel/spinlock.c
@@ -345,7 +345,7 @@ void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass)
{
preempt_disable();
spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
}
EXPORT_SYMBOL(_raw_spin_lock_nested);

@@ -358,7 +358,7 @@ unsigned long __lockfunc _raw_spin_lock_irqsave_nested(raw_spinlock_t *lock,
preempt_disable();
spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
LOCK_CONTENDED_FLAGS(lock, do_raw_spin_trylock, do_raw_spin_lock,
- do_raw_spin_lock_flags, &flags);
+ do_raw_spin_lock_flags, &flags, LOCK_TYPE_SPIN);
return flags;
}
EXPORT_SYMBOL(_raw_spin_lock_irqsave_nested);
@@ -368,7 +368,7 @@ void __lockfunc _raw_spin_lock_nest_lock(raw_spinlock_t *lock,
{
preempt_disable();
spin_acquire_nest(&lock->dep_map, 0, 0, nest_lock, _RET_IP_);
- LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
+ LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock, LOCK_TYPE_SPIN);
}
EXPORT_SYMBOL(_raw_spin_lock_nest_lock);

--
1.6.5.2

2010-04-05 10:37:32

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH] Separate lock events with types

On 03/27/10 08:33, Frederic Weisbecker wrote:
> On Wed, Feb 24, 2010 at 06:02:46PM +0900, Hitoshi Mitake wrote:
>> Sorry for my long silence...
>>
>> Thanks to Frederic's great work like trace_lock_class_init(),
>> the overhead of perf lock was reduced a lot.
>> But there is still overhead which cannot be disregarded.
>>
>> So I'd like to suggest separating the lock trace events by type of
>> lock, e.g.
>> trace_lock_acquire() -> spin_trace_lock_acquire(), rwlock_trace_lock_acquire()
>> I think that mutexes and spinlocks are completely different things.
>> And as I describe below, filtering at the recording phase can reduce
>> the overhead of tracing.
>>
>> CAUTION:
>> This patch is a proof of concept. The approach this patch employs
>> is different from the one I described above: this patch adds an if
>> statement before each trace_lock_*(). Implementing separate events
>> per type will be a big job, so this is only a trial edition to
>> demonstrate the performance improvement.
>
>
> Instead of having a different event for each type of lock,
> I would rather suggest adding a "lock type" field to the (future)
> lock_init_class class. This requires that we implement event injection
> properly first.
>
> So if we store the lock type in the lockdep_map, we can just dump
> the type on lock class initialization:
>
> - on register_lock_class
> - on event injection to catch up with locks that have already registered
>

Yeah, this is a smarter way than the one I suggested.

But specifying which types of locks to focus on will require a special mechanism.
I thought that an ioctl() could make this kind of type selection possible,
but it would not match the other events...
e.g. perf record would require a dirty change for it.

Do you have any good ideas at this point?

Thanks,
Hitoshi

2010-04-06 09:00:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] Separate lock events with types

On Sat, 2010-03-27 at 00:33 +0100, Frederic Weisbecker wrote:
> So if we store the lock type in the lockdep_map, we can just dump
> the type on lock class initialization:
>
> - on register_lock_class
> - on event injection to catch up with locks that have already registered
>
> That's what my perf/inject tree does (minus the lock type), but this
> all requires a redesign, on both the ftrace and perf sides.
>

Right, and I don't like to bloat dep_map for no reason. I still think
all this lock type stuff is a waste of time.

2010-04-06 09:45:08

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] Separate lock events with types

On Tue, Apr 06, 2010 at 10:26:06AM +0200, Peter Zijlstra wrote:
> On Sat, 2010-03-27 at 00:33 +0100, Frederic Weisbecker wrote:
> > So if we store the lock type in the lockdep_map, we can just dump
> > the type on lock class initialization:
> >
> > - on register_lock_class
> > - on event injection to catch up with locks that have already registered
> >
> > That's what my perf/inject tree does (minus the lock type), but this
> > all requires a redesign, on both the ftrace and perf sides.
> >
>
> Right, and I don't like to bloat dep_map for no reason. I still think
> all this lock type stuff is a waste of time.


Again, it makes no sense to mix spinlocks, rwlocks, mutexes
or whatever together in the same latency report.

I agree with you that such a report makes no sense if you don't then look
at the code and find out the nature of the locks in a deeper pass.

But the first overview is going to be unhelpful at best if you have
everything in the same linear report.

You'll naturally find the mutexes first, reporting the worst latencies,
then the rwsems, then the spinlocks, then the rwlocks (read side) at the
end of the list (depending on the case).

And yet their latency intervals have utterly different meanings. You can't
provide a tasting report that mixes beer and wine qualities on the
same scale.

I really believe perf lock is going to suck at best without this.