2009-12-09 03:29:51

by Xiao Guangrong

Subject: [PATCH 1/3] perf_event: cleanup for __perf_event_init_context()

This is a cleanup patch that:
- defines the 'perf_cpu_context' variable as 'static'
- uses kzalloc() instead of kmalloc() and memset()

Signed-off-by: Xiao Guangrong <[email protected]>
---
kernel/perf_event.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 08f5718..fbebe7b 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -36,7 +36,7 @@
/*
* Each CPU has a list of per CPU events:
*/
-DEFINE_PER_CPU(struct perf_cpu_context, perf_cpu_context);
+static DEFINE_PER_CPU(struct perf_cpu_context, perf_cpu_context);

int perf_max_events __read_mostly = 1;
static int perf_reserved_percpu __read_mostly;
@@ -1579,7 +1579,6 @@ static void
__perf_event_init_context(struct perf_event_context *ctx,
struct task_struct *task)
{
- memset(ctx, 0, sizeof(*ctx));
raw_spin_lock_init(&ctx->lock);
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->group_list);
@@ -1654,7 +1653,7 @@ static struct perf_event_context *find_get_context(pid_t pid, int cpu)
}

if (!ctx) {
- ctx = kmalloc(sizeof(struct perf_event_context), GFP_KERNEL);
+ ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
err = -ENOMEM;
if (!ctx)
goto errout;
@@ -5105,7 +5104,7 @@ int perf_event_init_task(struct task_struct *child)
* First allocate and initialize a context for the child.
*/

- child_ctx = kmalloc(sizeof(struct perf_event_context), GFP_KERNEL);
+ child_ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
if (!child_ctx)
return -ENOMEM;

--
1.6.1.2
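
For readers outside the kernel tree, the kmalloc()+memset() to kzalloc() change in patch 1/3 can be sketched in userspace C. This is only an analogy under stated assumptions: calloc() stands in for kzalloc(), malloc() for kmalloc(), and 'struct demo_ctx', ctx_init(), ctx_new_old() and ctx_new() are hypothetical names that do not exist in kernel/perf_event.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct perf_event_context. */
struct demo_ctx {
	int refcount;
	void *task;
	unsigned long generation;
};

/*
 * Init helper in the post-patch style: it assumes the caller hands in
 * zeroed memory and only sets the fields that must be non-zero.
 */
static void ctx_init(struct demo_ctx *ctx, void *task)
{
	assert(ctx != NULL);
	ctx->refcount = 1;
	ctx->task = task;
}

/* Old pattern: plain allocation followed by an explicit memset(). */
static struct demo_ctx *ctx_new_old(void *task)
{
	struct demo_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	memset(ctx, 0, sizeof(*ctx));
	ctx_init(ctx, task);
	return ctx;
}

/* New pattern: the allocator returns zeroed memory in one call. */
static struct demo_ctx *ctx_new(void *task)
{
	struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx_init(ctx, task);
	return ctx;
}
```

Both helpers return an object in the same state. The second one simply moves the zeroing into the allocation, which is why the init helper can drop its memset() once every caller allocates zeroed memory.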


2009-12-09 03:31:20

by Xiao Guangrong

Subject: [PATCH 2/3] perf_event: allocate children's perf_event_ctxp at the right time

In the current code, a child task allocates memory for
'child->perf_event_ctxp' whenever the parent is counted. We can
instead allocate it only if the parent allows its children to
inherit its events.

This saves memory and reduces overhead.

Signed-off-by: Xiao Guangrong <[email protected]>
---
kernel/perf_event.c | 37 ++++++++++++++++++++++---------------
1 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index fbebe7b..592b293 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -5083,7 +5083,7 @@ again:
*/
int perf_event_init_task(struct task_struct *child)
{
- struct perf_event_context *child_ctx, *parent_ctx;
+ struct perf_event_context *child_ctx = NULL, *parent_ctx;
struct perf_event_context *cloned_ctx;
struct perf_event *event;
struct task_struct *parent = current;
@@ -5099,20 +5099,6 @@ int perf_event_init_task(struct task_struct *child)
return 0;

/*
- * This is executed from the parent task context, so inherit
- * events that have been marked for cloning.
- * First allocate and initialize a context for the child.
- */
-
- child_ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
- if (!child_ctx)
- return -ENOMEM;
-
- __perf_event_init_context(child_ctx, child);
- child->perf_event_ctxp = child_ctx;
- get_task_struct(child);
-
- /*
* If the parent's context is a clone, pin it so it won't get
* swapped under us.
*/
@@ -5142,6 +5128,26 @@ int perf_event_init_task(struct task_struct *child)
continue;
}

+ if (!child->perf_event_ctxp) {
+ /*
+ * This is executed from the parent task context, so
+ * inherit events that have been marked for cloning.
+ * First allocate and initialize a context for the
+ * child.
+ */
+
+ child_ctx = kzalloc(sizeof(struct perf_event_context),
+ GFP_KERNEL);
+ if (!child_ctx) {
+ ret = -ENOMEM;
+ goto exit;
+ }
+
+ __perf_event_init_context(child_ctx, child);
+ child->perf_event_ctxp = child_ctx;
+ get_task_struct(child);
+ }
+
ret = inherit_group(event, parent, parent_ctx,
child, child_ctx);
if (ret) {
@@ -5170,6 +5176,7 @@ int perf_event_init_task(struct task_struct *child)
get_ctx(child_ctx->parent_ctx);
}

+exit:
mutex_unlock(&parent_ctx->mutex);

perf_unpin_context(parent_ctx);
--
1.6.1.2
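
The idea behind patch 2/3, allocating the child's context only when the first inheritable event is found, can be sketched in userspace C. This is a simplified illustration, not the kernel code: 'struct demo_event', 'struct demo_ctx', 'struct demo_task' and demo_init_task() are hypothetical names, calloc() stands in for kzalloc(), and there is no locking or reference counting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct demo_event {
	bool inherit;		/* parent allows children to inherit this event */
};

struct demo_ctx {
	int nr_events;
};

struct demo_task {
	struct demo_ctx *ctxp;	/* stays NULL until something is inherited */
};

/*
 * Walk the parent's events and allocate the child's context lazily,
 * on the first event that is marked for inheritance.
 */
static int demo_init_task(struct demo_task *child,
			  const struct demo_event *events, size_t n)
{
	size_t i;

	assert(child != NULL);

	for (i = 0; i < n; i++) {
		if (!events[i].inherit)
			continue;

		if (!child->ctxp) {
			child->ctxp = calloc(1, sizeof(*child->ctxp));
			/*
			 * The patch jumps to an 'exit' label here so the
			 * parent's mutex gets unlocked; this sketch holds
			 * no lock, so it returns directly.
			 */
			if (!child->ctxp)
				return -ENOMEM;
		}

		child->ctxp->nr_events++;
	}

	return 0;
}
```

A child whose parent has no inheritable events never pays for the allocation, which is the saving the changelog describes.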

2009-12-09 03:32:14

by Xiao Guangrong

Subject: [PATCH 3/3] perf_event: cleanup for cpu_clock_perf_event_update()

Use atomic64_xchg() instead of atomic64_read() followed by atomic64_set().

Signed-off-by: Xiao Guangrong <[email protected]>
---
kernel/perf_event.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 592b293..8f46012 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4079,8 +4079,7 @@ static void cpu_clock_perf_event_update(struct perf_event *event)
u64 now;

now = cpu_clock(cpu);
- prev = atomic64_read(&event->hw.prev_count);
- atomic64_set(&event->hw.prev_count, now);
+ prev = atomic64_xchg(&event->hw.prev_count, now);
atomic64_add(now - prev, &event->count);
}

--
1.6.1.2

2009-12-09 08:33:32

by Frederic Weisbecker

Subject: Re: [PATCH 1/3] perf_event: cleanup for __perf_event_init_context()

On Wed, Dec 09, 2009 at 11:28:13AM +0800, Xiao Guangrong wrote:
> This is a cleanup patch that:
> - defines the 'perf_cpu_context' variable as 'static'
> - uses kzalloc() instead of kmalloc() and memset()
>
> Signed-off-by: Xiao Guangrong <[email protected]>
> ---


And perf_event_init_cpu() doesn't need the memset() either, as the
per-CPU context is supposed to be statically zeroed already.

Looks good to me.

Reviewed-by: Frederic Weisbecker <[email protected]>
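
The "statically zeroed" remark relies on a C language guarantee: objects with static storage duration start out zero-initialized when no initializer is given. A minimal userspace sketch, assuming a hypothetical 'struct demo_cpu_ctx' in place of the real per-CPU variable (the kernel's per-CPU machinery is not modelled here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_cpu_context. */
struct demo_cpu_ctx {
	int active;
	void *task_ctx;
	unsigned long count;
};

/* Static storage duration, no initializer: every member starts at zero/NULL. */
static struct demo_cpu_ctx demo_cpu_context;

/* Returns non-zero when every member still holds its zero value. */
static int demo_is_zeroed(const struct demo_cpu_ctx *c)
{
	assert(c != NULL);
	return c->active == 0 && c->task_ctx == NULL && c->count == 0;
}
```

Calling demo_is_zeroed(&demo_cpu_context) before anything writes to the variable returns 1, so an init helper that runs on such an object has no need to clear it first.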

2009-12-09 08:43:07

by Frederic Weisbecker

Subject: Re: [PATCH 2/3] perf_event: allocate children's perf_event_ctxp at the right time

On Wed, Dec 09, 2009 at 11:29:44AM +0800, Xiao Guangrong wrote:
> In the current code, a child task allocates memory for
> 'child->perf_event_ctxp' whenever the parent is counted. We can
> instead allocate it only if the parent allows its children to
> inherit its events.
>
> This saves memory and reduces overhead.
>
> Signed-off-by: Xiao Guangrong <[email protected]>



Reviewed-by: Frederic Weisbecker <[email protected]>

2009-12-09 08:54:20

by Frederic Weisbecker

Subject: Re: [PATCH 3/3] perf_event: cleanup for cpu_clock_perf_event_update()

On Wed, Dec 09, 2009 at 11:30:36AM +0800, Xiao Guangrong wrote:
> Use atomic64_xchg() instead of atomic64_read() followed by atomic64_set().
>
> Signed-off-by: Xiao Guangrong <[email protected]>
> ---
> kernel/perf_event.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 592b293..8f46012 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4079,8 +4079,7 @@ static void cpu_clock_perf_event_update(struct perf_event *event)
> u64 now;
>
> now = cpu_clock(cpu);
> - prev = atomic64_read(&event->hw.prev_count);
> - atomic64_set(&event->hw.prev_count, now);
> + prev = atomic64_xchg(&event->hw.prev_count, now);
> atomic64_add(now - prev, &event->count);
> }


Reviewed-by: Frederic Weisbecker <[email protected]>

2009-12-09 09:52:35

by Xiao Guangrong

Subject: [tip:perf/urgent] perf_event: Clean up __perf_event_init_context()

Commit-ID: aa5452d70c0d559310598b243b8b1033c10056e7
Gitweb: http://git.kernel.org/tip/aa5452d70c0d559310598b243b8b1033c10056e7
Author: Xiao Guangrong <[email protected]>
AuthorDate: Wed, 9 Dec 2009 11:28:13 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 9 Dec 2009 09:56:27 +0100

perf_event: Clean up __perf_event_init_context()

Clean up the code a bit:

- define 'perf_cpu_context' variable with 'static'
- use kzalloc() instead of kmalloc() and memset()

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 3b0cf86..2b06c45 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -36,7 +36,7 @@
/*
* Each CPU has a list of per CPU events:
*/
-DEFINE_PER_CPU(struct perf_cpu_context, perf_cpu_context);
+static DEFINE_PER_CPU(struct perf_cpu_context, perf_cpu_context);

int perf_max_events __read_mostly = 1;
static int perf_reserved_percpu __read_mostly;
@@ -1579,7 +1579,6 @@ static void
__perf_event_init_context(struct perf_event_context *ctx,
struct task_struct *task)
{
- memset(ctx, 0, sizeof(*ctx));
spin_lock_init(&ctx->lock);
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->group_list);
@@ -1654,7 +1653,7 @@ static struct perf_event_context *find_get_context(pid_t pid, int cpu)
}

if (!ctx) {
- ctx = kmalloc(sizeof(struct perf_event_context), GFP_KERNEL);
+ ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
err = -ENOMEM;
if (!ctx)
goto errout;
@@ -5105,7 +5104,7 @@ int perf_event_init_task(struct task_struct *child)
* First allocate and initialize a context for the child.
*/

- child_ctx = kmalloc(sizeof(struct perf_event_context), GFP_KERNEL);
+ child_ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
if (!child_ctx)
return -ENOMEM;

2009-12-09 09:53:00

by Xiao Guangrong

Subject: [tip:perf/urgent] perf_event: Allocate children's perf_event_ctxp at the right time

Commit-ID: b93f7978ad6b46133e9453b90ccc057dc2429e75
Gitweb: http://git.kernel.org/tip/b93f7978ad6b46133e9453b90ccc057dc2429e75
Author: Xiao Guangrong <[email protected]>
AuthorDate: Wed, 9 Dec 2009 11:29:44 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 9 Dec 2009 09:56:27 +0100

perf_event: Allocate children's perf_event_ctxp at the right time

In the current code, a child task allocates memory for
'child->perf_event_ctxp' whenever the parent is counted. We can
instead allocate it only if the parent allows its children to
inherit its events.

This saves memory and reduces overhead.

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 37 ++++++++++++++++++++++---------------
1 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 2b06c45..77641ae 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -5083,7 +5083,7 @@ again:
*/
int perf_event_init_task(struct task_struct *child)
{
- struct perf_event_context *child_ctx, *parent_ctx;
+ struct perf_event_context *child_ctx = NULL, *parent_ctx;
struct perf_event_context *cloned_ctx;
struct perf_event *event;
struct task_struct *parent = current;
@@ -5099,20 +5099,6 @@ int perf_event_init_task(struct task_struct *child)
return 0;

/*
- * This is executed from the parent task context, so inherit
- * events that have been marked for cloning.
- * First allocate and initialize a context for the child.
- */
-
- child_ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
- if (!child_ctx)
- return -ENOMEM;
-
- __perf_event_init_context(child_ctx, child);
- child->perf_event_ctxp = child_ctx;
- get_task_struct(child);
-
- /*
* If the parent's context is a clone, pin it so it won't get
* swapped under us.
*/
@@ -5142,6 +5128,26 @@ int perf_event_init_task(struct task_struct *child)
continue;
}

+ if (!child->perf_event_ctxp) {
+ /*
+ * This is executed from the parent task context, so
+ * inherit events that have been marked for cloning.
+ * First allocate and initialize a context for the
+ * child.
+ */
+
+ child_ctx = kzalloc(sizeof(struct perf_event_context),
+ GFP_KERNEL);
+ if (!child_ctx) {
+ ret = -ENOMEM;
+ goto exit;
+ }
+
+ __perf_event_init_context(child_ctx, child);
+ child->perf_event_ctxp = child_ctx;
+ get_task_struct(child);
+ }
+
ret = inherit_group(event, parent, parent_ctx,
child, child_ctx);
if (ret) {
@@ -5170,6 +5176,7 @@ int perf_event_init_task(struct task_struct *child)
get_ctx(child_ctx->parent_ctx);
}

+exit:
mutex_unlock(&parent_ctx->mutex);

perf_unpin_context(parent_ctx);

2009-12-09 09:53:16

by Xiao Guangrong

Subject: [tip:perf/urgent] perf_event: Cleanup for cpu_clock_perf_event_update()

Commit-ID: ec89a06fd4e12301f11ab039ee07d2353a18addc
Gitweb: http://git.kernel.org/tip/ec89a06fd4e12301f11ab039ee07d2353a18addc
Author: Xiao Guangrong <[email protected]>
AuthorDate: Wed, 9 Dec 2009 11:30:36 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 9 Dec 2009 09:56:27 +0100

perf_event: Cleanup for cpu_clock_perf_event_update()

Use atomic64_xchg() instead of atomic64_read() followed by
atomic64_set().

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 77641ae..94e1b28 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4079,8 +4079,7 @@ static void cpu_clock_perf_event_update(struct perf_event *event)
u64 now;

now = cpu_clock(cpu);
- prev = atomic64_read(&event->hw.prev_count);
- atomic64_set(&event->hw.prev_count, now);
+ prev = atomic64_xchg(&event->hw.prev_count, now);
atomic64_add(now - prev, &event->count);
}