I was looking through kernel/trace/trace_event_profile.c and I saw
this code:
static int ftrace_profile_enable_event(struct ftrace_event_call *event)
{
        char *buf;
        int ret = -ENOMEM;

        if (atomic_inc_return(&event->profile_count))
                return 0;

        if (!total_profile_count++) {
                buf = (char *)alloc_percpu(profile_buf_t);
                if (!buf)
                        goto fail_buf;

                rcu_assign_pointer(trace_profile_buf, buf);

                buf = (char *)alloc_percpu(profile_buf_t);
                if (!buf)
                        goto fail_buf_nmi;

                rcu_assign_pointer(trace_profile_buf_nmi, buf);
        }

        ret = event->profile_enable();
        if (!ret)
                return 0;

        kfree(trace_profile_buf_nmi);
fail_buf_nmi:
        kfree(trace_profile_buf);
fail_buf:
        total_profile_count--;

...
So we only allocate trace_profile_buf and trace_profile_buf_nmi if
total_profile_count was zero on entry, but if we get an error returned
from event->profile_enable(), we free them both unconditionally,
regardless of the value of total_profile_count. That seems wrong. Is
there a subtle reason why that is the right thing to do?
(Also, is kfree the appropriate counterpart to alloc_percpu?)
Paul.
On Thu, 2009-10-01 at 14:50 +1000, Paul Mackerras wrote:
> So we only allocate trace_profile_buf and trace_profile_buf_nmi if
> total_profile_count was zero on entry, but if we get an error returned
> from event->profile_enable(), we free them both unconditionally,
> regardless of the value of total_profile_count. That seems wrong. Is
> there a subtle reason why that is the right thing to do?
>
> (Also, is kfree the appropriate counterpart to alloc_percpu?)
Hi Paul,
I think you have a valid point. Frederic and I are here in Dresden (last
day). I'll make sure he sees this.
Thanks,
-- Steve
2009/10/1 Paul Mackerras <[email protected]>:
> I was looking through kernel/trace/trace_event_profile.c and I saw
> this code:
>
> static int ftrace_profile_enable_event(struct ftrace_event_call *event)
> {
>         char *buf;
>         int ret = -ENOMEM;
>
>         if (atomic_inc_return(&event->profile_count))
>                 return 0;
>
>         if (!total_profile_count++) {
>                 buf = (char *)alloc_percpu(profile_buf_t);
>                 if (!buf)
>                         goto fail_buf;
>
>                 rcu_assign_pointer(trace_profile_buf, buf);
>
>                 buf = (char *)alloc_percpu(profile_buf_t);
>                 if (!buf)
>                         goto fail_buf_nmi;
>
>                 rcu_assign_pointer(trace_profile_buf_nmi, buf);
>         }
>
>         ret = event->profile_enable();
>         if (!ret)
>                 return 0;
>
>         kfree(trace_profile_buf_nmi);
> fail_buf_nmi:
>         kfree(trace_profile_buf);
> fail_buf:
>         total_profile_count--;
>
> ...
>
> So we only allocate trace_profile_buf and trace_profile_buf_nmi if
> total_profile_count was zero on entry, but if we get an error returned
> from event->profile_enable(), we free them both unconditionally,
> regardless of the value of total_profile_count. That seems wrong. Is
> there a subtle reason why that is the right thing to do?
Oh right. Good catch.
I'll fix that soon.
>
> (Also, is kfree the appropriate counterpart to alloc_percpu?)
No indeed.
That said, it looks harmless for now because free_percpu seems to just
wrap kfree. But its implementation may change later, so I'll fix that too.
Thanks.
Ingo,
Please pull these .33 tracing fixes from
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
tracing/core
These address reviews from Paul Mackerras.
I also plan to rename the whole "ftrace event profile" naming to
"ftrace event perf" soon.
Thanks.
Frederic Weisbecker (2):
tracing: Check total refcount before releasing bufs in profile_enable failure
tracing: Use free_percpu instead of kfree
kernel/trace/trace_event_profile.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
When we call the profile_enable() callback of an event and it fails, we
release the shared perf event tracing buffers unconditionally in the
failure path. This is wrong because there may be other users of these
buffers. So check the total refcount before releasing them.
Reported-by: Paul Mackerras <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Li Zefan <[email protected]>
---
kernel/trace/trace_event_profile.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/trace/trace_event_profile.c b/kernel/trace/trace_event_profile.c
index dd44b87..e52784b 100644
--- a/kernel/trace/trace_event_profile.c
+++ b/kernel/trace/trace_event_profile.c
@@ -31,7 +31,7 @@ static int ftrace_profile_enable_event(struct ftrace_event_call *event)
if (atomic_inc_return(&event->profile_count))
return 0;
- if (!total_profile_count++) {
+ if (!total_profile_count) {
buf = (char *)alloc_percpu(profile_buf_t);
if (!buf)
goto fail_buf;
@@ -46,14 +46,19 @@ static int ftrace_profile_enable_event(struct ftrace_event_call *event)
}
ret = event->profile_enable();
- if (!ret)
+ if (!ret) {
+ total_profile_count++;
return 0;
+ }
- kfree(trace_profile_buf_nmi);
fail_buf_nmi:
- kfree(trace_profile_buf);
+ if (!total_profile_count) {
+ kfree(trace_profile_buf_nmi);
+ kfree(trace_profile_buf);
+ trace_profile_buf_nmi = NULL;
+ trace_profile_buf = NULL;
+ }
fail_buf:
- total_profile_count--;
atomic_dec(&event->profile_count);
return ret;
--
1.6.2.3
In the event->profile_enable() failure path, we release the per-cpu
buffers using kfree, which is wrong because they are per-cpu pointers.
Although free_percpu only wraps kfree for now, that may change in the
future, so let's use the correct counterpart.
Reported-by: Paul Mackerras <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Li Zefan <[email protected]>
---
kernel/trace/trace_event_profile.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_event_profile.c b/kernel/trace/trace_event_profile.c
index e52784b..8d5c171 100644
--- a/kernel/trace/trace_event_profile.c
+++ b/kernel/trace/trace_event_profile.c
@@ -53,8 +53,8 @@ static int ftrace_profile_enable_event(struct ftrace_event_call *event)
fail_buf_nmi:
if (!total_profile_count) {
- kfree(trace_profile_buf_nmi);
- kfree(trace_profile_buf);
+ free_percpu(trace_profile_buf_nmi);
+ free_percpu(trace_profile_buf);
trace_profile_buf_nmi = NULL;
trace_profile_buf = NULL;
}
--
1.6.2.3
On Sat, Oct 03, 2009 at 03:47:28PM +0200, Frederic Weisbecker wrote:
> Ingo,
>
> Please pull these .33 tracing fixes from
Hmm, now I realize this code is upstream already.
I'd better rebase it against the latest Linus tree.
Sorry.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> tracing/core
>
> These address reviews from Paul mackerras.
>
> I also plan to rename the whole "ftrace event profile" naming to
> "ftrace event perf" soon.
>
> Thanks.
>
> Frederic Weisbecker (2):
> tracing: Check total refcount before releasing bufs in profile_enable failure
> tracing: Use free_percpu instead of kfree
>
> kernel/trace/trace_event_profile.c | 15 ++++++++++-----
> 1 files changed, 10 insertions(+), 5 deletions(-)
On Sat, Oct 03, 2009 at 03:47:28PM +0200, Frederic Weisbecker wrote:
> Ingo,
>
> Please pull these .33 tracing fixes from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> tracing/core
>
I've rebased it against latest Linus, you can pull it from
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
tracing/urgent
Thanks!
Frederic.
* Frederic Weisbecker <[email protected]> wrote:
> On Sat, Oct 03, 2009 at 03:47:28PM +0200, Frederic Weisbecker wrote:
> > Ingo,
> >
> > Please pull these .33 tracing fixes from
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> > tracing/core
> >
>
>
> I've rebased it against latest Linus, you can pull it from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> tracing/urgent
Pulled, thanks a lot Frederic!
Ingo
Frederic Weisbecker writes:
> When we call the profile_enable() callback of an event and it fails, we
> release the shared perf event tracing buffers unconditionally in the
> failure path. This is wrong because there may be other users of these
> buffers. So check the total refcount before releasing them.
[snip]
> - kfree(trace_profile_buf_nmi);
> fail_buf_nmi:
> - kfree(trace_profile_buf);
> + if (!total_profile_count) {
A small problem here: total_profile_count will be 1, not 0, in the
case where we need to free...
> + kfree(trace_profile_buf_nmi);
> + kfree(trace_profile_buf);
> + trace_profile_buf_nmi = NULL;
> + trace_profile_buf = NULL;
> + }
> fail_buf:
> - total_profile_count--;
since we don't decrement total_profile_count until here.
Paul.
On Wed, Oct 07, 2009 at 02:08:27PM +1100, Paul Mackerras wrote:
> Frederic Weisbecker writes:
>
> > When we call the profile_enable() callback of an event and it fails, we
> > release the shared perf event tracing buffers unconditionally in the
> > failure path. This is wrong because there may be other users of these
> > buffers. So check the total refcount before releasing them.
>
> [snip]
>
> > - kfree(trace_profile_buf_nmi);
> > fail_buf_nmi:
> > - kfree(trace_profile_buf);
> > + if (!total_profile_count) {
>
> A small problem here: total_profile_count will be 1, not 0, in the
> case where we need to free...
>
> > + kfree(trace_profile_buf_nmi);
> > + kfree(trace_profile_buf);
> > + trace_profile_buf_nmi = NULL;
> > + trace_profile_buf = NULL;
> > + }
> > fail_buf:
> > - total_profile_count--;
>
> since we don't decrement total_profile_count until here.
>
> Paul.
No, because now we only increment total_profile_count if
nothing has failed; it is the last thing done on the
success path. So if we fail and need to free, its value
is still 0.
Frederic Weisbecker writes:
> No, because now we only increment total_profile_count if
> nothing has failed; it is the last thing done on the
> success path. So if we fail and need to free, its value
> is still 0.
Ah yes, good point.
Paul.