2012-06-22 19:36:58

by Dave Jones

Subject: big allocation failure during tracing init.

I just hit this during a low-memory test:

trinity-child6: page allocation failure: order:4, mode:0xc0d0
Pid: 25842, comm: trinity-child6 Not tainted 3.5.0-rc3+ #95
Call Trace:
[<ffffffff81147fab>] warn_alloc_failed+0xeb/0x150
[<ffffffff8114c2d6>] __alloc_pages_nodemask+0x836/0xab0
[<ffffffff8118a630>] alloc_pages_current+0xb0/0x120
[<ffffffff8108e377>] ? local_clock+0x47/0x60
[<ffffffff811471d4>] __get_free_pages+0x14/0x50
[<ffffffff811963c9>] kmalloc_order_trace+0x39/0x190
[<ffffffff810b3108>] ? trace_hardirqs_off_caller+0x28/0xc0
[<ffffffff81196cbe>] __kmalloc+0x23e/0x290
[<ffffffff811d270c>] __seq_open_private+0x2c/0x80
[<ffffffff81114d8c>] tracing_open+0x8c/0x340
[<ffffffff811aadbf>] do_dentry_open+0x25f/0x310
[<ffffffff81114d00>] ? tracing_iter_reset+0x90/0x90
[<ffffffff811ac2fe>] nameidata_to_filp+0x7e/0xe0
[<ffffffff811bc427>] do_last+0x4a7/0x9f0
[<ffffffff811be2ac>] path_openat+0xdc/0x410
[<ffffffff8108e377>] ? local_clock+0x47/0x60
[<ffffffff811cd231>] ? alloc_fd+0x171/0x220
[<ffffffff811cd231>] ? alloc_fd+0x171/0x220
[<ffffffff811be701>] do_filp_open+0x41/0xa0
[<ffffffff816501e5>] ? _raw_spin_unlock+0x35/0x60
[<ffffffff811cd231>] ? alloc_fd+0x171/0x220
[<ffffffff811ac44d>] do_sys_open+0xed/0x1c0
[<ffffffff811ac564>] sys_openat+0x14/0x20
[<ffffffff81657b96>] system_call_fastpath+0x1a/0x1f

I guess the problem is that on kernels built with a large CONFIG_NR_CPUS, struct trace_iterator
can be pretty large due to this array embedded in it:

struct ring_buffer_iter *buffer_iter[NR_CPUS];

Any thoughts on how to change that to a smaller dynamic data structure,
without adding too much overhead to tracing? (I'm assuming a linked-list walk,
for example, would be excessive?)

Dave


2012-06-28 00:52:46

by Steven Rostedt

Subject: Re: big allocation failure during tracing init.

On Fri, 2012-06-22 at 15:36 -0400, Dave Jones wrote:

> I guess the problem is that on kernels built with a large CONFIG_NR_CPUS, struct trace_iterator
> can be pretty large due to this array embedded in it:
>
> struct ring_buffer_iter *buffer_iter[NR_CPUS];
>
> Any thoughts on how to change that to a smaller dynamic data structure,
> without adding too much overhead to tracing? (I'm assuming a linked-list walk,
> for example, would be excessive?)
>

Does this patch fix it for you?

-- Steve

commit 0339abcabbef5f1fc57a8af046861a04e0a0ef9f
Author: Steven Rostedt <[email protected]>
Date: Wed Jun 27 20:46:14 2012 -0400

tracing: Remove NR_CPUS array from trace_iterator

Replace the NR_CPUS array of buffer_iter in the trace_iterator
with an allocated array. This creates an array sized for the
possible CPUs instead of the compile-time maximum.

Added a helper function called 'trace_buffer_iter()' that returns
the buffer_iter item or NULL if it is not defined or the array was
not allocated. Some routines do not require the array
(tracing_open_pipe() for one).

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 1aff183..af961d6 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -65,7 +65,7 @@ struct trace_iterator {
void *private;
int cpu_file;
struct mutex mutex;
- struct ring_buffer_iter *buffer_iter[NR_CPUS];
+ struct ring_buffer_iter **buffer_iter;
unsigned long iter_flags;

/* trace_seq for __print_flags() and __print_symbolic() etc. */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a8d5654..85073f9 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1710,9 +1710,11 @@ EXPORT_SYMBOL_GPL(trace_vprintk);

static void trace_iterator_increment(struct trace_iterator *iter)
{
+ struct ring_buffer_iter *buf_iter = trace_buffer_iter(iter, iter->cpu);
+
iter->idx++;
- if (iter->buffer_iter[iter->cpu])
- ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
+ if (buf_iter)
+ ring_buffer_read(buf_iter, NULL);
}

static struct trace_entry *
@@ -1720,7 +1722,7 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts,
unsigned long *lost_events)
{
struct ring_buffer_event *event;
- struct ring_buffer_iter *buf_iter = iter->buffer_iter[cpu];
+ struct ring_buffer_iter *buf_iter = trace_buffer_iter(iter, cpu);

if (buf_iter)
event = ring_buffer_iter_peek(buf_iter, ts);
@@ -1858,10 +1860,10 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu)

tr->data[cpu]->skipped_entries = 0;

- if (!iter->buffer_iter[cpu])
+ buf_iter = trace_buffer_iter(iter, cpu);
+ if (!buf_iter)
return;

- buf_iter = iter->buffer_iter[cpu];
ring_buffer_iter_reset(buf_iter);

/*
@@ -2207,13 +2209,15 @@ static enum print_line_t print_bin_fmt(struct trace_iterator *iter)

int trace_empty(struct trace_iterator *iter)
{
+ struct ring_buffer_iter *buf_iter;
int cpu;

/* If we are looking at one CPU buffer, only check that one */
if (iter->cpu_file != TRACE_PIPE_ALL_CPU) {
cpu = iter->cpu_file;
- if (iter->buffer_iter[cpu]) {
- if (!ring_buffer_iter_empty(iter->buffer_iter[cpu]))
+ buf_iter = trace_buffer_iter(iter, cpu);
+ if (buf_iter) {
+ if (!ring_buffer_iter_empty(buf_iter))
return 0;
} else {
if (!ring_buffer_empty_cpu(iter->tr->buffer, cpu))
@@ -2223,8 +2227,9 @@ int trace_empty(struct trace_iterator *iter)
}

for_each_tracing_cpu(cpu) {
- if (iter->buffer_iter[cpu]) {
- if (!ring_buffer_iter_empty(iter->buffer_iter[cpu]))
+ buf_iter = trace_buffer_iter(iter, cpu);
+ if (buf_iter) {
+ if (!ring_buffer_iter_empty(buf_iter))
return 0;
} else {
if (!ring_buffer_empty_cpu(iter->tr->buffer, cpu))
@@ -2383,6 +2388,8 @@ __tracing_open(struct inode *inode, struct file *file)
if (!iter)
return ERR_PTR(-ENOMEM);

+ iter->buffer_iter = kzalloc(sizeof(*iter->buffer_iter) * num_possible_cpus(),
+ GFP_KERNEL);
/*
* We make a copy of the current tracer to avoid concurrent
* changes on it while we are reading.
@@ -2443,6 +2450,7 @@ __tracing_open(struct inode *inode, struct file *file)
fail:
mutex_unlock(&trace_types_lock);
kfree(iter->trace);
+ kfree(iter->buffer_iter);
seq_release_private(inode, file);
return ERR_PTR(-ENOMEM);
}
@@ -2483,6 +2491,7 @@ static int tracing_release(struct inode *inode, struct file *file)
mutex_destroy(&iter->mutex);
free_cpumask_var(iter->started);
kfree(iter->trace);
+ kfree(iter->buffer_iter);
seq_release_private(inode, file);
return 0;
}
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5aec220..55e1f7f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -317,6 +317,14 @@ struct tracer {

#define TRACE_PIPE_ALL_CPU -1

+static inline struct ring_buffer_iter *
+trace_buffer_iter(struct trace_iterator *iter, int cpu)
+{
+ if (iter->buffer_iter && iter->buffer_iter[cpu])
+ return iter->buffer_iter[cpu];
+ return NULL;
+}
+
int tracer_init(struct tracer *t, struct trace_array *tr);
int tracing_is_enabled(void);
void trace_wake_up(void);
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index a7d2a4c..ce27c8b 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -538,7 +538,7 @@ get_return_for_leaf(struct trace_iterator *iter,
next = &data->ret;
} else {

- ring_iter = iter->buffer_iter[iter->cpu];
+ ring_iter = trace_buffer_iter(iter, iter->cpu);

/* First peek to compare current entry and the next one */
if (ring_iter)

2012-06-28 02:22:26

by Dave Jones

Subject: Re: big allocation failure during tracing init.

On Wed, Jun 27, 2012 at 08:52:43PM -0400, Steven Rostedt wrote:
> On Fri, 2012-06-22 at 15:36 -0400, Dave Jones wrote:
>
> > I guess the problem is that on kernels built with a large CONFIG_NR_CPUS, struct trace_iterator
> > can be pretty large due to this array embedded in it:
> >
> > struct ring_buffer_iter *buffer_iter[NR_CPUS];
> >
> > Any thoughts on how to change that to a smaller dynamic data structure,
> > without adding too much overhead to tracing? (I'm assuming a linked-list walk,
> > for example, would be excessive?)
> >
>
> Does this patch fix it for you?

I'm not sure what I did to trigger the allocation failure last time,
but I'll run with this for a while, and see if anything falls out.

thanks!

Dave

2012-06-28 02:58:33

by Steven Rostedt

Subject: Re: big allocation failure during tracing init.

On Wed, 2012-06-27 at 22:22 -0400, Dave Jones wrote:
> On Wed, Jun 27, 2012 at 08:52:43PM -0400, Steven Rostedt wrote:
> > On Fri, 2012-06-22 at 15:36 -0400, Dave Jones wrote:
> >
> > > I guess the problem is that on kernels built with a large CONFIG_NR_CPUS, struct trace_iterator
> > > can be pretty large due to this array embedded in it:
> > >
> > > struct ring_buffer_iter *buffer_iter[NR_CPUS];
> > >
> > > Any thoughts on how to change that to a smaller dynamic data structure,
> > > without adding too much overhead to tracing? (I'm assuming a linked-list walk,
> > > for example, would be excessive?)
> > >
> >
> > Does this patch fix it for you?
>
> I'm not sure what I did to trigger the allocation failure last time,
> but I'll run with this for a while, and see if anything falls out.

You probably need to do some tracing to trigger it, as it requires
opening the trace files.

-- Steve