2006-12-05 22:11:28

by Ingo Molnar

Subject: Re: [PATCH -rt 2/3] Make trace_freerunning work


* Karsten Wiese <[email protected]> wrote:

> Only reorder trace entries once if trace_freerunning is 1. Modify
> user_trace_stop() not to check report_latency(delta) then. Note that
> at least MAX_TRACE entries must have been generated between
> user_trace_start() and user_trace_stop() for a freerunning trace to be
> reliable.

my thinking behind the freerunning feature is this:

freerunning should behave the same way with regard to latency
measurement. I.e. report_latency() is still needed, and the kernel will
thus do a maximum search over all traces triggered via start/stop.

the difference is in the recording of the last-largest-latency:

- with !freerunning, the tracer stops recording after MAX_TRACE entries,
i.e. the "head" of the trace is preserved in the trace buffer.

- with freerunning, the tracer never stops, it 'wraps around' after
MAX_TRACE entries and starts overwriting the oldest entries. I.e. the
"tail" of the trace is preserved in the trace buffer.

depending on the situation, freerunning or !freerunning might be the
more useful mode.

but there should be no difference in measurement.
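
to make the two modes concrete, here's a minimal user-space sketch of the
recording policy (illustrative names only, not the actual latency_trace.c
code):

#define MAX_TRACE 8	/* tiny buffer, just for illustration */

struct entry { unsigned long data; };

struct trace {
	struct entry buf[MAX_TRACE];
	unsigned long idx;	/* next slot to write */
	int freerunning;
	int wrapped;		/* freerunning buffer has wrapped at least once */
	unsigned long lost;	/* entries dropped or overwritten */
};

/* record one entry under the two policies described above */
void record(struct trace *tr, unsigned long data)
{
	if (tr->idx >= MAX_TRACE) {
		if (!tr->freerunning) {
			/* !freerunning: buffer full, stop recording; the
			 * first MAX_TRACE entries (the "head") are kept */
			tr->lost++;
			return;
		}
		/* freerunning: wrap around; from now on every write
		 * overwrites the oldest surviving entry, so the last
		 * MAX_TRACE entries (the "tail") are kept */
		tr->idx = 0;
		tr->wrapped = 1;
	}
	if (tr->wrapped)
		tr->lost++;	/* this write overwrites an older entry */
	tr->buf[tr->idx++].data = data;
}

(a wrapped buffer then has its oldest entry at idx, so readout has to
reorder starting there.)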

could you try to rework this patch with the above functionality in mind?
(or if you'd like to see new functionality, what would that be - and we
could/should implement that separately from the existing semantics of
freerunning and !freerunning)

Ingo


2006-12-06 15:07:57

by Karsten Wiese

Subject: [PATCH -rt 0/3] Make trace_freerunning work; Take 2

On Tuesday, 5 December 2006 23:10, Ingo Molnar wrote:
>
> freerunning should behave the same way with regard to latency
> measurement. I.e. report_latency() is still needed, and the kernel will
> thus do a maximum search over all traces triggered via start/stop.
>
> the difference is in the recording of the last-largest-latency:
>
> - with !freerunning, the tracer stops recording after MAX_TRACE entries,
> i.e. the "head" of the trace is preserved in the trace buffer.
>
> - with freerunning, the tracer never stops, it 'wraps around' after
> MAX_TRACE entries and starts overwriting the oldest entries. I.e. the
> "tail" of the trace is preserved in the trace buffer.
>
> depending on the situation, freerunning or !freerunning might be the
> more useful mode.
>
> but there should be no difference in measurement.

Following 3 patches try to implement the above.

Tested only on a UP machine, after this incantation:
echo 0 > /proc/sys/kernel/wakeup_timing
echo 1 > /proc/sys/kernel/trace_enabled
echo 1 > /proc/sys/kernel/trace_user_triggered

and for half of the tests:
echo 1 > /proc/sys/kernel/trace_freerunning
or
echo 0 > /proc/sys/kernel/trace_freerunning
.

Karsten

2006-12-06 15:12:26

by Karsten Wiese

Subject: [PATCH -rt 1/3] Make trace_freerunning work; Take 2: Off by 1 tweaks


Needed to make the last trace entry appear when trace_freerunning is 1.
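
To illustrate the off-by-one the old clamp caused, a toy example (buffer
size and values are made up for the demo, not the real ones):

#include <stdio.h>

#define MAX_TRACE 8UL	/* stand-in for the real 131072UL */
#define MIN(a, b)	((a) < (b) ? (a) : (b))

int main(void)
{
	unsigned long trace_idx = MAX_TRACE;	/* buffer filled completely */

	/* old clamp: one entry short whenever the buffer is full/wrapped */
	printf("old: %lu entries visible\n", MIN(trace_idx, MAX_TRACE - 1));
	/* new clamp: every recorded entry is visible */
	printf("new: %lu entries visible\n", MIN(trace_idx, MAX_TRACE));
	return 0;
}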

Signed-off-by: Karsten Wiese <[email protected]>


--- rt6/kernel/latency_trace.c 2006-12-06 00:36:49.000000000 +0100
+++ rt6-kw/kernel/latency_trace.c 2006-12-06 14:43:52.000000000 +0100
@@ -181,7 +181,7 @@ static int report_latency(cycle_t delta)
/*
* Number of per-CPU trace entries:
*/
-#define MAX_TRACE (131072UL-1)
+#define MAX_TRACE 131072UL

#define CMDLINE_BYTES 16

@@ -609,7 +609,7 @@ again:
if (unlikely((trace_freerunning || print_functions) &&
(idx_next >= MAX_TRACE)))
idx_next = 0;
- if (unlikely(idx_next >= MAX_TRACE)) {
+ if (unlikely(idx >= MAX_TRACE)) {
atomic_inc(&tr->overrun);
goto out;
}
@@ -899,7 +899,7 @@ static void construct_pid_to_cmdline(str
if (!tr->trace)
return;

- entries = min(tr->trace_idx, MAX_TRACE-1);
+ entries = min(tr->trace_idx, MAX_TRACE);

for (i = 0; i < entries; i++) {
struct trace_entry *entry = tr->trace + i;
@@ -951,12 +951,12 @@ static void copy_trace(struct cpu_trace
idx = (idx0 + i) % MAX_TRACE;
save->trace[i] = tr->trace[idx];
}
- save->trace_idx = MAX_TRACE-1;
+ save->trace_idx = MAX_TRACE;
} else {
save->trace_idx = tr->trace_idx;

memcpy(save->trace, tr->trace,
- min(save->trace_idx + 1, MAX_TRACE-1) *
+ min(save->trace_idx, MAX_TRACE) *
sizeof(struct trace_entry));
}
save->overrun = tr->overrun;
@@ -979,7 +979,7 @@ static int min_idx(struct block_idx *bid

for_each_online_cpu(cpu) {
idx = bidx->idx[cpu];
- if (idx >= min(max_tr.traces[cpu].trace_idx, MAX_TRACE-1))
+ if (idx >= min(max_tr.traces[cpu].trace_idx, MAX_TRACE))
continue;
if (idx >= MAX_TRACE*NR_CPUS) {
printk("huh: idx (%d) > %ld*%d!\n", idx, MAX_TRACE, NR_CPUS);
@@ -1036,7 +1036,7 @@ static void update_out_trace(void)
out_entry = tmp_out->trace + 0;

if (!trace_all_cpus) {
- entries = min(tmp_out->trace_idx, MAX_TRACE-1);
+ entries = min(tmp_out->trace_idx, MAX_TRACE);
if (!entries)
return;
out_tr.first_timestamp = tmp_out->trace[0].timestamp;
@@ -1059,7 +1059,7 @@ static void update_out_trace(void)
* Save the timestamp range:
*/
tmp_max = max_tr.traces + max_tr.cpu;
- entries = min(tmp_max->trace_idx, MAX_TRACE-1);
+ entries = min(tmp_max->trace_idx, MAX_TRACE);
/*
* No saved trace yet?
*/
@@ -1075,7 +1075,7 @@ static void update_out_trace(void)

for_each_online_cpu(cpu) {
tmp_max = max_tr.traces + cpu;
- entries = min(tmp_max->trace_idx, MAX_TRACE-1);
+ entries = min(tmp_max->trace_idx, MAX_TRACE);
printk("CPU%d: %016Lx (%016Lx) ... #%d (%016Lx) %016Lx\n", cpu,
tmp_max->trace[0].timestamp,
tmp_max->trace[1].timestamp,
@@ -1084,7 +1084,7 @@ static void update_out_trace(void)
tmp_max->trace[entries-1].timestamp);
}
tmp_max = max_tr.traces + max_tr.cpu;
- entries = min(tmp_max->trace_idx, MAX_TRACE-1);
+ entries = min(tmp_max->trace_idx, MAX_TRACE);

printk("CPU%d entries: %d\n", max_tr.cpu, entries);
printk("first stamp: %016Lx\n", first_stamp);
@@ -1179,7 +1179,7 @@ static void * notrace l_start(struct seq
#endif
construct_pid_to_cmdline(tr);
}
- entries = min(tr->trace_idx, MAX_TRACE-1);
+ entries = min(tr->trace_idx, MAX_TRACE);

if (!n) {
seq_printf(m, "preemption latency trace v1.1.5 on %s\n", UTS_RELEASE);
@@ -1241,7 +1241,7 @@ static void * notrace l_start(struct seq
static void * notrace l_next(struct seq_file *m, void *p, loff_t *pos)
{
struct cpu_trace *tr = out_tr.traces;
- unsigned long entries = min(tr->trace_idx, MAX_TRACE-1);
+ unsigned long entries = min(tr->trace_idx, MAX_TRACE);

WARN_ON(!tr->trace);

@@ -2431,8 +2431,7 @@ void notrace stop_trace(void)

EXPORT_SYMBOL(stop_trace);

-static void print_entry(struct trace_entry *entry, struct trace_entry *entry0,
- struct trace_entry *next_entry)
+static void print_entry(struct trace_entry *entry, struct trace_entry *entry0)
{
unsigned long abs_usecs;
int hardirq, softirq;
@@ -2499,7 +2498,7 @@ void print_last_trace(void)
{
unsigned int idx0, idx, i, cpu;
struct cpu_trace *tr;
- struct trace_entry *entry0, *entry, *next_entry;
+ struct trace_entry *entry0, *entry;

preempt_disable();
cpu = smp_processor_id();
@@ -2521,12 +2520,11 @@ void print_last_trace(void)
idx0 = tr->trace_idx;
printk("curr idx: %d\n", idx0);
if (idx0 >= MAX_TRACE)
- idx0 = MAX_TRACE-1;
+ idx0 = 0;
idx = idx0;
entry0 = tr->trace + idx0;

for (i = 0; i < MAX_TRACE; i++) {
- next_entry = tr->trace + idx;
if (idx == 0)
idx = MAX_TRACE-1;
else
@@ -2536,7 +2534,7 @@ void print_last_trace(void)
case TRACE_FN:
case TRACE_SPECIAL:
case TRACE_SPECIAL_U64:
- print_entry(entry, entry0, next_entry);
+ print_entry(entry, entry0);
break;
}
}

2006-12-06 15:18:31

by Karsten Wiese

Subject: [PATCH -rt 2/3] Make trace_freerunning work; Take 2: Add atomic_t underrun


Add an atomic_t underrun to struct cpu_trace.
Increment it only when trace_freerunning is set and an older trace
entry is overwritten.
Modify copy_trace() to reorder entries if underrun != 0.
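
For readers without the full latency_trace.c at hand, a simplified
user-space sketch of what the patched copy_trace() is meant to do
(per-CPU handling, locking and the overrun counter omitted; names
shortened):

#include <string.h>

#define MAX_TRACE 131072UL	/* as in the patched latency_trace.c */
#define MIN(a, b)	((a) < (b) ? (a) : (b))

struct entry { unsigned long long timestamp; };

struct trace {
	struct entry buf[MAX_TRACE];
	unsigned long idx;	/* next slot to write */
	unsigned long underrun;	/* entries overwritten after a wrap */
};

/*
 * Copy a trace out in chronological order.  If the buffer has wrapped
 * (underrun != 0), the oldest surviving entry sits at idx, so start
 * the copy there; otherwise a plain memcpy of the first idx entries
 * is enough.
 */
void copy_trace(struct trace *dst, const struct trace *src, int reorder)
{
	if (reorder && src->underrun) {
		unsigned long i, start = src->idx;

		for (i = 0; i < MAX_TRACE; i++)
			dst->buf[i] = src->buf[(start + i) % MAX_TRACE];
		dst->idx = MAX_TRACE;
	} else {
		dst->idx = src->idx;
		memcpy(dst->buf, src->buf,
		       MIN(dst->idx, MAX_TRACE) * sizeof(struct entry));
	}
	dst->underrun = src->underrun;
}

Starting the copy at idx is what turns a wrapped (freerunning) buffer back
into a chronological trace; with underrun == 0 the buffer never wrapped and
no reordering is needed.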

Signed-off-by: Karsten Wiese <[email protected]>


--- rt6-kw/kernel/latency_trace-tk2.1.c 2006-12-06 14:43:52.000000000 +0100
+++ rt6-kw/kernel/latency_trace.c 2006-12-06 14:58:44.000000000 +0100
@@ -228,6 +228,7 @@ struct cpu_trace {
cycle_t preempt_timestamp;
unsigned long critical_start, critical_end;
unsigned long critical_sequence;
+ atomic_t underrun;
atomic_t overrun;
int early_warning;
int latency_type;
@@ -606,16 +607,21 @@ again:
idx_next = idx + 1;
timestamp = now();

- if (unlikely((trace_freerunning || print_functions) &&
- (idx_next >= MAX_TRACE)))
+ if (unlikely((trace_freerunning || print_functions || atomic_read(&tr->underrun)) &&
+ (idx_next >= MAX_TRACE) && !atomic_read(&tr->overrun))) {
+ atomic_inc(&tr->underrun);
idx_next = 0;
+ }
if (unlikely(idx >= MAX_TRACE)) {
atomic_inc(&tr->overrun);
goto out;
}
#ifdef __HAVE_ARCH_CMPXCHG
- if (unlikely(cmpxchg(&tr->trace_idx, idx, idx_next) != idx))
+ if (unlikely(cmpxchg(&tr->trace_idx, idx, idx_next) != idx)) {
+ if (idx_next == 0)
+ atomic_dec(&tr->underrun);
goto again;
+ }
#else
# ifdef CONFIG_SMP
# error CMPXCHG missing
@@ -626,6 +632,9 @@ again:
tr->trace_idx = idx_next;
# endif
#endif
+ if (unlikely(idx_next != 0 && atomic_read(&tr->underrun)))
+ atomic_inc(&tr->underrun);
+
pc = preempt_count();

if (unlikely(!tr->trace))
@@ -938,13 +947,12 @@ char *pid_to_cmdline(unsigned long pid)
return cmdline;
}

-static void copy_trace(struct cpu_trace *save, struct cpu_trace *tr)
+static void copy_trace(struct cpu_trace *save, struct cpu_trace *tr, int reorder)
{
if (!save->trace || !tr->trace)
return;
/* free-running needs reordering */
- /* FIXME: what if we just switched back from freerunning mode? */
- if (trace_freerunning) {
+ if (reorder && atomic_read(&tr->underrun)) {
int i, idx, idx0 = tr->trace_idx;

for (i = 0; i < MAX_TRACE; i++) {
@@ -959,6 +967,7 @@ static void copy_trace(struct cpu_trace
min(save->trace_idx, MAX_TRACE) *
sizeof(struct trace_entry));
}
+ save->underrun = tr->underrun;
save->overrun = tr->overrun;
}

@@ -1010,7 +1019,7 @@ static void update_out_trace(void)
cycle_t stamp, first_stamp, last_stamp;
struct block_idx bidx = { { 0, }, };
struct cpu_trace *tmp_max, *tmp_out;
- int cpu, sum, entries, overrun_sum;
+ int cpu, sum, entries, underrun_sum, overrun_sum;

/*
* For out_tr we only have the first array's trace entries
@@ -1023,7 +1032,7 @@ static void update_out_trace(void)
* Easier to copy this way. Note: the trace buffer is private
* to the output buffer, so preserve it:
*/
- copy_trace(tmp_out, tmp_max);
+ copy_trace(tmp_out, tmp_max, 0);
tmp = tmp_out->trace;
*tmp_out = *tmp_max;
tmp_out->trace = tmp;
@@ -1134,12 +1143,15 @@ static void update_out_trace(void)
}

sum = 0;
+ underrun_sum = 0;
overrun_sum = 0;
for_each_online_cpu(cpu) {
sum += max_tr.traces[cpu].trace_idx;
+ underrun_sum += atomic_read(&max_tr.traces[cpu].underrun);
overrun_sum += atomic_read(&max_tr.traces[cpu].overrun);
}
tmp_out->trace_idx = sum;
+ atomic_set(&tmp_out->underrun, underrun_sum);
atomic_set(&tmp_out->overrun, overrun_sum);
}

@@ -1186,7 +1198,7 @@ static void * notrace l_start(struct seq
seq_puts(m, "--------------------------------------------------------------------\n");
seq_printf(m, " latency: %lu us, #%lu/%lu, CPU#%d | (M:%s VP:%d, KP:%d, SP:%d HP:%d",
cycles_to_usecs(tr->saved_latency),
- entries, entries + atomic_read(&tr->overrun),
+ entries, entries + atomic_read(&tr->underrun) + atomic_read(&tr->overrun),
out_tr.cpu,
#if defined(CONFIG_PREEMPT_NONE)
"server",
@@ -1629,11 +1641,11 @@ static void update_max_tr(struct cpu_tra

if (all_cpus) {
for_each_online_cpu(cpu) {
- copy_trace(max_tr.traces + cpu, cpu_traces + cpu);
+ copy_trace(max_tr.traces + cpu, cpu_traces + cpu, 1);
atomic_dec(&cpu_traces[cpu].disabled);
}
} else
- copy_trace(save, tr);
+ copy_trace(save, tr, 1);
}

#else /* !EVENT_TRACE */
@@ -1830,6 +1842,7 @@ __start_critical_timing(unsigned long ei
tr->critical_sequence = max_sequence;
tr->preempt_timestamp = get_monotonic_cycles();
tr->critical_start = eip;
+ atomic_set(&tr->underrun, 0);
atomic_set(&tr->overrun, 0);
reset_trace_idx(cpu, tr);
tr->latency_type = latency_type;
@@ -2208,6 +2221,7 @@ void __trace_start_sched_wakeup(struct t
tr->preempt_timestamp = get_monotonic_cycles();
tr->latency_type = WAKEUP_LATENCY;
tr->critical_start = CALLER_ADDR0;
+ atomic_set(&tr->underrun, 0);
atomic_set(&tr->overrun, 0);
_trace_cmdline(raw_smp_processor_id(), tr);
atomic_dec(&tr->disabled);
@@ -2318,6 +2332,7 @@ long user_trace_start(void)
tr->critical_sequence = max_sequence;
tr->preempt_timestamp = get_monotonic_cycles();
tr->critical_start = CALLER_ADDR0;
+ atomic_set(&tr->underrun, 0);
atomic_set(&tr->overrun, 0);
_trace_cmdline(cpu, tr);
mcount();

2006-12-06 15:26:18

by Karsten Wiese

Subject: [PATCH -rt 3/3] Make trace_freerunning work; Take 2: change reset_trace_idx()


Move atomic_set(&tr->underrun, 0) and atomic_set(&tr->overrun, 0)
occurrences into reset_trace_idx().
Note this leads to under/overrun being reset more often than before:
- in the trace_all_cpus case.
- from under check_critical_timing()

Signed-off-by: Karsten Wiese <[email protected]>


--- rt6-kw/kernel/latency_trace-tk2.2.c 2006-12-06 14:58:44.000000000 +0100
+++ rt6-kw/kernel/latency_trace.c 2006-12-06 15:37:08.000000000 +0100
@@ -1695,10 +1695,17 @@ __setup("preempt_thresh=", setup_preempt
static inline void notrace reset_trace_idx(int cpu, struct cpu_trace *tr)
{
if (trace_all_cpus)
- for_each_online_cpu(cpu)
- cpu_traces[cpu].trace_idx = 0;
- else
+ for_each_online_cpu(cpu) {
+ tr = cpu_traces + cpu;
+ tr->trace_idx = 0;
+ atomic_set(&tr->underrun, 0);
+ atomic_set(&tr->overrun, 0);
+ }
+ else{
tr->trace_idx = 0;
+ atomic_set(&tr->underrun, 0);
+ atomic_set(&tr->overrun, 0);
+ }
}

#ifdef CONFIG_CRITICAL_TIMING
@@ -1842,8 +1849,6 @@ __start_critical_timing(unsigned long ei
tr->critical_sequence = max_sequence;
tr->preempt_timestamp = get_monotonic_cycles();
tr->critical_start = eip;
- atomic_set(&tr->underrun, 0);
- atomic_set(&tr->overrun, 0);
reset_trace_idx(cpu, tr);
tr->latency_type = latency_type;
_trace_cmdline(cpu, tr);
@@ -2221,8 +2226,6 @@ void __trace_start_sched_wakeup(struct t
tr->preempt_timestamp = get_monotonic_cycles();
tr->latency_type = WAKEUP_LATENCY;
tr->critical_start = CALLER_ADDR0;
- atomic_set(&tr->underrun, 0);
- atomic_set(&tr->overrun, 0);
_trace_cmdline(raw_smp_processor_id(), tr);
atomic_dec(&tr->disabled);
// }
@@ -2332,8 +2335,6 @@ long user_trace_start(void)
tr->critical_sequence = max_sequence;
tr->preempt_timestamp = get_monotonic_cycles();
tr->critical_start = CALLER_ADDR0;
- atomic_set(&tr->underrun, 0);
- atomic_set(&tr->overrun, 0);
_trace_cmdline(cpu, tr);
mcount();

2006-12-06 15:55:46

by Ingo Molnar

Subject: Re: [PATCH -rt 0/3] Make trace_freerunning work; Take 2


* Karsten Wiese <[email protected]> wrote:

> Following 3 patches try to implement the above.
>
> Tested only on a UP machine, after this incantation:
> echo 0 > /proc/sys/kernel/wakeup_timing
> echo 1 > /proc/sys/kernel/trace_enabled
> echo 1 > /proc/sys/kernel/trace_user_triggered
>
> and for half of the tests:
> echo 1 > /proc/sys/kernel/trace_freerunning
> or
> echo 0 > /proc/sys/kernel/trace_freerunning
> .

thanks - nice work! I have applied all 3 of them and it worked fine for
a couple of simple tests on my box too.

Ingo