Subject: [PATCH 0/26] oprofile: Performance counter multiplexing

This patch series introduces performance counter multiplexing for
oprofile.

The number of hardware counters is limited. The multiplexing feature
enables OProfile to gather more events than the hardware provides
counters for. This is realized by switching between events at a
user-specified time interval.
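
To illustrate the rotation, below is a minimal user-space sketch of
the round-robin index arithmetic the series introduces. The constants
match the AMD model, but the counts[] array is invented for the
example, switch_index is a per-CPU variable in the kernel, and the
wrap test is slightly simplified (see nmi_cpu_switch() in the diff):

#include <stdio.h>

#define NUM_COUNTERS		4	/* hardware counters on AMD */
#define NUM_VIRT_COUNTERS	32	/* virtual counters seen by user space */

static int switch_index;		/* per-CPU variable in the real code */

int main(void)
{
	/* assume the user enabled 6 events; the remaining counts stay 0 */
	unsigned long counts[NUM_VIRT_COUNTERS] = { 1, 1, 1, 1, 1, 1 };
	int round;

	for (round = 0; round < 4; round++) {
		int next;

		/* op_x86_phys_to_virt(phys) is switch_index + phys */
		printf("slice %d: virtual counters %d..%d on physical 0..%d\n",
		       round, switch_index,
		       switch_index + NUM_COUNTERS - 1, NUM_COUNTERS - 1);

		/* advance to the next set; wrap when past the last set
		 * or when the next set has no configured event */
		next = switch_index + NUM_COUNTERS;
		if (next >= NUM_VIRT_COUNTERS || counts[next] == 0)
			switch_index = 0;
		else
			switch_index = next;
	}
	return 0;
}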

A new file (/dev/oprofile/time_slice) is added for the user to specify
the switch interval in ms. If the number of events to profile exceeds
the number of available hardware counters, the patch schedules a
delayed work item that periodically saves the current counter values,
reprograms the hardware for the next set of events, and restores that
set's saved values. The switching mechanism needs to be implemented
for each architecture that is to support multiplexing. This patch
series only implements AMD CPU support, but multiplexing can easily be
extended to other models and architectures.
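
For example, with multiplexing compiled in, 'echo 5 >
/dev/oprofile/time_slice' selects a 5 ms switch interval. The value is
converted to jiffies internally (see oprofile_set_timeout() below),
the default is 1 ms, and the interval can only be changed while
profiling is stopped.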

The patch set includes the initial patch from Jason Yeh and many code
improvements and reworks on top of it. It evolved over time and
documents its development. For review, the enclosed overall diff might
therefore also be useful.

The patches can be pulled from:

git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git mux

-Robert


The following changes since commit 8045a4c293d36c61656a20d581b11f7f0cd7acd5:
Robert Richter (1):
x86/oprofile: Fix cast of counter value

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git mux

Jason Yeh (1):
oprofile: Implement performance counter multiplexing

Robert Richter (25):
x86/oprofile: Rework and simplify nmi_cpu_setup()
x86/oprofile: Whitespaces changes only
x86/oprofile: Fix usage of NUM_CONTROLS/NUM_COUNTERS macros
x86/oprofile: Use per_cpu() instead of __get_cpu_var()
x86/oprofile: Fix initialization of switch_index
oprofile: oprofile_set_timeout(), return with error for invalid args
oprofile: Rename variable timeout_jiffies and move to oprofile_files.c
oprofile: Remove oprofile_multiplexing_init()
oprofile: Grouping multiplexing code in oprof.c
oprofile: Introduce op_x86_phys_to_virt()
oprofile: Grouping multiplexing code in op_model_amd.c
x86/oprofile: Implement multiplexing setup/shutdown functions
x86/oprofile: Moving nmi_setup_cpu_mux() in nmi_int.c
x86/oprofile: Moving nmi_cpu_save/restore_mpx_registers() in nmi_int.c
x86/oprofile: Moving nmi_cpu_switch() in nmi_int.c
x86/oprofile: Remove const qualifier from struct op_x86_model_spec
x86/oprofile: Remove unused num_virt_controls from struct op_x86_model_spec
x86/oprofile: Modify initialization of num_virt_counters
x86/oprofile: Add function has_mux() to check multiplexing support
x86/oprofile: Enable multiplexing only if the model supports it
x86/oprofile: Implement mux_clone()
oprofile: Adding switch counter to oprofile statistic variables
x86/oprofile: Implement op_x86_virt_to_phys()
x86/oprofile: Add counter reservation check for virtual counters
x86/oprofile: Small coding style fixes

arch/Kconfig | 12 ++
arch/x86/oprofile/nmi_int.c | 307 +++++++++++++++++++++++++++++--------
arch/x86/oprofile/op_counter.h | 2 +-
arch/x86/oprofile/op_model_amd.c | 127 ++++++++++++----
arch/x86/oprofile/op_model_p4.c | 12 +-
arch/x86/oprofile/op_model_ppro.c | 10 +-
arch/x86/oprofile/op_x86_model.h | 16 ++-
drivers/oprofile/oprof.c | 71 +++++++++-
drivers/oprofile/oprof.h | 3 +
drivers/oprofile/oprofile_files.c | 46 ++++++
drivers/oprofile/oprofile_stats.c | 5 +
drivers/oprofile/oprofile_stats.h | 1 +
include/linux/oprofile.h | 3 +
13 files changed, 501 insertions(+), 114 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 99193b1..beea3cc 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -30,6 +30,18 @@ config OPROFILE_IBS

If unsure, say N.

+config OPROFILE_EVENT_MULTIPLEX
+ bool "OProfile multiplexing support (EXPERIMENTAL)"
+ default n
+ depends on OPROFILE && X86
+ help
+ The number of hardware counters is limited. The multiplexing
+ feature enables OProfile to gather more events than counters
+ are provided by the hardware. This is realized by switching
+ between events at a user-specified time interval.
+
+ If unsure, say N.
+
config HAVE_OPROFILE
bool

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 93df76d..cb88b1a 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -1,11 +1,14 @@
/**
* @file nmi_int.c
*
- * @remark Copyright 2002-2008 OProfile authors
+ * @remark Copyright 2002-2009 OProfile authors
* @remark Read the file COPYING
*
* @author John Levon <[email protected]>
* @author Robert Richter <[email protected]>
+ * @author Barry Kasindorf <[email protected]>
+ * @author Jason Yeh <[email protected]>
+ * @author Suravee Suthikulpanit <[email protected]>
*/

#include <linux/init.h>
@@ -24,13 +27,15 @@
#include "op_counter.h"
#include "op_x86_model.h"

-static struct op_x86_model_spec const *model;
+static struct op_x86_model_spec *model;
static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
static DEFINE_PER_CPU(unsigned long, saved_lvtpc);

/* 0 == registered but off, 1 == registered and on */
static int nmi_enabled = 0;

+struct op_counter_config counter_config[OP_MAX_COUNTER];
+
/* common functions */

u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
@@ -87,13 +92,199 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs)
}
}

-static void nmi_save_registers(void *dummy)
+static void nmi_cpu_start(void *dummy)
+{
+ struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+ model->start(msrs);
+}
+
+static int nmi_start(void)
+{
+ on_each_cpu(nmi_cpu_start, NULL, 1);
+ return 0;
+}
+
+static void nmi_cpu_stop(void *dummy)
+{
+ struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+ model->stop(msrs);
+}
+
+static void nmi_stop(void)
+{
+ on_each_cpu(nmi_cpu_stop, NULL, 1);
+}
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static DEFINE_PER_CPU(int, switch_index);
+
+static inline int has_mux(void)
+{
+ return !!model->switch_ctrl;
+}
+
+inline int op_x86_phys_to_virt(int phys)
+{
+ return __get_cpu_var(switch_index) + phys;
+}
+
+inline int op_x86_virt_to_phys(int virt)
+{
+ return virt % model->num_counters;
+}
+
+static void nmi_shutdown_mux(void)
+{
+ int i;
+
+ if (!has_mux())
+ return;
+
+ for_each_possible_cpu(i) {
+ kfree(per_cpu(cpu_msrs, i).multiplex);
+ per_cpu(cpu_msrs, i).multiplex = NULL;
+ per_cpu(switch_index, i) = 0;
+ }
+}
+
+static int nmi_setup_mux(void)
+{
+ size_t multiplex_size =
+ sizeof(struct op_msr) * model->num_virt_counters;
+ int i;
+
+ if (!has_mux())
+ return 1;
+
+ for_each_possible_cpu(i) {
+ per_cpu(cpu_msrs, i).multiplex =
+ kmalloc(multiplex_size, GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).multiplex)
+ return 0;
+ }
+
+ return 1;
+}
+
+static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
+{
+ int i;
+ struct op_msr *multiplex = msrs->multiplex;
+
+ if (!has_mux())
+ return;
+
+ for (i = 0; i < model->num_virt_counters; ++i) {
+ if (counter_config[i].enabled) {
+ multiplex[i].saved = -(u64)counter_config[i].count;
+ } else {
+ multiplex[i].addr = 0;
+ multiplex[i].saved = 0;
+ }
+ }
+
+ per_cpu(switch_index, cpu) = 0;
+}
+
+static void nmi_cpu_save_mpx_registers(struct op_msrs *msrs)
+{
+ struct op_msr *multiplex = msrs->multiplex;
+ int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ rdmsrl(multiplex[virt].addr, multiplex[virt].saved);
+ }
+}
+
+static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
+{
+ struct op_msr *multiplex = msrs->multiplex;
+ int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ wrmsrl(multiplex[virt].addr, multiplex[virt].saved);
+ }
+}
+
+static void nmi_cpu_switch(void *dummy)
{
int cpu = smp_processor_id();
+ int si = per_cpu(switch_index, cpu);
struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
- nmi_cpu_save_registers(msrs);
+
+ nmi_cpu_stop(NULL);
+ nmi_cpu_save_mpx_registers(msrs);
+
+ /* move to next set */
+ si += model->num_counters;
+ if ((si > model->num_virt_counters) || (counter_config[si].count == 0))
+ per_cpu(switch_index, cpu) = 0;
+ else
+ per_cpu(switch_index, cpu) = si;
+
+ model->switch_ctrl(model, msrs);
+ nmi_cpu_restore_mpx_registers(msrs);
+
+ nmi_cpu_start(NULL);
}

+
+/*
+ * Quick check to see if multiplexing is necessary.
+ * The check should be sufficient since counters are used
+ * in order.
+ */
+static int nmi_multiplex_on(void)
+{
+ return counter_config[model->num_counters].count ? 0 : -EINVAL;
+}
+
+static int nmi_switch_event(void)
+{
+ if (!has_mux())
+ return -ENOSYS; /* not implemented */
+ if (nmi_multiplex_on() < 0)
+ return -EINVAL; /* not necessary */
+
+ on_each_cpu(nmi_cpu_switch, NULL, 1);
+
+ return 0;
+}
+
+static inline void mux_init(struct oprofile_operations *ops)
+{
+ if (has_mux())
+ ops->switch_events = nmi_switch_event;
+}
+
+static void mux_clone(int cpu)
+{
+ if (!has_mux())
+ return;
+
+ memcpy(per_cpu(cpu_msrs, cpu).multiplex,
+ per_cpu(cpu_msrs, 0).multiplex,
+ sizeof(struct op_msr) * model->num_virt_counters);
+}
+
+#else
+
+inline int op_x86_phys_to_virt(int phys) { return phys; }
+inline int op_x86_virt_to_phys(int virt) { return virt; }
+static inline void nmi_shutdown_mux(void) { }
+static inline int nmi_setup_mux(void) { return 1; }
+static inline void
+nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }
+static inline void mux_init(struct oprofile_operations *ops) { }
+static void mux_clone(int cpu) { }
+
+#endif
+
static void free_msrs(void)
{
int i;
@@ -107,38 +298,32 @@ static void free_msrs(void)

static int allocate_msrs(void)
{
- int success = 1;
size_t controls_size = sizeof(struct op_msr) * model->num_controls;
size_t counters_size = sizeof(struct op_msr) * model->num_counters;

int i;
for_each_possible_cpu(i) {
per_cpu(cpu_msrs, i).counters = kmalloc(counters_size,
- GFP_KERNEL);
- if (!per_cpu(cpu_msrs, i).counters) {
- success = 0;
- break;
- }
+ GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).counters)
+ return 0;
per_cpu(cpu_msrs, i).controls = kmalloc(controls_size,
- GFP_KERNEL);
- if (!per_cpu(cpu_msrs, i).controls) {
- success = 0;
- break;
- }
+ GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).controls)
+ return 0;
}

- if (!success)
- free_msrs();
-
- return success;
+ return 1;
}

static void nmi_cpu_setup(void *dummy)
{
int cpu = smp_processor_id();
struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
+ nmi_cpu_save_registers(msrs);
spin_lock(&oprofilefs_lock);
model->setup_ctrs(model, msrs);
+ nmi_cpu_setup_mux(cpu, msrs);
spin_unlock(&oprofilefs_lock);
per_cpu(saved_lvtpc, cpu) = apic_read(APIC_LVTPC);
apic_write(APIC_LVTPC, APIC_DM_NMI);
@@ -156,11 +341,15 @@ static int nmi_setup(void)
int cpu;

if (!allocate_msrs())
- return -ENOMEM;
+ err = -ENOMEM;
+ else if (!nmi_setup_mux())
+ err = -ENOMEM;
+ else
+ err = register_die_notifier(&profile_exceptions_nb);

- err = register_die_notifier(&profile_exceptions_nb);
if (err) {
free_msrs();
+ nmi_shutdown_mux();
return err;
}

@@ -171,24 +360,25 @@ static int nmi_setup(void)
/* Assume saved/restored counters are the same on all CPUs */
model->fill_in_addresses(&per_cpu(cpu_msrs, 0));
for_each_possible_cpu(cpu) {
- if (cpu != 0) {
- memcpy(per_cpu(cpu_msrs, cpu).counters,
- per_cpu(cpu_msrs, 0).counters,
- sizeof(struct op_msr) * model->num_counters);
-
- memcpy(per_cpu(cpu_msrs, cpu).controls,
- per_cpu(cpu_msrs, 0).controls,
- sizeof(struct op_msr) * model->num_controls);
- }
+ if (!cpu)
+ continue;

+ memcpy(per_cpu(cpu_msrs, cpu).counters,
+ per_cpu(cpu_msrs, 0).counters,
+ sizeof(struct op_msr) * model->num_counters);
+
+ memcpy(per_cpu(cpu_msrs, cpu).controls,
+ per_cpu(cpu_msrs, 0).controls,
+ sizeof(struct op_msr) * model->num_controls);
+
+ mux_clone(cpu);
}
- on_each_cpu(nmi_save_registers, NULL, 1);
on_each_cpu(nmi_cpu_setup, NULL, 1);
nmi_enabled = 1;
return 0;
}

-static void nmi_restore_registers(struct op_msrs *msrs)
+static void nmi_cpu_restore_registers(struct op_msrs *msrs)
{
struct op_msr *counters = msrs->counters;
struct op_msr *controls = msrs->controls;
@@ -209,7 +399,7 @@ static void nmi_cpu_shutdown(void *dummy)
{
unsigned int v;
int cpu = smp_processor_id();
- struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
+ struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);

/* restoring APIC_LVTPC can trigger an apic error because the delivery
* mode and vector nr combination can be illegal. That's by design: on
@@ -220,7 +410,7 @@ static void nmi_cpu_shutdown(void *dummy)
apic_write(APIC_LVTERR, v | APIC_LVT_MASKED);
apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu));
apic_write(APIC_LVTERR, v);
- nmi_restore_registers(msrs);
+ nmi_cpu_restore_registers(msrs);
}

static void nmi_shutdown(void)
@@ -230,42 +420,18 @@ static void nmi_shutdown(void)
nmi_enabled = 0;
on_each_cpu(nmi_cpu_shutdown, NULL, 1);
unregister_die_notifier(&profile_exceptions_nb);
+ nmi_shutdown_mux();
msrs = &get_cpu_var(cpu_msrs);
model->shutdown(msrs);
free_msrs();
put_cpu_var(cpu_msrs);
}

-static void nmi_cpu_start(void *dummy)
-{
- struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
- model->start(msrs);
-}
-
-static int nmi_start(void)
-{
- on_each_cpu(nmi_cpu_start, NULL, 1);
- return 0;
-}
-
-static void nmi_cpu_stop(void *dummy)
-{
- struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
- model->stop(msrs);
-}
-
-static void nmi_stop(void)
-{
- on_each_cpu(nmi_cpu_stop, NULL, 1);
-}
-
-struct op_counter_config counter_config[OP_MAX_COUNTER];
-
static int nmi_create_files(struct super_block *sb, struct dentry *root)
{
unsigned int i;

- for (i = 0; i < model->num_counters; ++i) {
+ for (i = 0; i < model->num_virt_counters; ++i) {
struct dentry *dir;
char buf[4];

@@ -274,7 +440,7 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root)
* NOTE: assumes 1:1 mapping here (that counters are organized
* sequentially in their struct assignment).
*/
- if (unlikely(!avail_to_resrv_perfctr_nmi_bit(i)))
+ if (!avail_to_resrv_perfctr_nmi_bit(op_x86_virt_to_phys(i)))
continue;

snprintf(buf, sizeof(buf), "%d", i);
@@ -406,7 +572,7 @@ module_param_call(cpu_type, force_cpu_type, NULL, NULL, 0);
static int __init ppro_init(char **cpu_type)
{
__u8 cpu_model = boot_cpu_data.x86_model;
- struct op_x86_model_spec const *spec = &op_ppro_spec; /* default */
+ struct op_x86_model_spec *spec = &op_ppro_spec; /* default */

if (force_arch_perfmon && cpu_has_arch_perfmon)
return 0;
@@ -523,18 +689,23 @@ int __init op_nmi_init(struct oprofile_operations *ops)
register_cpu_notifier(&oprofile_cpu_nb);
#endif
/* default values, can be overwritten by model */
- ops->create_files = nmi_create_files;
- ops->setup = nmi_setup;
- ops->shutdown = nmi_shutdown;
- ops->start = nmi_start;
- ops->stop = nmi_stop;
- ops->cpu_type = cpu_type;
+ ops->create_files = nmi_create_files;
+ ops->setup = nmi_setup;
+ ops->shutdown = nmi_shutdown;
+ ops->start = nmi_start;
+ ops->stop = nmi_stop;
+ ops->cpu_type = cpu_type;

if (model->init)
ret = model->init(ops);
if (ret)
return ret;

+ if (!model->num_virt_counters)
+ model->num_virt_counters = model->num_counters;
+
+ mux_init(ops);
+
init_sysfs();
using_nmi = 1;
printk(KERN_INFO "oprofile: using NMI interrupt.\n");
diff --git a/arch/x86/oprofile/op_counter.h b/arch/x86/oprofile/op_counter.h
index 91b6a11..e28398d 100644
--- a/arch/x86/oprofile/op_counter.h
+++ b/arch/x86/oprofile/op_counter.h
@@ -10,7 +10,7 @@
#ifndef OP_COUNTER_H
#define OP_COUNTER_H

-#define OP_MAX_COUNTER 8
+#define OP_MAX_COUNTER 32

/* Per-perfctr configuration as set via
* oprofilefs.
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 7ca8306..827beec 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -9,12 +9,15 @@
* @author Philippe Elie
* @author Graydon Hoare
* @author Robert Richter <[email protected]>
- * @author Barry Kasindorf
+ * @author Barry Kasindorf <[email protected]>
+ * @author Jason Yeh <[email protected]>
+ * @author Suravee Suthikulpanit <[email protected]>
*/

#include <linux/oprofile.h>
#include <linux/device.h>
#include <linux/pci.h>
+#include <linux/percpu.h>

#include <asm/ptrace.h>
#include <asm/msr.h>
@@ -25,12 +28,20 @@

#define NUM_COUNTERS 4
#define NUM_CONTROLS 4
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+#define NUM_VIRT_COUNTERS 32
+#define NUM_VIRT_CONTROLS 32
+#else
+#define NUM_VIRT_COUNTERS NUM_COUNTERS
+#define NUM_VIRT_CONTROLS NUM_CONTROLS
+#endif
+
#define OP_EVENT_MASK 0x0FFF
#define OP_CTR_OVERFLOW (1ULL<<31)

#define MSR_AMD_EVENTSEL_RESERVED ((0xFFFFFCF0ULL<<32)|(1ULL<<21))

-static unsigned long reset_value[NUM_COUNTERS];
+static unsigned long reset_value[NUM_VIRT_COUNTERS];

#ifdef CONFIG_OPROFILE_IBS

@@ -63,6 +74,45 @@ static struct op_ibs_config ibs_config;

#endif

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void op_mux_fill_in_addresses(struct op_msrs * const msrs)
+{
+ int i;
+
+ for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
+ int hw_counter = op_x86_virt_to_phys(i);
+ if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
+ msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
+ else
+ msrs->multiplex[i].addr = 0;
+ }
+}
+
+static void op_mux_switch_ctrl(struct op_x86_model_spec const *model,
+ struct op_msrs const * const msrs)
+{
+ u64 val;
+ int i;
+
+ /* enable active counters */
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (!counter_config[virt].enabled)
+ continue;
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[virt]);
+ wrmsrl(msrs->controls[i].addr, val);
+ }
+}
+
+#else
+
+static inline void op_mux_fill_in_addresses(struct op_msrs * const msrs) { }
+
+#endif
+
/* functions for op_amd_spec */

static void op_amd_fill_in_addresses(struct op_msrs * const msrs)
@@ -82,6 +132,8 @@ static void op_amd_fill_in_addresses(struct op_msrs * const msrs)
else
msrs->controls[i].addr = 0;
}
+
+ op_mux_fill_in_addresses(msrs);
}

static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
@@ -90,8 +142,16 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
u64 val;
int i;

+ /* setup reset_value */
+ for (i = 0; i < NUM_VIRT_COUNTERS; ++i) {
+ if (counter_config[i].enabled)
+ reset_value[i] = counter_config[i].count;
+ else
+ reset_value[i] = 0;
+ }
+
/* clear all counters */
- for (i = 0 ; i < NUM_CONTROLS; ++i) {
+ for (i = 0; i < NUM_CONTROLS; ++i) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsrl(msrs->controls[i].addr, val);
@@ -108,17 +168,20 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,

/* enable active counters */
for (i = 0; i < NUM_COUNTERS; ++i) {
- if (counter_config[i].enabled && msrs->counters[i].addr) {
- reset_value[i] = counter_config[i].count;
- wrmsrl(msrs->counters[i].addr,
- -(u64)counter_config[i].count);
- rdmsrl(msrs->controls[i].addr, val);
- val &= model->reserved;
- val |= op_x86_get_ctrl(model, &counter_config[i]);
- wrmsrl(msrs->controls[i].addr, val);
- } else {
- reset_value[i] = 0;
- }
+ int virt = op_x86_phys_to_virt(i);
+ if (!counter_config[virt].enabled)
+ continue;
+ if (!msrs->counters[i].addr)
+ continue;
+
+ /* setup counter registers */
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]);
+
+ /* setup control registers */
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[virt]);
+ wrmsrl(msrs->controls[i].addr, val);
}
}

@@ -229,15 +292,16 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
u64 val;
int i;

- for (i = 0 ; i < NUM_COUNTERS; ++i) {
- if (!reset_value[i])
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (!reset_value[virt])
continue;
rdmsrl(msrs->counters[i].addr, val);
/* bit is clear if overflowed: */
if (val & OP_CTR_OVERFLOW)
continue;
- oprofile_add_sample(regs, i);
- wrmsrl(msrs->counters[i].addr, -(u64)reset_value[i]);
+ oprofile_add_sample(regs, virt);
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]);
}

op_amd_handle_ibs(regs, msrs);
@@ -250,12 +314,13 @@ static void op_amd_start(struct op_msrs const * const msrs)
{
u64 val;
int i;
- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
- if (reset_value[i]) {
- rdmsrl(msrs->controls[i].addr, val);
- val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
- wrmsrl(msrs->controls[i].addr, val);
- }
+
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ if (!reset_value[op_x86_phys_to_virt(i)])
+ continue;
+ rdmsrl(msrs->controls[i].addr, val);
+ val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
+ wrmsrl(msrs->controls[i].addr, val);
}

op_amd_start_ibs();
@@ -270,8 +335,8 @@ static void op_amd_stop(struct op_msrs const * const msrs)
* Subtle: stop on all counters to avoid race with setting our
* pm callback
*/
- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
- if (!reset_value[i])
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ if (!reset_value[op_x86_phys_to_virt(i)])
continue;
rdmsrl(msrs->controls[i].addr, val);
val &= ~ARCH_PERFMON_EVENTSEL0_ENABLE;
@@ -285,11 +350,11 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
}
- for (i = 0 ; i < NUM_CONTROLS ; ++i) {
+ for (i = 0; i < NUM_CONTROLS; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
}
@@ -460,9 +525,10 @@ static void op_amd_exit(void) {}

#endif /* CONFIG_OPROFILE_IBS */

-struct op_x86_model_spec const op_amd_spec = {
+struct op_x86_model_spec op_amd_spec = {
.num_counters = NUM_COUNTERS,
.num_controls = NUM_CONTROLS,
+ .num_virt_counters = NUM_VIRT_COUNTERS,
.reserved = MSR_AMD_EVENTSEL_RESERVED,
.event_mask = OP_EVENT_MASK,
.init = op_amd_init,
@@ -473,4 +539,7 @@ struct op_x86_model_spec const op_amd_spec = {
.start = &op_amd_start,
.stop = &op_amd_stop,
.shutdown = &op_amd_shutdown,
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ .switch_ctrl = &op_mux_switch_ctrl,
+#endif
};
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 9db9e36..ac6b354 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -558,7 +558,7 @@ static void p4_setup_ctrs(struct op_x86_model_spec const *model,
}

/* clear the cccrs we will use */
- for (i = 0 ; i < num_counters ; i++) {
+ for (i = 0; i < num_counters; i++) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsr(p4_counters[VIRT_CTR(stag, i)].cccr_address, low, high);
@@ -575,7 +575,7 @@ static void p4_setup_ctrs(struct op_x86_model_spec const *model,
}

/* setup all counters */
- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (counter_config[i].enabled && msrs->controls[i].addr) {
reset_value[i] = counter_config[i].count;
pmc_setup_one_p4_counter(i);
@@ -678,7 +678,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(msrs->counters[i].addr);
}
@@ -687,7 +687,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)
* conjunction with the counter registers (hence the starting offset).
* This saves a few bits.
*/
- for (i = num_counters ; i < num_controls ; ++i) {
+ for (i = num_counters; i < num_controls; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(msrs->controls[i].addr);
}
@@ -695,7 +695,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)


#ifdef CONFIG_SMP
-struct op_x86_model_spec const op_p4_ht2_spec = {
+struct op_x86_model_spec op_p4_ht2_spec = {
.num_counters = NUM_COUNTERS_HT2,
.num_controls = NUM_CONTROLS_HT2,
.fill_in_addresses = &p4_fill_in_addresses,
@@ -707,7 +707,7 @@ struct op_x86_model_spec const op_p4_ht2_spec = {
};
#endif

-struct op_x86_model_spec const op_p4_spec = {
+struct op_x86_model_spec op_p4_spec = {
.num_counters = NUM_COUNTERS_NON_HT,
.num_controls = NUM_CONTROLS_NON_HT,
.fill_in_addresses = &p4_fill_in_addresses,
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index cd72d5c..4899215 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -81,7 +81,7 @@ static void ppro_setup_ctrs(struct op_x86_model_spec const *model,
}

/* clear all counters */
- for (i = 0 ; i < num_counters; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsrl(msrs->controls[i].addr, val);
@@ -125,7 +125,7 @@ static int ppro_check_ctrs(struct pt_regs * const regs,
if (unlikely(!reset_value))
goto out;

- for (i = 0 ; i < num_counters; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (!reset_value[i])
continue;
rdmsrl(msrs->counters[i].addr, val);
@@ -188,11 +188,11 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_P6_PERFCTR0 + i);
}
- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_P6_EVNTSEL0 + i);
}
@@ -203,7 +203,7 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
}


-struct op_x86_model_spec const op_ppro_spec = {
+struct op_x86_model_spec op_ppro_spec = {
.num_counters = 2,
.num_controls = 2,
.reserved = MSR_PPRO_EVENTSEL_RESERVED,
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index 5054898..b837761 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -23,6 +23,7 @@ struct op_msr {
struct op_msrs {
struct op_msr *counters;
struct op_msr *controls;
+ struct op_msr *multiplex;
};

struct pt_regs;
@@ -35,6 +36,7 @@ struct oprofile_operations;
struct op_x86_model_spec {
unsigned int num_counters;
unsigned int num_controls;
+ unsigned int num_virt_counters;
u64 reserved;
u16 event_mask;
int (*init)(struct oprofile_operations *ops);
@@ -47,17 +49,23 @@ struct op_x86_model_spec {
void (*start)(struct op_msrs const * const msrs);
void (*stop)(struct op_msrs const * const msrs);
void (*shutdown)(struct op_msrs const * const msrs);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ void (*switch_ctrl)(struct op_x86_model_spec const *model,
+ struct op_msrs const * const msrs);
+#endif
};

struct op_counter_config;

extern u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
struct op_counter_config *counter_config);
+extern int op_x86_phys_to_virt(int phys);
+extern int op_x86_virt_to_phys(int virt);

-extern struct op_x86_model_spec const op_ppro_spec;
-extern struct op_x86_model_spec const op_p4_spec;
-extern struct op_x86_model_spec const op_p4_ht2_spec;
-extern struct op_x86_model_spec const op_amd_spec;
+extern struct op_x86_model_spec op_ppro_spec;
+extern struct op_x86_model_spec op_p4_spec;
+extern struct op_x86_model_spec op_p4_ht2_spec;
+extern struct op_x86_model_spec op_amd_spec;
extern struct op_x86_model_spec op_arch_perfmon_spec;

#endif /* OP_X86_MODEL_H */
diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 3cffce9..dc8a042 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -12,6 +12,8 @@
#include <linux/init.h>
#include <linux/oprofile.h>
#include <linux/moduleparam.h>
+#include <linux/workqueue.h>
+#include <linux/time.h>
#include <asm/mutex.h>

#include "oprof.h"
@@ -87,6 +89,69 @@ out:
return err;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void switch_worker(struct work_struct *work);
+static DECLARE_DELAYED_WORK(switch_work, switch_worker);
+
+static void start_switch_worker(void)
+{
+ if (oprofile_ops.switch_events)
+ schedule_delayed_work(&switch_work, oprofile_time_slice);
+}
+
+static void stop_switch_worker(void)
+{
+ cancel_delayed_work_sync(&switch_work);
+}
+
+static void switch_worker(struct work_struct *work)
+{
+ if (oprofile_ops.switch_events())
+ return;
+
+ atomic_inc(&oprofile_stats.multiplex_counter);
+ start_switch_worker();
+}
+
+/* User inputs in ms, converts to jiffies */
+int oprofile_set_timeout(unsigned long val_msec)
+{
+ int err = 0;
+ unsigned long time_slice;
+
+ mutex_lock(&start_mutex);
+
+ if (oprofile_started) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ if (!oprofile_ops.switch_events) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ time_slice = msecs_to_jiffies(val_msec);
+ if (time_slice == MAX_JIFFY_OFFSET) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oprofile_time_slice = time_slice;
+
+out:
+ mutex_unlock(&start_mutex);
+ return err;
+
+}
+
+#else
+
+static inline void start_switch_worker(void) { }
+static inline void stop_switch_worker(void) { }
+
+#endif

/* Actually start profiling (echo 1>/dev/oprofile/enable) */
int oprofile_start(void)
@@ -108,6 +173,8 @@ int oprofile_start(void)
if ((err = oprofile_ops.start()))
goto out;

+ start_switch_worker();
+
oprofile_started = 1;
out:
mutex_unlock(&start_mutex);
@@ -123,6 +190,9 @@ void oprofile_stop(void)
goto out;
oprofile_ops.stop();
oprofile_started = 0;
+
+ stop_switch_worker();
+
/* wake up the daemon to read what remains */
wake_up_buffer_waiter();
out:
@@ -155,7 +225,6 @@ post_sync:
mutex_unlock(&start_mutex);
}

-
int oprofile_set_backtrace(unsigned long val)
{
int err = 0;
diff --git a/drivers/oprofile/oprof.h b/drivers/oprofile/oprof.h
index c288d3c..cb92f5c 100644
--- a/drivers/oprofile/oprof.h
+++ b/drivers/oprofile/oprof.h
@@ -24,6 +24,8 @@ struct oprofile_operations;
extern unsigned long oprofile_buffer_size;
extern unsigned long oprofile_cpu_buffer_size;
extern unsigned long oprofile_buffer_watershed;
+extern unsigned long oprofile_time_slice;
+
extern struct oprofile_operations oprofile_ops;
extern unsigned long oprofile_started;
extern unsigned long oprofile_backtrace_depth;
@@ -35,5 +37,6 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root);
void oprofile_timer_init(struct oprofile_operations *ops);

int oprofile_set_backtrace(unsigned long depth);
+int oprofile_set_timeout(unsigned long time);

#endif /* OPROF_H */
diff --git a/drivers/oprofile/oprofile_files.c b/drivers/oprofile/oprofile_files.c
index 5d36ffc..bbd7516 100644
--- a/drivers/oprofile/oprofile_files.c
+++ b/drivers/oprofile/oprofile_files.c
@@ -9,6 +9,7 @@

#include <linux/fs.h>
#include <linux/oprofile.h>
+#include <linux/jiffies.h>

#include "event_buffer.h"
#include "oprofile_stats.h"
@@ -17,10 +18,51 @@
#define BUFFER_SIZE_DEFAULT 131072
#define CPU_BUFFER_SIZE_DEFAULT 8192
#define BUFFER_WATERSHED_DEFAULT 32768 /* FIXME: tune */
+#define TIME_SLICE_DEFAULT 1

unsigned long oprofile_buffer_size;
unsigned long oprofile_cpu_buffer_size;
unsigned long oprofile_buffer_watershed;
+unsigned long oprofile_time_slice;
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static ssize_t timeout_read(struct file *file, char __user *buf,
+ size_t count, loff_t *offset)
+{
+ return oprofilefs_ulong_to_user(jiffies_to_msecs(oprofile_time_slice),
+ buf, count, offset);
+}
+
+
+static ssize_t timeout_write(struct file *file, char const __user *buf,
+ size_t count, loff_t *offset)
+{
+ unsigned long val;
+ int retval;
+
+ if (*offset)
+ return -EINVAL;
+
+ retval = oprofilefs_ulong_from_user(&val, buf, count);
+ if (retval)
+ return retval;
+
+ retval = oprofile_set_timeout(val);
+
+ if (retval)
+ return retval;
+ return count;
+}
+
+
+static const struct file_operations timeout_fops = {
+ .read = timeout_read,
+ .write = timeout_write,
+};
+
+#endif
+

static ssize_t depth_read(struct file *file, char __user *buf, size_t count, loff_t *offset)
{
@@ -129,6 +171,7 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root)
oprofile_buffer_size = BUFFER_SIZE_DEFAULT;
oprofile_cpu_buffer_size = CPU_BUFFER_SIZE_DEFAULT;
oprofile_buffer_watershed = BUFFER_WATERSHED_DEFAULT;
+ oprofile_time_slice = msecs_to_jiffies(TIME_SLICE_DEFAULT);

oprofilefs_create_file(sb, root, "enable", &enable_fops);
oprofilefs_create_file_perm(sb, root, "dump", &dump_fops, 0666);
@@ -139,6 +182,9 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root)
oprofilefs_create_file(sb, root, "cpu_type", &cpu_type_fops);
oprofilefs_create_file(sb, root, "backtrace_depth", &depth_fops);
oprofilefs_create_file(sb, root, "pointer_size", &pointer_size_fops);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ oprofilefs_create_file(sb, root, "time_slice", &timeout_fops);
+#endif
oprofile_create_stats_files(sb, root);
if (oprofile_ops.create_files)
oprofile_ops.create_files(sb, root);
diff --git a/drivers/oprofile/oprofile_stats.c b/drivers/oprofile/oprofile_stats.c
index 3c2270a..61689e8 100644
--- a/drivers/oprofile/oprofile_stats.c
+++ b/drivers/oprofile/oprofile_stats.c
@@ -34,6 +34,7 @@ void oprofile_reset_stats(void)
atomic_set(&oprofile_stats.sample_lost_no_mapping, 0);
atomic_set(&oprofile_stats.event_lost_overflow, 0);
atomic_set(&oprofile_stats.bt_lost_no_mapping, 0);
+ atomic_set(&oprofile_stats.multiplex_counter, 0);
}


@@ -76,4 +77,8 @@ void oprofile_create_stats_files(struct super_block *sb, struct dentry *root)
&oprofile_stats.event_lost_overflow);
oprofilefs_create_ro_atomic(sb, dir, "bt_lost_no_mapping",
&oprofile_stats.bt_lost_no_mapping);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ oprofilefs_create_ro_atomic(sb, dir, "multiplex_counter",
+ &oprofile_stats.multiplex_counter);
+#endif
}
diff --git a/drivers/oprofile/oprofile_stats.h b/drivers/oprofile/oprofile_stats.h
index 3da0d08..0b54e46 100644
--- a/drivers/oprofile/oprofile_stats.h
+++ b/drivers/oprofile/oprofile_stats.h
@@ -17,6 +17,7 @@ struct oprofile_stat_struct {
atomic_t sample_lost_no_mapping;
atomic_t bt_lost_no_mapping;
atomic_t event_lost_overflow;
+ atomic_t multiplex_counter;
};

extern struct oprofile_stat_struct oprofile_stats;
diff --git a/include/linux/oprofile.h b/include/linux/oprofile.h
index d68d2ed..5171639 100644
--- a/include/linux/oprofile.h
+++ b/include/linux/oprofile.h
@@ -67,6 +67,9 @@ struct oprofile_operations {

/* Initiate a stack backtrace. Optional. */
void (*backtrace)(struct pt_regs * const regs, unsigned int depth);
+
+ /* Multiplex between different events. Optional. */
+ int (*switch_events)(void);
/* CPU identification string. */
char * cpu_type;
};

Subject: [PATCH 01/26] x86/oprofile: Rework and simplify nmi_cpu_setup()

This patch removes the function nmi_save_registers(). Per-cpu code is
now executed only in the function nmi_cpu_setup(). Also, it renames
the per-cpu function nmi_restore_registers() to
nmi_cpu_restore_registers().

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 13 +++----------
1 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 93df76d..25da1e1 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -87,13 +87,6 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs)
}
}

-static void nmi_save_registers(void *dummy)
-{
- int cpu = smp_processor_id();
- struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
- nmi_cpu_save_registers(msrs);
-}
-
static void free_msrs(void)
{
int i;
@@ -137,6 +130,7 @@ static void nmi_cpu_setup(void *dummy)
{
int cpu = smp_processor_id();
struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
+ nmi_cpu_save_registers(msrs);
spin_lock(&oprofilefs_lock);
model->setup_ctrs(model, msrs);
spin_unlock(&oprofilefs_lock);
@@ -182,13 +176,12 @@ static int nmi_setup(void)
}

}
- on_each_cpu(nmi_save_registers, NULL, 1);
on_each_cpu(nmi_cpu_setup, NULL, 1);
nmi_enabled = 1;
return 0;
}

-static void nmi_restore_registers(struct op_msrs *msrs)
+static void nmi_cpu_restore_registers(struct op_msrs *msrs)
{
struct op_msr *counters = msrs->counters;
struct op_msr *controls = msrs->controls;
@@ -220,7 +213,7 @@ static void nmi_cpu_shutdown(void *dummy)
apic_write(APIC_LVTERR, v | APIC_LVT_MASKED);
apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu));
apic_write(APIC_LVTERR, v);
- nmi_restore_registers(msrs);
+ nmi_cpu_restore_registers(msrs);
}

static void nmi_shutdown(void)
--
1.6.3.3

Subject: [PATCH 04/26] x86/oprofile: Fix usage of NUM_CONTROLS/NUM_COUNTERS macros

Use the corresponding macros when iterating over the counter and
control registers. Since NUM_CONTROLS and NUM_COUNTERS are equal on
AMD CPUs, the fix is mostly cosmetic.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/op_model_amd.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index fdbed3a..dcfd450 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -99,7 +99,7 @@ static void op_amd_fill_in_addresses(struct op_msrs * const msrs)

#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
- int hw_counter = i % NUM_CONTROLS;
+ int hw_counter = i % NUM_COUNTERS;
if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
else
@@ -366,7 +366,7 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
}
- for (i = 0; i < NUM_COUNTERS; ++i) {
+ for (i = 0; i < NUM_CONTROLS; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
}
--
1.6.3.3

Subject: [PATCH 19/26] x86/oprofile: Modify initialization of num_virt_counters

Models that do not support counter multiplexing previously had to set
up num_virt_counters as well. This patch initializes num_virt_counters
from num_counters if num_virt_counters is not set. Thus,
num_virt_counters needs to be set only for models that support
multiplexing.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 3 +++
arch/x86/oprofile/op_model_p4.c | 2 --
arch/x86/oprofile/op_model_ppro.c | 1 -
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 826f391..82ee295 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -674,6 +674,9 @@ int __init op_nmi_init(struct oprofile_operations *ops)
if (ret)
return ret;

+ if (!model->num_virt_counters)
+ model->num_virt_counters = model->num_counters;
+
init_sysfs();
using_nmi = 1;
printk(KERN_INFO "oprofile: using NMI interrupt.\n");
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 0a4f2de..ac6b354 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -698,7 +698,6 @@ static void p4_shutdown(struct op_msrs const * const msrs)
struct op_x86_model_spec op_p4_ht2_spec = {
.num_counters = NUM_COUNTERS_HT2,
.num_controls = NUM_CONTROLS_HT2,
- .num_virt_counters = NUM_COUNTERS_HT2,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
@@ -711,7 +710,6 @@ struct op_x86_model_spec op_p4_ht2_spec = {
struct op_x86_model_spec op_p4_spec = {
.num_counters = NUM_COUNTERS_NON_HT,
.num_controls = NUM_CONTROLS_NON_HT,
- .num_virt_counters = NUM_COUNTERS_NON_HT,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 753a02a..4899215 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -206,7 +206,6 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
struct op_x86_model_spec op_ppro_spec = {
.num_counters = 2,
.num_controls = 2,
- .num_virt_counters = 2,
.reserved = MSR_PPRO_EVENTSEL_RESERVED,
.fill_in_addresses = &ppro_fill_in_addresses,
.setup_ctrs = &ppro_setup_ctrs,
--
1.6.3.3

Subject: [PATCH 03/26] oprofile: Implement performance counter multiplexing

From: Jason Yeh <[email protected]>

The number of hardware counters is limited. The multiplexing feature
enables OProfile to gather more events than the hardware provides
counters for. This is realized by switching between events at a
user-specified time interval.

A new file (/dev/oprofile/time_slice) is added for the user to specify
the switch interval in ms. If the number of events to profile exceeds
the number of available hardware counters, the patch schedules a
delayed work item that periodically saves the current counter values,
reprograms the hardware for the next set of events, and restores that
set's saved values. The switching mechanism needs to be implemented
for each architecture that is to support multiplexing. This patch only
implements AMD CPU support, but multiplexing can easily be extended to
other models and architectures.

There are follow-on patches that rework parts of this patch.

Signed-off-by: Jason Yeh <[email protected]>
Signed-off-by: Robert Richter <[email protected]>
---
arch/Kconfig | 12 +++
arch/x86/oprofile/nmi_int.c | 162 +++++++++++++++++++++++++++++++++++-
arch/x86/oprofile/op_counter.h | 2 +-
arch/x86/oprofile/op_model_amd.c | 110 ++++++++++++++++++++++---
arch/x86/oprofile/op_model_p4.c | 4 +
arch/x86/oprofile/op_model_ppro.c | 2 +
arch/x86/oprofile/op_x86_model.h | 7 ++
drivers/oprofile/oprof.c | 78 ++++++++++++++++++
drivers/oprofile/oprof.h | 2 +
drivers/oprofile/oprofile_files.c | 43 ++++++++++
drivers/oprofile/oprofile_stats.c | 10 +++
include/linux/oprofile.h | 3 +
12 files changed, 415 insertions(+), 20 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 99193b1..beea3cc 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -30,6 +30,18 @@ config OPROFILE_IBS

If unsure, say N.

+config OPROFILE_EVENT_MULTIPLEX
+ bool "OProfile multiplexing support (EXPERIMENTAL)"
+ default n
+ depends on OPROFILE && X86
+ help
+ The number of hardware counters is limited. The multiplexing
+ feature enables OProfile to gather more events than counters
+ are provided by the hardware. This is realized by switching
+ between events at a user-specified time interval.
+
+ If unsure, say N.
+
config HAVE_OPROFILE
bool

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index fca8dc9..e54f6a0 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -1,11 +1,14 @@
/**
* @file nmi_int.c
*
- * @remark Copyright 2002-2008 OProfile authors
+ * @remark Copyright 2002-2009 OProfile authors
* @remark Read the file COPYING
*
* @author John Levon <[email protected]>
* @author Robert Richter <[email protected]>
+ * @author Barry Kasindorf <[email protected]>
+ * @author Jason Yeh <[email protected]>
+ * @author Suravee Suthikulpanit <[email protected]>
*/

#include <linux/init.h>
@@ -24,6 +27,12 @@
#include "op_counter.h"
#include "op_x86_model.h"

+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+DEFINE_PER_CPU(int, switch_index);
+#endif
+
+
static struct op_x86_model_spec const *model;
static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
static DEFINE_PER_CPU(unsigned long, saved_lvtpc);
@@ -31,6 +40,13 @@ static DEFINE_PER_CPU(unsigned long, saved_lvtpc);
/* 0 == registered but off, 1 == registered and on */
static int nmi_enabled = 0;

+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+extern atomic_t multiplex_counter;
+#endif
+
+struct op_counter_config counter_config[OP_MAX_COUNTER];
+
/* common functions */

u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
@@ -95,6 +111,11 @@ static void free_msrs(void)
per_cpu(cpu_msrs, i).counters = NULL;
kfree(per_cpu(cpu_msrs, i).controls);
per_cpu(cpu_msrs, i).controls = NULL;
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ kfree(per_cpu(cpu_msrs, i).multiplex);
+ per_cpu(cpu_msrs, i).multiplex = NULL;
+#endif
}
}

@@ -103,6 +124,9 @@ static int allocate_msrs(void)
int success = 1;
size_t controls_size = sizeof(struct op_msr) * model->num_controls;
size_t counters_size = sizeof(struct op_msr) * model->num_counters;
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ size_t multiplex_size = sizeof(struct op_msr) * model->num_virt_counters;
+#endif

int i;
for_each_possible_cpu(i) {
@@ -118,6 +142,14 @@ static int allocate_msrs(void)
success = 0;
break;
}
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ per_cpu(cpu_msrs, i).multiplex =
+ kmalloc(multiplex_size, GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).multiplex) {
+ success = 0;
+ break;
+ }
+#endif
}

if (!success)
@@ -126,6 +158,25 @@ static int allocate_msrs(void)
return success;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void nmi_setup_cpu_mux(struct op_msrs const * const msrs)
+{
+ int i;
+ struct op_msr *multiplex = msrs->multiplex;
+
+ for (i = 0; i < model->num_virt_counters; ++i) {
+ if (counter_config[i].enabled) {
+ multiplex[i].saved = -(u64)counter_config[i].count;
+ } else {
+ multiplex[i].addr = 0;
+ multiplex[i].saved = 0;
+ }
+ }
+}
+
+#endif
+
static void nmi_cpu_setup(void *dummy)
{
int cpu = smp_processor_id();
@@ -133,6 +184,9 @@ static void nmi_cpu_setup(void *dummy)
nmi_cpu_save_registers(msrs);
spin_lock(&oprofilefs_lock);
model->setup_ctrs(model, msrs);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ nmi_setup_cpu_mux(msrs);
+#endif
spin_unlock(&oprofilefs_lock);
per_cpu(saved_lvtpc, cpu) = apic_read(APIC_LVTPC);
apic_write(APIC_LVTPC, APIC_DM_NMI);
@@ -173,14 +227,52 @@ static int nmi_setup(void)
memcpy(per_cpu(cpu_msrs, cpu).controls,
per_cpu(cpu_msrs, 0).controls,
sizeof(struct op_msr) * model->num_controls);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ memcpy(per_cpu(cpu_msrs, cpu).multiplex,
+ per_cpu(cpu_msrs, 0).multiplex,
+ sizeof(struct op_msr) * model->num_virt_counters);
+#endif
}
-
}
on_each_cpu(nmi_cpu_setup, NULL, 1);
nmi_enabled = 1;
return 0;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void nmi_cpu_save_mpx_registers(struct op_msrs *msrs)
+{
+ unsigned int si = __get_cpu_var(switch_index);
+ struct op_msr *multiplex = msrs->multiplex;
+ unsigned int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int offset = i + si;
+ if (multiplex[offset].addr) {
+ rdmsrl(multiplex[offset].addr,
+ multiplex[offset].saved);
+ }
+ }
+}
+
+static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
+{
+ unsigned int si = __get_cpu_var(switch_index);
+ struct op_msr *multiplex = msrs->multiplex;
+ unsigned int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int offset = i + si;
+ if (multiplex[offset].addr) {
+ wrmsrl(multiplex[offset].addr,
+ multiplex[offset].saved);
+ }
+ }
+}
+
+#endif
+
static void nmi_cpu_restore_registers(struct op_msrs *msrs)
{
struct op_msr *counters = msrs->counters;
@@ -214,6 +306,9 @@ static void nmi_cpu_shutdown(void *dummy)
apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu));
apic_write(APIC_LVTERR, v);
nmi_cpu_restore_registers(msrs);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ __get_cpu_var(switch_index) = 0;
+#endif
}

static void nmi_shutdown(void)
@@ -252,16 +347,15 @@ static void nmi_stop(void)
on_each_cpu(nmi_cpu_stop, NULL, 1);
}

-struct op_counter_config counter_config[OP_MAX_COUNTER];
-
static int nmi_create_files(struct super_block *sb, struct dentry *root)
{
unsigned int i;

- for (i = 0; i < model->num_counters; ++i) {
+ for (i = 0; i < model->num_virt_counters; ++i) {
struct dentry *dir;
char buf[4];

+#ifndef CONFIG_OPROFILE_EVENT_MULTIPLEX
/* quick little hack to _not_ expose a counter if it is not
* available for use. This should protect userspace app.
* NOTE: assumes 1:1 mapping here (that counters are organized
@@ -269,6 +363,7 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root)
*/
if (unlikely(!avail_to_resrv_perfctr_nmi_bit(i)))
continue;
+#endif /* CONFIG_OPROFILE_EVENT_MULTIPLEX */

snprintf(buf, sizeof(buf), "%d", i);
dir = oprofilefs_mkdir(sb, root, buf);
@@ -283,6 +378,57 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root)
return 0;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void nmi_cpu_switch(void *dummy)
+{
+ int cpu = smp_processor_id();
+ int si = per_cpu(switch_index, cpu);
+ struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
+
+ nmi_cpu_stop(NULL);
+ nmi_cpu_save_mpx_registers(msrs);
+
+ /* move to next set */
+ si += model->num_counters;
+ if ((si > model->num_virt_counters) || (counter_config[si].count == 0))
+ per_cpu(switch_index, cpu) = 0;
+ else
+ per_cpu(switch_index, cpu) = si;
+
+ model->switch_ctrl(model, msrs);
+ nmi_cpu_restore_mpx_registers(msrs);
+
+ nmi_cpu_start(NULL);
+}
+
+
+/*
+ * Quick check to see if multiplexing is necessary.
+ * The check should be sufficient since counters are used
+ * in order.
+ */
+static int nmi_multiplex_on(void)
+{
+ return counter_config[model->num_counters].count ? 0 : -EINVAL;
+}
+
+static int nmi_switch_event(void)
+{
+ if (!model->switch_ctrl)
+ return -ENOSYS; /* not implemented */
+ if (nmi_multiplex_on() < 0)
+ return -EINVAL; /* not necessary */
+
+ on_each_cpu(nmi_cpu_switch, NULL, 1);
+
+ atomic_inc(&multiplex_counter);
+
+ return 0;
+}
+
+#endif
+
#ifdef CONFIG_SMP
static int oprofile_cpu_notifier(struct notifier_block *b, unsigned long action,
void *data)
@@ -516,12 +662,18 @@ int __init op_nmi_init(struct oprofile_operations *ops)
register_cpu_notifier(&oprofile_cpu_nb);
#endif
/* default values, can be overwritten by model */
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ __raw_get_cpu_var(switch_index) = 0;
+#endif
ops->create_files = nmi_create_files;
ops->setup = nmi_setup;
ops->shutdown = nmi_shutdown;
ops->start = nmi_start;
ops->stop = nmi_stop;
ops->cpu_type = cpu_type;
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ ops->switch_events = nmi_switch_event;
+#endif

if (model->init)
ret = model->init(ops);
diff --git a/arch/x86/oprofile/op_counter.h b/arch/x86/oprofile/op_counter.h
index 91b6a11..e28398d 100644
--- a/arch/x86/oprofile/op_counter.h
+++ b/arch/x86/oprofile/op_counter.h
@@ -10,7 +10,7 @@
#ifndef OP_COUNTER_H
#define OP_COUNTER_H

-#define OP_MAX_COUNTER 8
+#define OP_MAX_COUNTER 32

/* Per-perfctr configuration as set via
* oprofilefs.
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index f676f88..fdbed3a 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -9,12 +9,15 @@
* @author Philippe Elie
* @author Graydon Hoare
* @author Robert Richter <[email protected]>
- * @author Barry Kasindorf
+ * @author Barry Kasindorf <[email protected]>
+ * @author Jason Yeh <[email protected]>
+ * @author Suravee Suthikulpanit <[email protected]>
*/

#include <linux/oprofile.h>
#include <linux/device.h>
#include <linux/pci.h>
+#include <linux/percpu.h>

#include <asm/ptrace.h>
#include <asm/msr.h>
@@ -25,12 +28,23 @@

#define NUM_COUNTERS 4
#define NUM_CONTROLS 4
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+#define NUM_VIRT_COUNTERS 32
+#define NUM_VIRT_CONTROLS 32
+#else
+#define NUM_VIRT_COUNTERS NUM_COUNTERS
+#define NUM_VIRT_CONTROLS NUM_CONTROLS
+#endif
+
#define OP_EVENT_MASK 0x0FFF
#define OP_CTR_OVERFLOW (1ULL<<31)

#define MSR_AMD_EVENTSEL_RESERVED ((0xFFFFFCF0ULL<<32)|(1ULL<<21))

-static unsigned long reset_value[NUM_COUNTERS];
+static unsigned long reset_value[NUM_VIRT_COUNTERS];
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+DECLARE_PER_CPU(int, switch_index);
+#endif

#ifdef CONFIG_OPROFILE_IBS

@@ -82,6 +96,16 @@ static void op_amd_fill_in_addresses(struct op_msrs * const msrs)
else
msrs->controls[i].addr = 0;
}
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
+ int hw_counter = i % NUM_CONTROLS;
+ if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
+ msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
+ else
+ msrs->multiplex[i].addr = 0;
+ }
+#endif
}

static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
@@ -90,6 +114,15 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
u64 val;
int i;

+ /* setup reset_value */
+ for (i = 0; i < NUM_VIRT_COUNTERS; ++i) {
+ if (counter_config[i].enabled) {
+ reset_value[i] = counter_config[i].count;
+ } else {
+ reset_value[i] = 0;
+ }
+ }
+
/* clear all counters */
for (i = 0; i < NUM_CONTROLS; ++i) {
if (unlikely(!msrs->controls[i].addr))
@@ -108,20 +141,49 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,

/* enable active counters */
for (i = 0; i < NUM_COUNTERS; ++i) {
- if (counter_config[i].enabled && msrs->counters[i].addr) {
- reset_value[i] = counter_config[i].count;
- wrmsrl(msrs->counters[i].addr,
- -(u64)counter_config[i].count);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ int offset = i + __get_cpu_var(switch_index);
+#else
+ int offset = i;
+#endif
+ if (counter_config[offset].enabled && msrs->counters[i].addr) {
+ /* setup counter registers */
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[offset]);
+
+ /* setup control registers */
rdmsrl(msrs->controls[i].addr, val);
val &= model->reserved;
- val |= op_x86_get_ctrl(model, &counter_config[i]);
+ val |= op_x86_get_ctrl(model, &counter_config[offset]);
+ wrmsrl(msrs->controls[i].addr, val);
+ }
+ }
+}
+
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void op_amd_switch_ctrl(struct op_x86_model_spec const *model,
+ struct op_msrs const * const msrs)
+{
+ u64 val;
+ int i;
+
+ /* enable active counters */
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ int offset = i + __get_cpu_var(switch_index);
+ if (counter_config[offset].enabled) {
+ /* setup control registers */
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[offset]);
wrmsrl(msrs->controls[i].addr, val);
- } else {
- reset_value[i] = 0;
}
}
}

+#endif
+
+
#ifdef CONFIG_OPROFILE_IBS

static inline int
@@ -230,14 +292,19 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
int i;

for (i = 0; i < NUM_COUNTERS; ++i) {
- if (!reset_value[i])
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ int offset = i + __get_cpu_var(switch_index);
+#else
+ int offset = i;
+#endif
+ if (!reset_value[offset])
continue;
rdmsrl(msrs->counters[i].addr, val);
/* bit is clear if overflowed: */
if (val & OP_CTR_OVERFLOW)
continue;
- oprofile_add_sample(regs, i);
- wrmsrl(msrs->counters[i].addr, -(u64)reset_value[i]);
+ oprofile_add_sample(regs, offset);
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[offset]);
}

op_amd_handle_ibs(regs, msrs);
@@ -250,8 +317,14 @@ static void op_amd_start(struct op_msrs const * const msrs)
{
u64 val;
int i;
+
for (i = 0; i < NUM_COUNTERS; ++i) {
- if (reset_value[i]) {
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ int offset = i + __get_cpu_var(switch_index);
+#else
+ int offset = i;
+#endif
+ if (reset_value[offset]) {
rdmsrl(msrs->controls[i].addr, val);
val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
wrmsrl(msrs->controls[i].addr, val);
@@ -271,7 +344,11 @@ static void op_amd_stop(struct op_msrs const * const msrs)
* pm callback
*/
for (i = 0; i < NUM_COUNTERS; ++i) {
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ if (!reset_value[i + per_cpu(switch_index, smp_processor_id())])
+#else
if (!reset_value[i])
+#endif
continue;
rdmsrl(msrs->controls[i].addr, val);
val &= ~ARCH_PERFMON_EVENTSEL0_ENABLE;
@@ -289,7 +366,7 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
}
- for (i = 0; i < NUM_CONTROLS; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
}
@@ -463,6 +540,8 @@ static void op_amd_exit(void) {}
struct op_x86_model_spec const op_amd_spec = {
.num_counters = NUM_COUNTERS,
.num_controls = NUM_CONTROLS,
+ .num_virt_counters = NUM_VIRT_COUNTERS,
+ .num_virt_controls = NUM_VIRT_CONTROLS,
.reserved = MSR_AMD_EVENTSEL_RESERVED,
.event_mask = OP_EVENT_MASK,
.init = op_amd_init,
@@ -473,4 +552,7 @@ struct op_x86_model_spec const op_amd_spec = {
.start = &op_amd_start,
.stop = &op_amd_stop,
.shutdown = &op_amd_shutdown,
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ .switch_ctrl = &op_amd_switch_ctrl,
+#endif
};
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 5921b7f..65b9237 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -698,6 +698,8 @@ static void p4_shutdown(struct op_msrs const * const msrs)
struct op_x86_model_spec const op_p4_ht2_spec = {
.num_counters = NUM_COUNTERS_HT2,
.num_controls = NUM_CONTROLS_HT2,
+ .num_virt_counters = NUM_COUNTERS_HT2,
+ .num_virt_controls = NUM_CONTROLS_HT2,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
@@ -710,6 +712,8 @@ struct op_x86_model_spec const op_p4_ht2_spec = {
struct op_x86_model_spec const op_p4_spec = {
.num_counters = NUM_COUNTERS_NON_HT,
.num_controls = NUM_CONTROLS_NON_HT,
+ .num_virt_counters = NUM_COUNTERS_NON_HT,
+ .num_virt_controls = NUM_CONTROLS_NON_HT,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 570d717..098cbca 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -206,6 +206,8 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
struct op_x86_model_spec const op_ppro_spec = {
.num_counters = 2,
.num_controls = 2,
+ .num_virt_counters = 2,
+ .num_virt_controls = 2,
.reserved = MSR_PPRO_EVENTSEL_RESERVED,
.fill_in_addresses = &ppro_fill_in_addresses,
.setup_ctrs = &ppro_setup_ctrs,
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index 5054898..0d07d23 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -23,6 +23,7 @@ struct op_msr {
struct op_msrs {
struct op_msr *counters;
struct op_msr *controls;
+ struct op_msr *multiplex;
};

struct pt_regs;
@@ -35,6 +36,8 @@ struct oprofile_operations;
struct op_x86_model_spec {
unsigned int num_counters;
unsigned int num_controls;
+ unsigned int num_virt_counters;
+ unsigned int num_virt_controls;
u64 reserved;
u16 event_mask;
int (*init)(struct oprofile_operations *ops);
@@ -47,6 +50,10 @@ struct op_x86_model_spec {
void (*start)(struct op_msrs const * const msrs);
void (*stop)(struct op_msrs const * const msrs);
void (*shutdown)(struct op_msrs const * const msrs);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ void (*switch_ctrl)(struct op_x86_model_spec const *model,
+ struct op_msrs const * const msrs);
+#endif
};

struct op_counter_config;
diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 3cffce9..7bc64af 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -12,6 +12,8 @@
#include <linux/init.h>
#include <linux/oprofile.h>
#include <linux/moduleparam.h>
+#include <linux/workqueue.h>
+#include <linux/time.h>
#include <asm/mutex.h>

#include "oprof.h"
@@ -27,6 +29,15 @@ unsigned long oprofile_backtrace_depth;
static unsigned long is_setup;
static DEFINE_MUTEX(start_mutex);

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void switch_worker(struct work_struct *work);
+static DECLARE_DELAYED_WORK(switch_work, switch_worker);
+unsigned long timeout_jiffies;
+#define MULTIPLEXING_TIMER_DEFAULT 1
+
+#endif
+
/* timer
0 - use performance monitoring hardware if available
1 - use the timer int mechanism regardless
@@ -87,6 +98,20 @@ out:
return err;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void start_switch_worker(void)
+{
+ schedule_delayed_work(&switch_work, timeout_jiffies);
+}
+
+static void switch_worker(struct work_struct *work)
+{
+ if (!oprofile_ops.switch_events())
+ start_switch_worker();
+}
+
+#endif

/* Actually start profiling (echo 1>/dev/oprofile/enable) */
int oprofile_start(void)
@@ -108,6 +133,11 @@ int oprofile_start(void)
if ((err = oprofile_ops.start()))
goto out;

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ if (oprofile_ops.switch_events)
+ start_switch_worker();
+#endif
+
oprofile_started = 1;
out:
mutex_unlock(&start_mutex);
@@ -123,6 +153,11 @@ void oprofile_stop(void)
goto out;
oprofile_ops.stop();
oprofile_started = 0;
+
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ cancel_delayed_work_sync(&switch_work);
+#endif
+
/* wake up the daemon to read what remains */
wake_up_buffer_waiter();
out:
@@ -155,6 +190,36 @@ post_sync:
mutex_unlock(&start_mutex);
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+/* User inputs in ms, converts to jiffies */
+int oprofile_set_timeout(unsigned long val_msec)
+{
+ int err = 0;
+
+ mutex_lock(&start_mutex);
+
+ if (oprofile_started) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ if (!oprofile_ops.switch_events) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ timeout_jiffies = msecs_to_jiffies(val_msec);
+ if (timeout_jiffies == MAX_JIFFY_OFFSET)
+ timeout_jiffies = msecs_to_jiffies(MULTIPLEXING_TIMER_DEFAULT);
+
+out:
+ mutex_unlock(&start_mutex);
+ return err;
+
+}
+
+#endif

int oprofile_set_backtrace(unsigned long val)
{
@@ -179,10 +244,23 @@ out:
return err;
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void __init oprofile_multiplexing_init(void)
+{
+ timeout_jiffies = msecs_to_jiffies(MULTIPLEXING_TIMER_DEFAULT);
+}
+
+#endif
+
static int __init oprofile_init(void)
{
int err;

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ oprofile_multiplexing_init();
+#endif
+
err = oprofile_arch_init(&oprofile_ops);

if (err < 0 || timer) {
diff --git a/drivers/oprofile/oprof.h b/drivers/oprofile/oprof.h
index c288d3c..ee38abc 100644
--- a/drivers/oprofile/oprof.h
+++ b/drivers/oprofile/oprof.h
@@ -27,6 +27,7 @@ extern unsigned long oprofile_buffer_watershed;
extern struct oprofile_operations oprofile_ops;
extern unsigned long oprofile_started;
extern unsigned long oprofile_backtrace_depth;
+extern unsigned long timeout_jiffies;

struct super_block;
struct dentry;
@@ -35,5 +36,6 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root);
void oprofile_timer_init(struct oprofile_operations *ops);

int oprofile_set_backtrace(unsigned long depth);
+int oprofile_set_timeout(unsigned long time);

#endif /* OPROF_H */
diff --git a/drivers/oprofile/oprofile_files.c b/drivers/oprofile/oprofile_files.c
index 5d36ffc..468ec3e 100644
--- a/drivers/oprofile/oprofile_files.c
+++ b/drivers/oprofile/oprofile_files.c
@@ -9,6 +9,7 @@

#include <linux/fs.h>
#include <linux/oprofile.h>
+#include <linux/jiffies.h>

#include "event_buffer.h"
#include "oprofile_stats.h"
@@ -22,6 +23,45 @@ unsigned long oprofile_buffer_size;
unsigned long oprofile_cpu_buffer_size;
unsigned long oprofile_buffer_watershed;

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static ssize_t timeout_read(struct file *file, char __user *buf,
+ size_t count, loff_t *offset)
+{
+ return oprofilefs_ulong_to_user(jiffies_to_msecs(timeout_jiffies),
+ buf, count, offset);
+}
+
+
+static ssize_t timeout_write(struct file *file, char const __user *buf,
+ size_t count, loff_t *offset)
+{
+ unsigned long val;
+ int retval;
+
+ if (*offset)
+ return -EINVAL;
+
+ retval = oprofilefs_ulong_from_user(&val, buf, count);
+ if (retval)
+ return retval;
+
+ retval = oprofile_set_timeout(val);
+
+ if (retval)
+ return retval;
+ return count;
+}
+
+
+static const struct file_operations timeout_fops = {
+ .read = timeout_read,
+ .write = timeout_write,
+};
+
+#endif
+
+
static ssize_t depth_read(struct file *file, char __user *buf, size_t count, loff_t *offset)
{
return oprofilefs_ulong_to_user(oprofile_backtrace_depth, buf, count,
@@ -139,6 +179,9 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root)
oprofilefs_create_file(sb, root, "cpu_type", &cpu_type_fops);
oprofilefs_create_file(sb, root, "backtrace_depth", &depth_fops);
oprofilefs_create_file(sb, root, "pointer_size", &pointer_size_fops);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ oprofilefs_create_file(sb, root, "time_slice", &timeout_fops);
+#endif
oprofile_create_stats_files(sb, root);
if (oprofile_ops.create_files)
oprofile_ops.create_files(sb, root);
diff --git a/drivers/oprofile/oprofile_stats.c b/drivers/oprofile/oprofile_stats.c
index 3c2270a..77a57a6 100644
--- a/drivers/oprofile/oprofile_stats.c
+++ b/drivers/oprofile/oprofile_stats.c
@@ -16,6 +16,9 @@
#include "cpu_buffer.h"

struct oprofile_stat_struct oprofile_stats;
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+atomic_t multiplex_counter;
+#endif

void oprofile_reset_stats(void)
{
@@ -34,6 +37,9 @@ void oprofile_reset_stats(void)
atomic_set(&oprofile_stats.sample_lost_no_mapping, 0);
atomic_set(&oprofile_stats.event_lost_overflow, 0);
atomic_set(&oprofile_stats.bt_lost_no_mapping, 0);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ atomic_set(&multiplex_counter, 0);
+#endif
}


@@ -76,4 +82,8 @@ void oprofile_create_stats_files(struct super_block *sb, struct dentry *root)
&oprofile_stats.event_lost_overflow);
oprofilefs_create_ro_atomic(sb, dir, "bt_lost_no_mapping",
&oprofile_stats.bt_lost_no_mapping);
+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+ oprofilefs_create_ro_atomic(sb, dir, "multiplex_counter",
+ &multiplex_counter);
+#endif
}
diff --git a/include/linux/oprofile.h b/include/linux/oprofile.h
index d68d2ed..5171639 100644
--- a/include/linux/oprofile.h
+++ b/include/linux/oprofile.h
@@ -67,6 +67,9 @@ struct oprofile_operations {

/* Initiate a stack backtrace. Optional. */
void (*backtrace)(struct pt_regs * const regs, unsigned int depth);
+
+ /* Multiplex between different events. Optional. */
+ int (*switch_events)(void);
/* CPU identification string. */
char * cpu_type;
};
--
1.6.3.3
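
For review, the core of the multiplexing mechanism can be boiled down
to a small userspace sketch. This is illustrative only, not kernel
code; NUM_HW_COUNTERS, NUM_EVENTS and phys_to_virt() are made-up
stand-ins for the per-cpu switch_index logic above:

#include <stdio.h>

#define NUM_HW_COUNTERS 4   /* physical counters, e.g. on AMD K8 */
#define NUM_EVENTS      8   /* events the user wants to profile */

static int switch_index;    /* per-cpu in the real code */

/* map a physical counter to the event it currently measures */
static int phys_to_virt(int phys)
{
        return switch_index + phys;
}

int main(void)
{
        int slice, i;

        /* each expiry of the time slice rotates to the next event set */
        for (slice = 0; slice < 4; slice++) {
                for (i = 0; i < NUM_HW_COUNTERS; i++)
                        printf("counter %d -> event %d\n",
                               i, phys_to_virt(i));
                switch_index += NUM_HW_COUNTERS;
                if (switch_index >= NUM_EVENTS)
                        switch_index = 0;
                printf("--\n");
        }
        return 0;
}

Each enabled event thus gets an equal share of the time slices, at
the cost of shorter effective sampling time per event.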

Subject: [PATCH 06/26] x86/oprofile: Fix initialization of switch_index

Variable switch_index must be initialized for each cpu. This patch
fixes the initialization by moving it to the per-cpu init function
nmi_cpu_setup().

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 16 +++++++++-------
1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 8cd4658..b211d33 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -160,7 +160,7 @@ static int allocate_msrs(void)

#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX

-static void nmi_setup_cpu_mux(struct op_msrs const * const msrs)
+static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
{
int i;
struct op_msr *multiplex = msrs->multiplex;
@@ -173,8 +173,15 @@ static void nmi_setup_cpu_mux(struct op_msrs const * const msrs)
multiplex[i].saved = 0;
}
}
+
+ per_cpu(switch_index, cpu) = 0;
}

+#else
+
+static inline void
+nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }
+
#endif

static void nmi_cpu_setup(void *dummy)
@@ -184,9 +191,7 @@ static void nmi_cpu_setup(void *dummy)
nmi_cpu_save_registers(msrs);
spin_lock(&oprofilefs_lock);
model->setup_ctrs(model, msrs);
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- nmi_setup_cpu_mux(msrs);
-#endif
+ nmi_cpu_setup_mux(cpu, msrs);
spin_unlock(&oprofilefs_lock);
per_cpu(saved_lvtpc, cpu) = apic_read(APIC_LVTPC);
apic_write(APIC_LVTPC, APIC_DM_NMI);
@@ -662,9 +667,6 @@ int __init op_nmi_init(struct oprofile_operations *ops)
register_cpu_notifier(&oprofile_cpu_nb);
#endif
/* default values, can be overwritten by model */
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- __raw_get_cpu_var(switch_index) = 0;
-#endif
ops->create_files = nmi_create_files;
ops->setup = nmi_setup;
ops->shutdown = nmi_shutdown;
--
1.6.3.3
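
The effect of the fix can be sketched in plain C. This is only an
illustration; the per-cpu variable is modeled as an array indexed by
cpu id and all names are made up:

#include <stdio.h>

#define NR_CPUS 4

static int switch_index[NR_CPUS]; /* models DEFINE_PER_CPU(int, switch_index) */

/* models nmi_cpu_setup(): runs once on each cpu */
static void cpu_setup(int cpu)
{
        /* ... set up counter and control registers ... */
        switch_index[cpu] = 0;    /* every cpu starts at event set 0 */
}

int main(void)
{
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                cpu_setup(cpu);
        for (cpu = 0; cpu < NR_CPUS; cpu++)
                printf("cpu %d: switch_index = %d\n",
                       cpu, switch_index[cpu]);
        return 0;
}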

Subject: [PATCH 15/26] x86/oprofile: Moving nmi_cpu_save/restore_mpx_registers() in nmi_int.c

This patch moves code in nmi_int.c to consolidate the multiplexing
code into a single, separate section.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 52 +++++++++++++++++++-----------------------
1 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index b1edfc9..f38c5cf 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -147,6 +147,30 @@ static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
per_cpu(switch_index, cpu) = 0;
}

+static void nmi_cpu_save_mpx_registers(struct op_msrs *msrs)
+{
+ struct op_msr *multiplex = msrs->multiplex;
+ int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ rdmsrl(multiplex[virt].addr, multiplex[virt].saved);
+ }
+}
+
+static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
+{
+ struct op_msr *multiplex = msrs->multiplex;
+ int i;
+
+ for (i = 0; i < model->num_counters; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ wrmsrl(multiplex[virt].addr, multiplex[virt].saved);
+ }
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
@@ -252,34 +276,6 @@ static int nmi_setup(void)
return 0;
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void nmi_cpu_save_mpx_registers(struct op_msrs *msrs)
-{
- struct op_msr *multiplex = msrs->multiplex;
- int i;
-
- for (i = 0; i < model->num_counters; ++i) {
- int virt = op_x86_phys_to_virt(i);
- if (multiplex[virt].addr)
- rdmsrl(multiplex[virt].addr, multiplex[virt].saved);
- }
-}
-
-static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
-{
- struct op_msr *multiplex = msrs->multiplex;
- int i;
-
- for (i = 0; i < model->num_counters; ++i) {
- int virt = op_x86_phys_to_virt(i);
- if (multiplex[virt].addr)
- wrmsrl(multiplex[virt].addr, multiplex[virt].saved);
- }
-}
-
-#endif
-
static void nmi_cpu_restore_registers(struct op_msrs *msrs)
{
struct op_msr *counters = msrs->counters;
--
1.6.3.3
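
What nmi_cpu_save/restore_mpx_registers() accomplish can be modeled
with a small userspace sketch. Illustrative only; rdmsrl()/wrmsrl()
are replaced by a plain array and all constants are made up:

#include <stdio.h>

#define NUM_COUNTERS      2
#define NUM_VIRT_COUNTERS 4

struct msr { unsigned long addr, saved; };

static struct msr multiplex[NUM_VIRT_COUNTERS];
static unsigned long hw_counter[NUM_COUNTERS]; /* stands in for the MSRs */
static int switch_index;

static int phys_to_virt(int phys) { return switch_index + phys; }

/* park the live hardware values in the slots of the outgoing set */
static void save_mpx_registers(void)
{
        int i;
        for (i = 0; i < NUM_COUNTERS; i++)
                multiplex[phys_to_virt(i)].saved = hw_counter[i];
}

/* load the hardware with the remembered values of the incoming set */
static void restore_mpx_registers(void)
{
        int i;
        for (i = 0; i < NUM_COUNTERS; i++)
                hw_counter[i] = multiplex[phys_to_virt(i)].saved;
}

int main(void)
{
        hw_counter[0] = 100;
        hw_counter[1] = 200;
        save_mpx_registers();          /* set 0 parked */
        switch_index = NUM_COUNTERS;   /* rotate to set 1 */
        restore_mpx_registers();
        save_mpx_registers();          /* set 1 parked again */
        switch_index = 0;              /* rotate back */
        restore_mpx_registers();
        printf("hw: %lu %lu\n", hw_counter[0], hw_counter[1]);
        return 0;
}

The counter values of a set survive a full rotation: the program
prints "hw: 100 200" again after switching away and back.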

Subject: [PATCH 11/26] oprofile: Introduce op_x86_phys_to_virt()

This new function translates physical to virtual counter numbers.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 43 +++++++++++---------
arch/x86/oprofile/op_model_amd.c | 80 +++++++++++++++-----------------------
arch/x86/oprofile/op_x86_model.h | 1 +
3 files changed, 55 insertions(+), 69 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index b211d33..02b57b8 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -27,12 +27,6 @@
#include "op_counter.h"
#include "op_x86_model.h"

-
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-DEFINE_PER_CPU(int, switch_index);
-#endif
-
-
static struct op_x86_model_spec const *model;
static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
static DEFINE_PER_CPU(unsigned long, saved_lvtpc);
@@ -103,6 +97,21 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs)
}
}

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static DEFINE_PER_CPU(int, switch_index);
+
+inline int op_x86_phys_to_virt(int phys)
+{
+ return __get_cpu_var(switch_index) + phys;
+}
+
+#else
+
+inline int op_x86_phys_to_virt(int phys) { return phys; }
+
+#endif
+
static void free_msrs(void)
{
int i;
@@ -248,31 +257,25 @@ static int nmi_setup(void)

static void nmi_cpu_save_mpx_registers(struct op_msrs *msrs)
{
- unsigned int si = __get_cpu_var(switch_index);
struct op_msr *multiplex = msrs->multiplex;
- unsigned int i;
+ int i;

for (i = 0; i < model->num_counters; ++i) {
- int offset = i + si;
- if (multiplex[offset].addr) {
- rdmsrl(multiplex[offset].addr,
- multiplex[offset].saved);
- }
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ rdmsrl(multiplex[virt].addr, multiplex[virt].saved);
}
}

static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
{
- unsigned int si = __get_cpu_var(switch_index);
struct op_msr *multiplex = msrs->multiplex;
- unsigned int i;
+ int i;

for (i = 0; i < model->num_counters; ++i) {
- int offset = i + si;
- if (multiplex[offset].addr) {
- wrmsrl(multiplex[offset].addr,
- multiplex[offset].saved);
- }
+ int virt = op_x86_phys_to_virt(i);
+ if (multiplex[virt].addr)
+ wrmsrl(multiplex[virt].addr, multiplex[virt].saved);
}
}

diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index dcfd450..67f830d 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -42,9 +42,6 @@
#define MSR_AMD_EVENTSEL_RESERVED ((0xFFFFFCF0ULL<<32)|(1ULL<<21))

static unsigned long reset_value[NUM_VIRT_COUNTERS];
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-DECLARE_PER_CPU(int, switch_index);
-#endif

#ifdef CONFIG_OPROFILE_IBS

@@ -141,21 +138,20 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,

/* enable active counters */
for (i = 0; i < NUM_COUNTERS; ++i) {
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- int offset = i + __get_cpu_var(switch_index);
-#else
- int offset = i;
-#endif
- if (counter_config[offset].enabled && msrs->counters[i].addr) {
- /* setup counter registers */
- wrmsrl(msrs->counters[i].addr, -(u64)reset_value[offset]);
-
- /* setup control registers */
- rdmsrl(msrs->controls[i].addr, val);
- val &= model->reserved;
- val |= op_x86_get_ctrl(model, &counter_config[offset]);
- wrmsrl(msrs->controls[i].addr, val);
- }
+ int virt = op_x86_phys_to_virt(i);
+ if (!counter_config[virt].enabled)
+ continue;
+ if (!msrs->counters[i].addr)
+ continue;
+
+ /* setup counter registers */
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]);
+
+ /* setup control registers */
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[virt]);
+ wrmsrl(msrs->controls[i].addr, val);
}
}

@@ -170,14 +166,13 @@ static void op_amd_switch_ctrl(struct op_x86_model_spec const *model,

/* enable active counters */
for (i = 0; i < NUM_COUNTERS; ++i) {
- int offset = i + __get_cpu_var(switch_index);
- if (counter_config[offset].enabled) {
- /* setup control registers */
- rdmsrl(msrs->controls[i].addr, val);
- val &= model->reserved;
- val |= op_x86_get_ctrl(model, &counter_config[offset]);
- wrmsrl(msrs->controls[i].addr, val);
- }
+ int virt = op_x86_phys_to_virt(i);
+ if (!counter_config[virt].enabled)
+ continue;
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[virt]);
+ wrmsrl(msrs->controls[i].addr, val);
}
}

@@ -292,19 +287,15 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
int i;

for (i = 0; i < NUM_COUNTERS; ++i) {
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- int offset = i + __get_cpu_var(switch_index);
-#else
- int offset = i;
-#endif
- if (!reset_value[offset])
+ int virt = op_x86_phys_to_virt(i);
+ if (!reset_value[virt])
continue;
rdmsrl(msrs->counters[i].addr, val);
/* bit is clear if overflowed: */
if (val & OP_CTR_OVERFLOW)
continue;
- oprofile_add_sample(regs, offset);
- wrmsrl(msrs->counters[i].addr, -(u64)reset_value[offset]);
+ oprofile_add_sample(regs, virt);
+ wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]);
}

op_amd_handle_ibs(regs, msrs);
@@ -319,16 +310,11 @@ static void op_amd_start(struct op_msrs const * const msrs)
int i;

for (i = 0; i < NUM_COUNTERS; ++i) {
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- int offset = i + __get_cpu_var(switch_index);
-#else
- int offset = i;
-#endif
- if (reset_value[offset]) {
- rdmsrl(msrs->controls[i].addr, val);
- val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
- wrmsrl(msrs->controls[i].addr, val);
- }
+ if (!reset_value[op_x86_phys_to_virt(i)])
+ continue;
+ rdmsrl(msrs->controls[i].addr, val);
+ val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
+ wrmsrl(msrs->controls[i].addr, val);
}

op_amd_start_ibs();
@@ -344,11 +330,7 @@ static void op_amd_stop(struct op_msrs const * const msrs)
* pm callback
*/
for (i = 0; i < NUM_COUNTERS; ++i) {
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- if (!reset_value[i + per_cpu(switch_index, smp_processor_id())])
-#else
- if (!reset_value[i])
-#endif
+ if (!reset_value[op_x86_phys_to_virt(i)])
continue;
rdmsrl(msrs->controls[i].addr, val);
val &= ~ARCH_PERFMON_EVENTSEL0_ENABLE;
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index 0d07d23..e874dc3 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -60,6 +60,7 @@ struct op_counter_config;

extern u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
struct op_counter_config *counter_config);
+extern int op_x86_phys_to_virt(int phys);

extern struct op_x86_model_spec const op_ppro_spec;
extern struct op_x86_model_spec const op_p4_spec;
--
1.6.3.3
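
The pattern this patch introduces, a helper that hides the #ifdef
once instead of repeating it in every loop, compiles standalone in
the following sketch (CONFIG_MUX is a made-up stand-in for
CONFIG_OPROFILE_EVENT_MULTIPLEX):

#include <stdio.h>

#define CONFIG_MUX 1   /* set to 0 to model a build without multiplexing */

static int switch_index;

#if CONFIG_MUX
static inline int phys_to_virt(int phys) { return switch_index + phys; }
#else
static inline int phys_to_virt(int phys) { return phys; }
#endif

int main(void)
{
        switch_index = 4;
        printf("phys 1 -> virt %d\n", phys_to_virt(1));
        return 0;
}

Call sites such as op_amd_check_ctrs() then stay free of #ifdefs, and
the non-multiplexing build still compiles to the identity mapping.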

Subject: [PATCH 17/26] x86/oprofile: Remove const qualifier from struct op_x86_model_spec

This patch removes the const qualifier from struct
op_x86_model_spec to make the model parameters changeable.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 4 ++--
arch/x86/oprofile/op_model_amd.c | 2 +-
arch/x86/oprofile/op_model_p4.c | 4 ++--
arch/x86/oprofile/op_model_ppro.c | 2 +-
arch/x86/oprofile/op_x86_model.h | 8 ++++----
5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 998c7dc..826f391 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -27,7 +27,7 @@
#include "op_counter.h"
#include "op_x86_model.h"

-static struct op_x86_model_spec const *model;
+static struct op_x86_model_spec *model;
static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
static DEFINE_PER_CPU(unsigned long, saved_lvtpc);

@@ -542,7 +542,7 @@ module_param_call(cpu_type, force_cpu_type, NULL, NULL, 0);
static int __init ppro_init(char **cpu_type)
{
__u8 cpu_model = boot_cpu_data.x86_model;
- struct op_x86_model_spec const *spec = &op_ppro_spec; /* default */
+ struct op_x86_model_spec *spec = &op_ppro_spec; /* default */

if (force_arch_perfmon && cpu_has_arch_perfmon)
return 0;
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 644980f..39604b4 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -526,7 +526,7 @@ static void op_amd_exit(void) {}

#endif /* CONFIG_OPROFILE_IBS */

-struct op_x86_model_spec const op_amd_spec = {
+struct op_x86_model_spec op_amd_spec = {
.num_counters = NUM_COUNTERS,
.num_controls = NUM_CONTROLS,
.num_virt_counters = NUM_VIRT_COUNTERS,
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 65b9237..40df028 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -695,7 +695,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)


#ifdef CONFIG_SMP
-struct op_x86_model_spec const op_p4_ht2_spec = {
+struct op_x86_model_spec op_p4_ht2_spec = {
.num_counters = NUM_COUNTERS_HT2,
.num_controls = NUM_CONTROLS_HT2,
.num_virt_counters = NUM_COUNTERS_HT2,
@@ -709,7 +709,7 @@ struct op_x86_model_spec const op_p4_ht2_spec = {
};
#endif

-struct op_x86_model_spec const op_p4_spec = {
+struct op_x86_model_spec op_p4_spec = {
.num_counters = NUM_COUNTERS_NON_HT,
.num_controls = NUM_CONTROLS_NON_HT,
.num_virt_counters = NUM_COUNTERS_NON_HT,
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 098cbca..659f3b6 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -203,7 +203,7 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
}


-struct op_x86_model_spec const op_ppro_spec = {
+struct op_x86_model_spec op_ppro_spec = {
.num_counters = 2,
.num_controls = 2,
.num_virt_counters = 2,
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index e874dc3..0c886fa 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -62,10 +62,10 @@ extern u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
struct op_counter_config *counter_config);
extern int op_x86_phys_to_virt(int phys);

-extern struct op_x86_model_spec const op_ppro_spec;
-extern struct op_x86_model_spec const op_p4_spec;
-extern struct op_x86_model_spec const op_p4_ht2_spec;
-extern struct op_x86_model_spec const op_amd_spec;
+extern struct op_x86_model_spec op_ppro_spec;
+extern struct op_x86_model_spec op_p4_spec;
+extern struct op_x86_model_spec op_p4_ht2_spec;
+extern struct op_x86_model_spec op_amd_spec;
extern struct op_x86_model_spec op_arch_perfmon_spec;

#endif /* OP_X86_MODEL_H */
--
1.6.3.3

Subject: [PATCH 08/26] oprofile: Rename variable timeout_jiffies and move to oprofile_files.c

This patch renames timeout_jiffies to the oprofile-specific name
oprofile_time_slice. The macro MULTIPLEXING_TIMER_DEFAULT is renamed
accordingly.

Also, since this variable is controlled using oprofilefs, its
definition is moved to oprofile_files.c.

Signed-off-by: Robert Richter <[email protected]>
---
drivers/oprofile/oprof.c | 9 ++++-----
drivers/oprofile/oprof.h | 3 ++-
drivers/oprofile/oprofile_files.c | 5 +++--
3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 42c9c76..2b33de7 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -33,8 +33,7 @@ static DEFINE_MUTEX(start_mutex);

static void switch_worker(struct work_struct *work);
static DECLARE_DELAYED_WORK(switch_work, switch_worker);
-unsigned long timeout_jiffies;
-#define MULTIPLEXING_TIMER_DEFAULT 1
+#define TIME_SLICE_DEFAULT 1

#endif

@@ -102,7 +101,7 @@ out:

static void start_switch_worker(void)
{
- schedule_delayed_work(&switch_work, timeout_jiffies);
+ schedule_delayed_work(&switch_work, oprofile_time_slice);
}

static void switch_worker(struct work_struct *work)
@@ -216,7 +215,7 @@ int oprofile_set_timeout(unsigned long val_msec)
goto out;
}

- timeout_jiffies = time_slice;
+ oprofile_time_slice = time_slice;

out:
mutex_unlock(&start_mutex);
@@ -253,7 +252,7 @@ out:

static void __init oprofile_multiplexing_init(void)
{
- timeout_jiffies = msecs_to_jiffies(MULTIPLEXING_TIMER_DEFAULT);
+ oprofile_time_slice = msecs_to_jiffies(TIME_SLICE_DEFAULT);
}

#endif
diff --git a/drivers/oprofile/oprof.h b/drivers/oprofile/oprof.h
index ee38abc..cb92f5c 100644
--- a/drivers/oprofile/oprof.h
+++ b/drivers/oprofile/oprof.h
@@ -24,10 +24,11 @@ struct oprofile_operations;
extern unsigned long oprofile_buffer_size;
extern unsigned long oprofile_cpu_buffer_size;
extern unsigned long oprofile_buffer_watershed;
+extern unsigned long oprofile_time_slice;
+
extern struct oprofile_operations oprofile_ops;
extern unsigned long oprofile_started;
extern unsigned long oprofile_backtrace_depth;
-extern unsigned long timeout_jiffies;

struct super_block;
struct dentry;
diff --git a/drivers/oprofile/oprofile_files.c b/drivers/oprofile/oprofile_files.c
index 468ec3e..4c5b947 100644
--- a/drivers/oprofile/oprofile_files.c
+++ b/drivers/oprofile/oprofile_files.c
@@ -22,14 +22,15 @@
unsigned long oprofile_buffer_size;
unsigned long oprofile_cpu_buffer_size;
unsigned long oprofile_buffer_watershed;
+unsigned long oprofile_time_slice;

#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX

static ssize_t timeout_read(struct file *file, char __user *buf,
size_t count, loff_t *offset)
{
- return oprofilefs_ulong_to_user(jiffies_to_msecs(timeout_jiffies),
- buf, count, offset);
+ return oprofilefs_ulong_to_user(jiffies_to_msecs(oprofile_time_slice),
+ buf, count, offset);
}


--
1.6.3.3

Subject: [PATCH 05/26] x86/oprofile: Use per_cpu() instead of __get_cpu_var()

__get_cpu_var() calls smp_processor_id(). When the cpu id is already
known, use per_cpu() instead to avoid computing the id again.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index e54f6a0..8cd4658 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -294,7 +294,7 @@ static void nmi_cpu_shutdown(void *dummy)
{
unsigned int v;
int cpu = smp_processor_id();
- struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
+ struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);

/* restoring APIC_LVTPC can trigger an apic error because the delivery
* mode and vector nr combination can be illegal. That's by design: on
@@ -307,7 +307,7 @@ static void nmi_cpu_shutdown(void *dummy)
apic_write(APIC_LVTERR, v);
nmi_cpu_restore_registers(msrs);
#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- __get_cpu_var(switch_index) = 0;
+ per_cpu(switch_index, cpu) = 0;
#endif
}

--
1.6.3.3

Subject: [PATCH 24/26] x86/oprofile: Implement op_x86_virt_to_phys()

This patch implements a common x86 function to convert virtual counter
numbers to physical.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 6 ++++++
arch/x86/oprofile/op_model_amd.c | 2 +-
arch/x86/oprofile/op_x86_model.h | 1 +
3 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 7b3362f..5856e61 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -129,6 +129,11 @@ inline int op_x86_phys_to_virt(int phys)
return __get_cpu_var(switch_index) + phys;
}

+inline int op_x86_virt_to_phys(int virt)
+{
+ return virt % model->num_counters;
+}
+
static void nmi_shutdown_mux(void)
{
int i;
@@ -270,6 +275,7 @@ static void mux_clone(int cpu)
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
+inline int op_x86_virt_to_phys(int virt) { return virt; }
static inline void nmi_shutdown_mux(void) { }
static inline int nmi_setup_mux(void) { return 1; }
static inline void
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index dce69b5..1ea1982 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -81,7 +81,7 @@ static void op_mux_fill_in_addresses(struct op_msrs * const msrs)
int i;

for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
- int hw_counter = i % NUM_COUNTERS;
+ int hw_counter = op_x86_virt_to_phys(i);
if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
else
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index 4e2e7c2..b837761 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -60,6 +60,7 @@ struct op_counter_config;
extern u64 op_x86_get_ctrl(struct op_x86_model_spec const *model,
struct op_counter_config *counter_config);
extern int op_x86_phys_to_virt(int phys);
+extern int op_x86_virt_to_phys(int virt);

extern struct op_x86_model_spec op_ppro_spec;
extern struct op_x86_model_spec op_p4_spec;
--
1.6.3.3
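
The two mappings are consistent for the currently active event set,
which a short sketch can verify. Illustrative only; the constants are
made up:

#include <stdio.h>
#include <assert.h>

#define NUM_COUNTERS      4
#define NUM_VIRT_COUNTERS 8

static int switch_index;

static int phys_to_virt(int phys) { return switch_index + phys; }
static int virt_to_phys(int virt) { return virt % NUM_COUNTERS; }

int main(void)
{
        int virt;

        /* with switch_index = 4, virtual counters 4..7 are live */
        switch_index = 4;
        for (virt = 0; virt < NUM_VIRT_COUNTERS; virt++)
                printf("virt %d -> phys %d\n", virt, virt_to_phys(virt));

        /* round trip through the active set is the identity */
        assert(virt_to_phys(phys_to_virt(2)) == 2);
        return 0;
}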

Subject: [PATCH 21/26] x86/oprofile: Enable multiplexing only if the model supports it

This patch adds a check whether the model supports multiplexing;
only then is multiplexing enabled. The check is added to the common
x86 initialization code.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index dca7240..f0fb447 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -258,6 +258,12 @@ static int nmi_switch_event(void)
return 0;
}

+static inline void mux_init(struct oprofile_operations *ops)
+{
+ if (has_mux())
+ ops->switch_events = nmi_switch_event;
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
@@ -265,6 +271,7 @@ static inline void nmi_shutdown_mux(void) { }
static inline int nmi_setup_mux(void) { return 1; }
static inline void
nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }
+static inline void mux_init(struct oprofile_operations *ops) { }

#endif

@@ -682,9 +689,6 @@ int __init op_nmi_init(struct oprofile_operations *ops)
ops->start = nmi_start;
ops->stop = nmi_stop;
ops->cpu_type = cpu_type;
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- ops->switch_events = nmi_switch_event;
-#endif

if (model->init)
ret = model->init(ops);
@@ -694,6 +698,8 @@ int __init op_nmi_init(struct oprofile_operations *ops)
if (!model->num_virt_counters)
model->num_virt_counters = model->num_counters;

+ mux_init(ops);
+
init_sysfs();
using_nmi = 1;
printk(KERN_INFO "oprofile: using NMI interrupt.\n");
--
1.6.3.3
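
The gating follows a common kernel idiom: capability discovery via a
NULL check on an optional function pointer. A minimal sketch, with
made-up model names:

#include <stdio.h>
#include <stddef.h>

struct model_spec {
        const char *name;
        void (*switch_ctrl)(void);  /* NULL if the model cannot multiplex */
};

static void amd_switch_ctrl(void) { /* reprogram control registers */ }

static struct model_spec amd  = { "amd",  amd_switch_ctrl };
static struct model_spec ppro = { "ppro", NULL };

static int has_mux(struct model_spec *m)
{
        return m->switch_ctrl != NULL;
}

int main(void)
{
        printf("%s: multiplexing %s\n", amd.name,
               has_mux(&amd) ? "enabled" : "disabled");
        printf("%s: multiplexing %s\n", ppro.name,
               has_mux(&ppro) ? "enabled" : "disabled");
        return 0;
}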

Subject: [PATCH 09/26] oprofile: Remove oprofile_multiplexing_init()

oprofile_multiplexing_init() can be removed by moving the
initialization of oprofile_time_slice to oprofile_create_files().

Signed-off-by: Robert Richter <[email protected]>
---
drivers/oprofile/oprof.c | 14 --------------
drivers/oprofile/oprofile_files.c | 2 ++
2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 2b33de7..fa6cccd 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -33,7 +33,6 @@ static DEFINE_MUTEX(start_mutex);

static void switch_worker(struct work_struct *work);
static DECLARE_DELAYED_WORK(switch_work, switch_worker);
-#define TIME_SLICE_DEFAULT 1

#endif

@@ -248,23 +247,10 @@ out:
return err;
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void __init oprofile_multiplexing_init(void)
-{
- oprofile_time_slice = msecs_to_jiffies(TIME_SLICE_DEFAULT);
-}
-
-#endif
-
static int __init oprofile_init(void)
{
int err;

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- oprofile_multiplexing_init();
-#endif
-
err = oprofile_arch_init(&oprofile_ops);

if (err < 0 || timer) {
diff --git a/drivers/oprofile/oprofile_files.c b/drivers/oprofile/oprofile_files.c
index 4c5b947..bbd7516 100644
--- a/drivers/oprofile/oprofile_files.c
+++ b/drivers/oprofile/oprofile_files.c
@@ -18,6 +18,7 @@
#define BUFFER_SIZE_DEFAULT 131072
#define CPU_BUFFER_SIZE_DEFAULT 8192
#define BUFFER_WATERSHED_DEFAULT 32768 /* FIXME: tune */
+#define TIME_SLICE_DEFAULT 1

unsigned long oprofile_buffer_size;
unsigned long oprofile_cpu_buffer_size;
@@ -170,6 +171,7 @@ void oprofile_create_files(struct super_block *sb, struct dentry *root)
oprofile_buffer_size = BUFFER_SIZE_DEFAULT;
oprofile_cpu_buffer_size = CPU_BUFFER_SIZE_DEFAULT;
oprofile_buffer_watershed = BUFFER_WATERSHED_DEFAULT;
+ oprofile_time_slice = msecs_to_jiffies(TIME_SLICE_DEFAULT);

oprofilefs_create_file(sb, root, "enable", &enable_fops);
oprofilefs_create_file_perm(sb, root, "dump", &dump_fops, 0666);
--
1.6.3.3

Subject: [PATCH 02/26] x86/oprofile: Whitespaces changes only

This patch only fixes whitespace in code that will be touched in
follow-on patches.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 12 ++++++------
arch/x86/oprofile/op_model_amd.c | 12 ++++++------
arch/x86/oprofile/op_model_p4.c | 8 ++++----
arch/x86/oprofile/op_model_ppro.c | 8 ++++----
4 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 25da1e1..fca8dc9 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -516,12 +516,12 @@ int __init op_nmi_init(struct oprofile_operations *ops)
register_cpu_notifier(&oprofile_cpu_nb);
#endif
/* default values, can be overwritten by model */
- ops->create_files = nmi_create_files;
- ops->setup = nmi_setup;
- ops->shutdown = nmi_shutdown;
- ops->start = nmi_start;
- ops->stop = nmi_stop;
- ops->cpu_type = cpu_type;
+ ops->create_files = nmi_create_files;
+ ops->setup = nmi_setup;
+ ops->shutdown = nmi_shutdown;
+ ops->start = nmi_start;
+ ops->stop = nmi_stop;
+ ops->cpu_type = cpu_type;

if (model->init)
ret = model->init(ops);
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 7ca8306..f676f88 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -91,7 +91,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
int i;

/* clear all counters */
- for (i = 0 ; i < NUM_CONTROLS; ++i) {
+ for (i = 0; i < NUM_CONTROLS; ++i) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsrl(msrs->controls[i].addr, val);
@@ -229,7 +229,7 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
u64 val;
int i;

- for (i = 0 ; i < NUM_COUNTERS; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (!reset_value[i])
continue;
rdmsrl(msrs->counters[i].addr, val);
@@ -250,7 +250,7 @@ static void op_amd_start(struct op_msrs const * const msrs)
{
u64 val;
int i;
- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (reset_value[i]) {
rdmsrl(msrs->controls[i].addr, val);
val |= ARCH_PERFMON_EVENTSEL0_ENABLE;
@@ -270,7 +270,7 @@ static void op_amd_stop(struct op_msrs const * const msrs)
* Subtle: stop on all counters to avoid race with setting our
* pm callback
*/
- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (!reset_value[i])
continue;
rdmsrl(msrs->controls[i].addr, val);
@@ -285,11 +285,11 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+ for (i = 0; i < NUM_COUNTERS; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
}
- for (i = 0 ; i < NUM_CONTROLS ; ++i) {
+ for (i = 0; i < NUM_CONTROLS; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
}
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 9db9e36..5921b7f 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -558,7 +558,7 @@ static void p4_setup_ctrs(struct op_x86_model_spec const *model,
}

/* clear the cccrs we will use */
- for (i = 0 ; i < num_counters ; i++) {
+ for (i = 0; i < num_counters; i++) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsr(p4_counters[VIRT_CTR(stag, i)].cccr_address, low, high);
@@ -575,7 +575,7 @@ static void p4_setup_ctrs(struct op_x86_model_spec const *model,
}

/* setup all counters */
- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (counter_config[i].enabled && msrs->controls[i].addr) {
reset_value[i] = counter_config[i].count;
pmc_setup_one_p4_counter(i);
@@ -678,7 +678,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(msrs->counters[i].addr);
}
@@ -687,7 +687,7 @@ static void p4_shutdown(struct op_msrs const * const msrs)
* conjunction with the counter registers (hence the starting offset).
* This saves a few bits.
*/
- for (i = num_counters ; i < num_controls ; ++i) {
+ for (i = num_counters; i < num_controls; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(msrs->controls[i].addr);
}
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index cd72d5c..570d717 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -81,7 +81,7 @@ static void ppro_setup_ctrs(struct op_x86_model_spec const *model,
}

/* clear all counters */
- for (i = 0 ; i < num_counters; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (unlikely(!msrs->controls[i].addr))
continue;
rdmsrl(msrs->controls[i].addr, val);
@@ -125,7 +125,7 @@ static int ppro_check_ctrs(struct pt_regs * const regs,
if (unlikely(!reset_value))
goto out;

- for (i = 0 ; i < num_counters; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (!reset_value[i])
continue;
rdmsrl(msrs->counters[i].addr, val);
@@ -188,11 +188,11 @@ static void ppro_shutdown(struct op_msrs const * const msrs)
{
int i;

- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->counters[i].addr)
release_perfctr_nmi(MSR_P6_PERFCTR0 + i);
}
- for (i = 0 ; i < num_counters ; ++i) {
+ for (i = 0; i < num_counters; ++i) {
if (msrs->controls[i].addr)
release_evntsel_nmi(MSR_P6_EVNTSEL0 + i);
}
--
1.6.3.3

Subject: [PATCH 18/26] x86/oprofile: Remove unused num_virt_controls from struct op_x86_model_spec

The member num_virt_controls of struct op_x86_model_spec is not
used. This patch removes it.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/op_model_amd.c | 1 -
arch/x86/oprofile/op_model_p4.c | 2 --
arch/x86/oprofile/op_model_ppro.c | 1 -
arch/x86/oprofile/op_x86_model.h | 1 -
4 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 39604b4..dce69b5 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -530,7 +530,6 @@ struct op_x86_model_spec op_amd_spec = {
.num_counters = NUM_COUNTERS,
.num_controls = NUM_CONTROLS,
.num_virt_counters = NUM_VIRT_COUNTERS,
- .num_virt_controls = NUM_VIRT_CONTROLS,
.reserved = MSR_AMD_EVENTSEL_RESERVED,
.event_mask = OP_EVENT_MASK,
.init = op_amd_init,
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 40df028..0a4f2de 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -699,7 +699,6 @@ struct op_x86_model_spec op_p4_ht2_spec = {
.num_counters = NUM_COUNTERS_HT2,
.num_controls = NUM_CONTROLS_HT2,
.num_virt_counters = NUM_COUNTERS_HT2,
- .num_virt_controls = NUM_CONTROLS_HT2,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
@@ -713,7 +712,6 @@ struct op_x86_model_spec op_p4_spec = {
.num_counters = NUM_COUNTERS_NON_HT,
.num_controls = NUM_CONTROLS_NON_HT,
.num_virt_counters = NUM_COUNTERS_NON_HT,
- .num_virt_controls = NUM_CONTROLS_NON_HT,
.fill_in_addresses = &p4_fill_in_addresses,
.setup_ctrs = &p4_setup_ctrs,
.check_ctrs = &p4_check_ctrs,
diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
index 659f3b6..753a02a 100644
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -207,7 +207,6 @@ struct op_x86_model_spec op_ppro_spec = {
.num_counters = 2,
.num_controls = 2,
.num_virt_counters = 2,
- .num_virt_controls = 2,
.reserved = MSR_PPRO_EVENTSEL_RESERVED,
.fill_in_addresses = &ppro_fill_in_addresses,
.setup_ctrs = &ppro_setup_ctrs,
diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
index 0c886fa..4e2e7c2 100644
--- a/arch/x86/oprofile/op_x86_model.h
+++ b/arch/x86/oprofile/op_x86_model.h
@@ -37,7 +37,6 @@ struct op_x86_model_spec {
unsigned int num_counters;
unsigned int num_controls;
unsigned int num_virt_counters;
- unsigned int num_virt_controls;
u64 reserved;
u16 event_mask;
int (*init)(struct oprofile_operations *ops);
--
1.6.3.3

Subject: [PATCH 13/26] x86/oprofile: Implement multiplexing setup/shutdown functions

This patch implements the functions nmi_setup_mux() and
nmi_shutdown_mux() to set up and shut down multiplexing. The
multiplexing code in nmi_int.c is now much better separated.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 76 ++++++++++++++++++++++--------------------
1 files changed, 40 insertions(+), 36 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 02b57b8..674fa37 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -106,9 +106,35 @@ inline int op_x86_phys_to_virt(int phys)
return __get_cpu_var(switch_index) + phys;
}

+static void nmi_shutdown_mux(void)
+{
+ int i;
+ for_each_possible_cpu(i) {
+ kfree(per_cpu(cpu_msrs, i).multiplex);
+ per_cpu(cpu_msrs, i).multiplex = NULL;
+ per_cpu(switch_index, i) = 0;
+ }
+}
+
+static int nmi_setup_mux(void)
+{
+ size_t multiplex_size =
+ sizeof(struct op_msr) * model->num_virt_counters;
+ int i;
+ for_each_possible_cpu(i) {
+ per_cpu(cpu_msrs, i).multiplex =
+ kmalloc(multiplex_size, GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).multiplex)
+ return 0;
+ }
+ return 1;
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
+static inline void nmi_shutdown_mux(void) { }
+static inline int nmi_setup_mux(void) { return 1; }

#endif

@@ -120,51 +146,27 @@ static void free_msrs(void)
per_cpu(cpu_msrs, i).counters = NULL;
kfree(per_cpu(cpu_msrs, i).controls);
per_cpu(cpu_msrs, i).controls = NULL;
-
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- kfree(per_cpu(cpu_msrs, i).multiplex);
- per_cpu(cpu_msrs, i).multiplex = NULL;
-#endif
}
}

static int allocate_msrs(void)
{
- int success = 1;
size_t controls_size = sizeof(struct op_msr) * model->num_controls;
size_t counters_size = sizeof(struct op_msr) * model->num_counters;
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- size_t multiplex_size = sizeof(struct op_msr) * model->num_virt_counters;
-#endif

int i;
for_each_possible_cpu(i) {
per_cpu(cpu_msrs, i).counters = kmalloc(counters_size,
- GFP_KERNEL);
- if (!per_cpu(cpu_msrs, i).counters) {
- success = 0;
- break;
- }
+ GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).counters)
+ return 0;
per_cpu(cpu_msrs, i).controls = kmalloc(controls_size,
- GFP_KERNEL);
- if (!per_cpu(cpu_msrs, i).controls) {
- success = 0;
- break;
- }
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- per_cpu(cpu_msrs, i).multiplex =
- kmalloc(multiplex_size, GFP_KERNEL);
- if (!per_cpu(cpu_msrs, i).multiplex) {
- success = 0;
- break;
- }
-#endif
+ GFP_KERNEL);
+ if (!per_cpu(cpu_msrs, i).controls)
+ return 0;
}

- if (!success)
- free_msrs();
-
- return success;
+ return 1;
}

#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
@@ -218,11 +220,15 @@ static int nmi_setup(void)
int cpu;

if (!allocate_msrs())
- return -ENOMEM;
+ err = -ENOMEM;
+ else if (!nmi_setup_mux())
+ err = -ENOMEM;
+ else
+ err = register_die_notifier(&profile_exceptions_nb);

- err = register_die_notifier(&profile_exceptions_nb);
if (err) {
free_msrs();
+ nmi_shutdown_mux();
return err;
}

@@ -314,9 +320,6 @@ static void nmi_cpu_shutdown(void *dummy)
apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu));
apic_write(APIC_LVTERR, v);
nmi_cpu_restore_registers(msrs);
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- per_cpu(switch_index, cpu) = 0;
-#endif
}

static void nmi_shutdown(void)
@@ -326,6 +329,7 @@ static void nmi_shutdown(void)
nmi_enabled = 0;
on_each_cpu(nmi_cpu_shutdown, NULL, 1);
unregister_die_notifier(&profile_exceptions_nb);
+ nmi_shutdown_mux();
msrs = &get_cpu_var(cpu_msrs);
model->shutdown(msrs);
free_msrs();
--
1.6.3.3
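
The allocation scheme, one multiplex buffer per possible cpu with a
single unwind path on failure, can be sketched in userspace as
follows (illustrative; kmalloc()/kfree() become calloc()/free() and
the per-cpu data is a plain array):

#include <stdio.h>
#include <stdlib.h>

#define NR_CPUS           4
#define NUM_VIRT_COUNTERS 8

struct msr { unsigned long addr, saved; };

static struct msr *multiplex[NR_CPUS]; /* models per_cpu(cpu_msrs, i).multiplex */

static void shutdown_mux(void)
{
        int i;
        for (i = 0; i < NR_CPUS; i++) {
                free(multiplex[i]);    /* free(NULL) is a no-op */
                multiplex[i] = NULL;
        }
}

static int setup_mux(void)
{
        int i;
        for (i = 0; i < NR_CPUS; i++) {
                multiplex[i] = calloc(NUM_VIRT_COUNTERS, sizeof(struct msr));
                if (!multiplex[i])
                        return 0;      /* caller unwinds via shutdown_mux() */
        }
        return 1;
}

int main(void)
{
        if (!setup_mux()) {
                shutdown_mux();
                return 1;
        }
        printf("multiplex buffers allocated\n");
        shutdown_mux();
        return 0;
}

This mirrors nmi_setup(), which calls both free_msrs() and
nmi_shutdown_mux() on any failure, so a partial allocation never
leaks.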

Subject: [PATCH 26/26] x86/oprofile: Small coding style fixes

Some small coding style fixes.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/op_model_amd.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 1ea1982..827beec 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -144,11 +144,10 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,

/* setup reset_value */
for (i = 0; i < NUM_VIRT_COUNTERS; ++i) {
- if (counter_config[i].enabled) {
+ if (counter_config[i].enabled)
reset_value[i] = counter_config[i].count;
- } else {
+ else
reset_value[i] = 0;
- }
}

/* clear all counters */
--
1.6.3.3

Subject: [PATCH 16/26] x86/oprofile: Moving nmi_cpu_switch() in nmi_int.c

This patch moves code in nmi_int.c to consolidate the multiplexing
code into a single, separate section.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 144 +++++++++++++++++++++----------------------
1 files changed, 70 insertions(+), 74 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index f38c5cf..998c7dc 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -97,6 +97,29 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs)
}
}

+static void nmi_cpu_start(void *dummy)
+{
+ struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+ model->start(msrs);
+}
+
+static int nmi_start(void)
+{
+ on_each_cpu(nmi_cpu_start, NULL, 1);
+ return 0;
+}
+
+static void nmi_cpu_stop(void *dummy)
+{
+ struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+ model->stop(msrs);
+}
+
+static void nmi_stop(void)
+{
+ on_each_cpu(nmi_cpu_stop, NULL, 1);
+}
+
#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX

static DEFINE_PER_CPU(int, switch_index);
@@ -171,6 +194,53 @@ static void nmi_cpu_restore_mpx_registers(struct op_msrs *msrs)
}
}

+static void nmi_cpu_switch(void *dummy)
+{
+ int cpu = smp_processor_id();
+ int si = per_cpu(switch_index, cpu);
+ struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
+
+ nmi_cpu_stop(NULL);
+ nmi_cpu_save_mpx_registers(msrs);
+
+ /* move to next set */
+ si += model->num_counters;
+ if ((si > model->num_virt_counters) || (counter_config[si].count == 0))
+ per_cpu(switch_index, cpu) = 0;
+ else
+ per_cpu(switch_index, cpu) = si;
+
+ model->switch_ctrl(model, msrs);
+ nmi_cpu_restore_mpx_registers(msrs);
+
+ nmi_cpu_start(NULL);
+}
+
+
+/*
+ * Quick check to see if multiplexing is necessary.
+ * The check should be sufficient since counters are used
+ * in order.
+ */
+static int nmi_multiplex_on(void)
+{
+ return counter_config[model->num_counters].count ? 0 : -EINVAL;
+}
+
+static int nmi_switch_event(void)
+{
+ if (!model->switch_ctrl)
+ return -ENOSYS; /* not implemented */
+ if (nmi_multiplex_on() < 0)
+ return -EINVAL; /* not necessary */
+
+ on_each_cpu(nmi_cpu_switch, NULL, 1);
+
+ atomic_inc(&multiplex_counter);
+
+ return 0;
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
@@ -325,29 +395,6 @@ static void nmi_shutdown(void)
put_cpu_var(cpu_msrs);
}

-static void nmi_cpu_start(void *dummy)
-{
- struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
- model->start(msrs);
-}
-
-static int nmi_start(void)
-{
- on_each_cpu(nmi_cpu_start, NULL, 1);
- return 0;
-}
-
-static void nmi_cpu_stop(void *dummy)
-{
- struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
- model->stop(msrs);
-}
-
-static void nmi_stop(void)
-{
- on_each_cpu(nmi_cpu_stop, NULL, 1);
-}
-
static int nmi_create_files(struct super_block *sb, struct dentry *root)
{
unsigned int i;
@@ -379,57 +426,6 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root)
return 0;
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void nmi_cpu_switch(void *dummy)
-{
- int cpu = smp_processor_id();
- int si = per_cpu(switch_index, cpu);
- struct op_msrs *msrs = &per_cpu(cpu_msrs, cpu);
-
- nmi_cpu_stop(NULL);
- nmi_cpu_save_mpx_registers(msrs);
-
- /* move to next set */
- si += model->num_counters;
- if ((si > model->num_virt_counters) || (counter_config[si].count == 0))
- per_cpu(switch_index, cpu) = 0;
- else
- per_cpu(switch_index, cpu) = si;
-
- model->switch_ctrl(model, msrs);
- nmi_cpu_restore_mpx_registers(msrs);
-
- nmi_cpu_start(NULL);
-}
-
-
-/*
- * Quick check to see if multiplexing is necessary.
- * The check should be sufficient since counters are used
- * in order.
- */
-static int nmi_multiplex_on(void)
-{
- return counter_config[model->num_counters].count ? 0 : -EINVAL;
-}
-
-static int nmi_switch_event(void)
-{
- if (!model->switch_ctrl)
- return -ENOSYS; /* not implemented */
- if (nmi_multiplex_on() < 0)
- return -EINVAL; /* not necessary */
-
- on_each_cpu(nmi_cpu_switch, NULL, 1);
-
- atomic_inc(&multiplex_counter);
-
- return 0;
-}
-
-#endif
-
#ifdef CONFIG_SMP
static int oprofile_cpu_notifier(struct notifier_block *b, unsigned long action,
void *data)
--
1.6.3.3
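
The index advance in nmi_cpu_switch() wraps either at the end of the
virtual counters or at the first unused event slot. A compilable
sketch of just that logic (made-up sizes; only events 0..3 are
enabled here):

#include <stdio.h>

#define NUM_COUNTERS      2
#define NUM_VIRT_COUNTERS 6

/* models counter_config[].count; a zero count means "not configured" */
static unsigned long count[NUM_VIRT_COUNTERS + 1] = { 1, 1, 1, 1, 0, 0, 0 };
static int si;

static void cpu_switch(void)
{
        si += NUM_COUNTERS;
        if (si > NUM_VIRT_COUNTERS || count[si] == 0)
                si = 0;              /* wrap to the first event set */
}

int main(void)
{
        int i;

        for (i = 0; i < 6; i++) {
                printf("switch_index = %d\n", si);
                cpu_switch();
        }
        return 0;
}

With four enabled events and two hardware counters this rotates
between the sets {0,1} and {2,3} and never enters the unused slots.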

Subject: [PATCH 23/26] oprofile: Adding switch counter to oprofile statistic variables

This patch moves the multiplexing switch counter from x86 code to
the common oprofile statistic variables. The value is now available
and usable for all architectures. The initialization and incrementing
are also moved to common code.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 7 -------
drivers/oprofile/oprof.c | 7 +++++--
drivers/oprofile/oprofile_stats.c | 9 ++-------
drivers/oprofile/oprofile_stats.h | 1 +
4 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index da6d2ab..7b3362f 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -34,11 +34,6 @@ static DEFINE_PER_CPU(unsigned long, saved_lvtpc);
/* 0 == registered but off, 1 == registered and on */
static int nmi_enabled = 0;

-
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-extern atomic_t multiplex_counter;
-#endif
-
struct op_counter_config counter_config[OP_MAX_COUNTER];

/* common functions */
@@ -253,8 +248,6 @@ static int nmi_switch_event(void)

on_each_cpu(nmi_cpu_switch, NULL, 1);

- atomic_inc(&multiplex_counter);
-
return 0;
}

diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index a48294a..dc8a042 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -107,8 +107,11 @@ static void stop_switch_worker(void)

static void switch_worker(struct work_struct *work)
{
- if (!oprofile_ops.switch_events())
- start_switch_worker();
+ if (oprofile_ops.switch_events())
+ return;
+
+ atomic_inc(&oprofile_stats.multiplex_counter);
+ start_switch_worker();
}

/* User inputs in ms, converts to jiffies */
diff --git a/drivers/oprofile/oprofile_stats.c b/drivers/oprofile/oprofile_stats.c
index 77a57a6..61689e8 100644
--- a/drivers/oprofile/oprofile_stats.c
+++ b/drivers/oprofile/oprofile_stats.c
@@ -16,9 +16,6 @@
#include "cpu_buffer.h"

struct oprofile_stat_struct oprofile_stats;
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-atomic_t multiplex_counter;
-#endif

void oprofile_reset_stats(void)
{
@@ -37,9 +34,7 @@ void oprofile_reset_stats(void)
atomic_set(&oprofile_stats.sample_lost_no_mapping, 0);
atomic_set(&oprofile_stats.event_lost_overflow, 0);
atomic_set(&oprofile_stats.bt_lost_no_mapping, 0);
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- atomic_set(&multiplex_counter, 0);
-#endif
+ atomic_set(&oprofile_stats.multiplex_counter, 0);
}


@@ -84,6 +79,6 @@ void oprofile_create_stats_files(struct super_block *sb, struct dentry *root)
&oprofile_stats.bt_lost_no_mapping);
#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
oprofilefs_create_ro_atomic(sb, dir, "multiplex_counter",
- &multiplex_counter);
+ &oprofile_stats.multiplex_counter);
#endif
}
diff --git a/drivers/oprofile/oprofile_stats.h b/drivers/oprofile/oprofile_stats.h
index 3da0d08..0b54e46 100644
--- a/drivers/oprofile/oprofile_stats.h
+++ b/drivers/oprofile/oprofile_stats.h
@@ -17,6 +17,7 @@ struct oprofile_stat_struct {
atomic_t sample_lost_no_mapping;
atomic_t bt_lost_no_mapping;
atomic_t event_lost_overflow;
+ atomic_t multiplex_counter;
};

extern struct oprofile_stat_struct oprofile_stats;
--
1.6.3.3
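
A hedged sketch of the moved counting logic (atomic_t is modeled with
a plain int; the kernel uses atomic_inc() and reschedules real
delayed work):

#include <stdio.h>

static int multiplex_counter;

/* models oprofile_ops.switch_events(); 0 means the switch happened */
static int switch_events(void) { return 0; }

static void switch_worker(void)
{
        if (switch_events())
                return;
        multiplex_counter++;   /* count only switches that succeeded */
        /* ... start_switch_worker() would reschedule here ... */
}

int main(void)
{
        int i;

        for (i = 0; i < 3; i++)
                switch_worker();
        printf("multiplex_counter = %d\n", multiplex_counter);
        return 0;
}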

Subject: [PATCH 14/26] x86/oprofile: Moving nmi_setup_cpu_mux() in nmi_int.c

This patch moves code in nmi_int.c to consolidate the multiplexing
code into a single, separate section.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 45 ++++++++++++++++++------------------------
1 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 674fa37..b1edfc9 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -130,11 +130,30 @@ static int nmi_setup_mux(void)
return 1;
}

+static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
+{
+ int i;
+ struct op_msr *multiplex = msrs->multiplex;
+
+ for (i = 0; i < model->num_virt_counters; ++i) {
+ if (counter_config[i].enabled) {
+ multiplex[i].saved = -(u64)counter_config[i].count;
+ } else {
+ multiplex[i].addr = 0;
+ multiplex[i].saved = 0;
+ }
+ }
+
+ per_cpu(switch_index, cpu) = 0;
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
static inline void nmi_shutdown_mux(void) { }
static inline int nmi_setup_mux(void) { return 1; }
+static inline void
+nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }

#endif

@@ -169,32 +188,6 @@ static int allocate_msrs(void)
return 1;
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
-{
- int i;
- struct op_msr *multiplex = msrs->multiplex;
-
- for (i = 0; i < model->num_virt_counters; ++i) {
- if (counter_config[i].enabled) {
- multiplex[i].saved = -(u64)counter_config[i].count;
- } else {
- multiplex[i].addr = 0;
- multiplex[i].saved = 0;
- }
- }
-
- per_cpu(switch_index, cpu) = 0;
-}
-
-#else
-
-static inline void
-nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }
-
-#endif
-
static void nmi_cpu_setup(void *dummy)
{
int cpu = smp_processor_id();
--
1.6.3.3

Subject: [PATCH 10/26] oprofile: Grouping multiplexing code in oprof.c

This patch moves the multiplexing code into a single section. This
reduces the use of #ifdefs, especially within functions.

Signed-off-by: Robert Richter <[email protected]>
---
drivers/oprofile/oprof.c | 100 ++++++++++++++++++++++-----------------------
1 files changed, 49 insertions(+), 51 deletions(-)

diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index fa6cccd..a48294a 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -29,13 +29,6 @@ unsigned long oprofile_backtrace_depth;
static unsigned long is_setup;
static DEFINE_MUTEX(start_mutex);

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void switch_worker(struct work_struct *work);
-static DECLARE_DELAYED_WORK(switch_work, switch_worker);
-
-#endif
-
/* timer
0 - use performance monitoring hardware if available
1 - use the timer int mechanism regardless
@@ -98,9 +91,18 @@ out:

#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX

+static void switch_worker(struct work_struct *work);
+static DECLARE_DELAYED_WORK(switch_work, switch_worker);
+
static void start_switch_worker(void)
{
- schedule_delayed_work(&switch_work, oprofile_time_slice);
+ if (oprofile_ops.switch_events)
+ schedule_delayed_work(&switch_work, oprofile_time_slice);
+}
+
+static void stop_switch_worker(void)
+{
+ cancel_delayed_work_sync(&switch_work);
}

static void switch_worker(struct work_struct *work)
@@ -109,6 +111,43 @@ static void switch_worker(struct work_struct *work)
start_switch_worker();
}

+/* User inputs in ms, converts to jiffies */
+int oprofile_set_timeout(unsigned long val_msec)
+{
+ int err = 0;
+ unsigned long time_slice;
+
+ mutex_lock(&start_mutex);
+
+ if (oprofile_started) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ if (!oprofile_ops.switch_events) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ time_slice = msecs_to_jiffies(val_msec);
+ if (time_slice == MAX_JIFFY_OFFSET) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oprofile_time_slice = time_slice;
+
+out:
+ mutex_unlock(&start_mutex);
+ return err;
+
+}
+
+#else
+
+static inline void start_switch_worker(void) { }
+static inline void stop_switch_worker(void) { }
+
#endif

/* Actually start profiling (echo 1>/dev/oprofile/enable) */
@@ -131,10 +170,7 @@ int oprofile_start(void)
if ((err = oprofile_ops.start()))
goto out;

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- if (oprofile_ops.switch_events)
- start_switch_worker();
-#endif
+ start_switch_worker();

oprofile_started = 1;
out:
@@ -152,9 +188,7 @@ void oprofile_stop(void)
oprofile_ops.stop();
oprofile_started = 0;

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- cancel_delayed_work_sync(&switch_work);
-#endif
+ stop_switch_worker();

/* wake up the daemon to read what remains */
wake_up_buffer_waiter();
@@ -188,42 +222,6 @@ post_sync:
mutex_unlock(&start_mutex);
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-/* User inputs in ms, converts to jiffies */
-int oprofile_set_timeout(unsigned long val_msec)
-{
- int err = 0;
- unsigned long time_slice;
-
- mutex_lock(&start_mutex);
-
- if (oprofile_started) {
- err = -EBUSY;
- goto out;
- }
-
- if (!oprofile_ops.switch_events) {
- err = -EINVAL;
- goto out;
- }
-
- time_slice = msecs_to_jiffies(val_msec);
- if (time_slice == MAX_JIFFY_OFFSET) {
- err = -EINVAL;
- goto out;
- }
-
- oprofile_time_slice = time_slice;
-
-out:
- mutex_unlock(&start_mutex);
- return err;
-
-}
-
-#endif
-
int oprofile_set_backtrace(unsigned long val)
{
int err = 0;
--
1.6.3.3

Subject: [PATCH 07/26] oprofile: oprofile_set_timeout(), return with error for invalid args

Return -EINVAL for invalid parameters instead of silently falling
back to the default value in oprofile_set_timeout().
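
A hypothetical userspace check of the new behavior (a sketch; it
assumes the /dev/oprofile/time_slice file from the cover letter, a
model with multiplexing support, and profiling not yet started): a
value that overflows msecs_to_jiffies() should now fail with EINVAL
instead of being silently replaced by the default.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* roughly 3170 years in ms, well beyond MAX_JIFFY_OFFSET */
	const char *msec = "99999999999999\n";
	int fd = open("/dev/oprofile/time_slice", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, msec, strlen(msec)) < 0)
		printf("rejected: %s\n", strerror(errno)); /* expect EINVAL */
	close(fd);
	return 0;
}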

Signed-off-by: Robert Richter <[email protected]>
---
drivers/oprofile/oprof.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 7bc64af..42c9c76 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -196,6 +196,7 @@ post_sync:
int oprofile_set_timeout(unsigned long val_msec)
{
int err = 0;
+ unsigned long time_slice;

mutex_lock(&start_mutex);

@@ -209,9 +210,13 @@ int oprofile_set_timeout(unsigned long val_msec)
goto out;
}

- timeout_jiffies = msecs_to_jiffies(val_msec);
- if (timeout_jiffies == MAX_JIFFY_OFFSET)
- timeout_jiffies = msecs_to_jiffies(MULTIPLEXING_TIMER_DEFAULT);
+ time_slice = msecs_to_jiffies(val_msec);
+ if (time_slice == MAX_JIFFY_OFFSET) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ timeout_jiffies = time_slice;

out:
mutex_unlock(&start_mutex);
--
1.6.3.3

Subject: [PATCH 20/26] x86/oprofile: Add function has_mux() to check multiplexing support

The check prevents multiplexing code from running on models that do
not support multiplexing. Before, the code ran but had no effect.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 19 ++++++++++++++++++-
1 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 82ee295..dca7240 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -124,6 +124,11 @@ static void nmi_stop(void)

static DEFINE_PER_CPU(int, switch_index);

+static inline int has_mux(void)
+{
+ return !!model->switch_ctrl;
+}
+
inline int op_x86_phys_to_virt(int phys)
{
return __get_cpu_var(switch_index) + phys;
@@ -132,6 +137,10 @@ inline int op_x86_phys_to_virt(int phys)
static void nmi_shutdown_mux(void)
{
int i;
+
+ if (!has_mux())
+ return;
+
for_each_possible_cpu(i) {
kfree(per_cpu(cpu_msrs, i).multiplex);
per_cpu(cpu_msrs, i).multiplex = NULL;
@@ -144,12 +153,17 @@ static int nmi_setup_mux(void)
size_t multiplex_size =
sizeof(struct op_msr) * model->num_virt_counters;
int i;
+
+ if (!has_mux())
+ return 1;
+
for_each_possible_cpu(i) {
per_cpu(cpu_msrs, i).multiplex =
kmalloc(multiplex_size, GFP_KERNEL);
if (!per_cpu(cpu_msrs, i).multiplex)
return 0;
}
+
return 1;
}

@@ -158,6 +172,9 @@ static void nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs)
int i;
struct op_msr *multiplex = msrs->multiplex;

+ if (!has_mux())
+ return;
+
for (i = 0; i < model->num_virt_counters; ++i) {
if (counter_config[i].enabled) {
multiplex[i].saved = -(u64)counter_config[i].count;
@@ -229,7 +246,7 @@ static int nmi_multiplex_on(void)

static int nmi_switch_event(void)
{
- if (!model->switch_ctrl)
+ if (!has_mux())
return -ENOSYS; /* not implemented */
if (nmi_multiplex_on() < 0)
return -EINVAL; /* not necessary */
--
1.6.3.3

Subject: [PATCH 25/26] x86/oprofile: Add counter reservation check for virtual counters

This patch adds a check for the availability of a counter. A virtual
counter is used only if its physical counter is not reserved.
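
The helper op_x86_virt_to_phys() is introduced in patch 24/26 (not
included here); presumably it is the modulo counterpart of
op_x86_phys_to_virt() from patch 20/26, roughly:

/*
 * Sketch (an assumption, mirroring the i % NUM_COUNTERS mapping used
 * in op_mux_fill_in_addresses()): virtual counters wrap around the
 * physical ones.
 */
static inline int op_x86_virt_to_phys(int virt)
{
	return virt % model->num_counters;
}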

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 5856e61..cb88b1a 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -435,15 +435,13 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root)
struct dentry *dir;
char buf[4];

-#ifndef CONFIG_OPROFILE_EVENT_MULTIPLEX
/* quick little hack to _not_ expose a counter if it is not
* available for use. This should protect userspace app.
* NOTE: assumes 1:1 mapping here (that counters are organized
* sequentially in their struct assignment).
*/
- if (unlikely(!avail_to_resrv_perfctr_nmi_bit(i)))
+ if (!avail_to_resrv_perfctr_nmi_bit(op_x86_virt_to_phys(i)))
continue;
-#endif /* CONFIG_OPROFILE_EVENT_MULTIPLEX */

snprintf(buf, sizeof(buf), "%d", i);
dir = oprofilefs_mkdir(sb, root, buf);
--
1.6.3.3

Subject: [PATCH 22/26] x86/oprofile: Implement mux_clone()

To set up a counter for all CPUs, its structure is cloned from CPU
0. This patch implements mux_clone() to do this for the multiplexing
data.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/nmi_int.c | 37 +++++++++++++++++++++++--------------
1 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index f0fb447..da6d2ab 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -264,6 +264,16 @@ static inline void mux_init(struct oprofile_operations *ops)
ops->switch_events = nmi_switch_event;
}

+static void mux_clone(int cpu)
+{
+ if (!has_mux())
+ return;
+
+ memcpy(per_cpu(cpu_msrs, cpu).multiplex,
+ per_cpu(cpu_msrs, 0).multiplex,
+ sizeof(struct op_msr) * model->num_virt_counters);
+}
+
#else

inline int op_x86_phys_to_virt(int phys) { return phys; }
@@ -272,6 +282,7 @@ static inline int nmi_setup_mux(void) { return 1; }
static inline void
nmi_cpu_setup_mux(int cpu, struct op_msrs const * const msrs) { }
static inline void mux_init(struct oprofile_operations *ops) { }
+static void mux_clone(int cpu) { }

#endif

@@ -350,20 +361,18 @@ static int nmi_setup(void)
/* Assume saved/restored counters are the same on all CPUs */
model->fill_in_addresses(&per_cpu(cpu_msrs, 0));
for_each_possible_cpu(cpu) {
- if (cpu != 0) {
- memcpy(per_cpu(cpu_msrs, cpu).counters,
- per_cpu(cpu_msrs, 0).counters,
- sizeof(struct op_msr) * model->num_counters);
-
- memcpy(per_cpu(cpu_msrs, cpu).controls,
- per_cpu(cpu_msrs, 0).controls,
- sizeof(struct op_msr) * model->num_controls);
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- memcpy(per_cpu(cpu_msrs, cpu).multiplex,
- per_cpu(cpu_msrs, 0).multiplex,
- sizeof(struct op_msr) * model->num_virt_counters);
-#endif
- }
+ if (!cpu)
+ continue;
+
+ memcpy(per_cpu(cpu_msrs, cpu).counters,
+ per_cpu(cpu_msrs, 0).counters,
+ sizeof(struct op_msr) * model->num_counters);
+
+ memcpy(per_cpu(cpu_msrs, cpu).controls,
+ per_cpu(cpu_msrs, 0).controls,
+ sizeof(struct op_msr) * model->num_controls);
+
+ mux_clone(cpu);
}
on_each_cpu(nmi_cpu_setup, NULL, 1);
nmi_enabled = 1;
--
1.6.3.3

Subject: [PATCH 12/26] oprofile: Grouping multiplexing code in op_model_amd.c

This patch moves some multiplexing code into the new function
op_mux_fill_in_addresses(). With this, all of the multiplexing code
now sits in a single location.
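
As a concrete illustration of the hw_counter = i % NUM_COUNTERS
mapping (a standalone sketch; the counter totals of 4 and 32 are
assumed from the AMD model elsewhere in the series):

#include <stdio.h>

#define NUM_COUNTERS		4	/* assumed: AMD hardware counters */
#define NUM_VIRT_COUNTERS	32	/* assumed: virtual counters */

int main(void)
{
	int i;

	/* virtual counters 0, 4, 8, ... all land on physical counter 0,
	 * i.e. MSR_K7_PERFCTR0; 1, 5, 9, ... on MSR_K7_PERFCTR1; etc. */
	for (i = 0; i < NUM_VIRT_COUNTERS; i++)
		printf("virt %2d -> phys %d\n", i, i % NUM_COUNTERS);
	return 0;
}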

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/oprofile/op_model_amd.c | 75 +++++++++++++++++++++-----------------
1 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c
index 67f830d..644980f 100644
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -74,6 +74,45 @@ static struct op_ibs_config ibs_config;

#endif

+#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
+
+static void op_mux_fill_in_addresses(struct op_msrs * const msrs)
+{
+ int i;
+
+ for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
+ int hw_counter = i % NUM_COUNTERS;
+ if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
+ msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
+ else
+ msrs->multiplex[i].addr = 0;
+ }
+}
+
+static void op_mux_switch_ctrl(struct op_x86_model_spec const *model,
+ struct op_msrs const * const msrs)
+{
+ u64 val;
+ int i;
+
+ /* enable active counters */
+ for (i = 0; i < NUM_COUNTERS; ++i) {
+ int virt = op_x86_phys_to_virt(i);
+ if (!counter_config[virt].enabled)
+ continue;
+ rdmsrl(msrs->controls[i].addr, val);
+ val &= model->reserved;
+ val |= op_x86_get_ctrl(model, &counter_config[virt]);
+ wrmsrl(msrs->controls[i].addr, val);
+ }
+}
+
+#else
+
+static inline void op_mux_fill_in_addresses(struct op_msrs * const msrs) { }
+
+#endif
+
/* functions for op_amd_spec */

static void op_amd_fill_in_addresses(struct op_msrs * const msrs)
@@ -94,15 +133,7 @@ static void op_amd_fill_in_addresses(struct op_msrs * const msrs)
msrs->controls[i].addr = 0;
}

-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- for (i = 0; i < NUM_VIRT_COUNTERS; i++) {
- int hw_counter = i % NUM_COUNTERS;
- if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
- msrs->multiplex[i].addr = MSR_K7_PERFCTR0 + hw_counter;
- else
- msrs->multiplex[i].addr = 0;
- }
-#endif
+ op_mux_fill_in_addresses(msrs);
}

static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
@@ -155,30 +186,6 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
}
}

-
-#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
-
-static void op_amd_switch_ctrl(struct op_x86_model_spec const *model,
- struct op_msrs const * const msrs)
-{
- u64 val;
- int i;
-
- /* enable active counters */
- for (i = 0; i < NUM_COUNTERS; ++i) {
- int virt = op_x86_phys_to_virt(i);
- if (!counter_config[virt].enabled)
- continue;
- rdmsrl(msrs->controls[i].addr, val);
- val &= model->reserved;
- val |= op_x86_get_ctrl(model, &counter_config[virt]);
- wrmsrl(msrs->controls[i].addr, val);
- }
-}
-
-#endif
-
-
#ifdef CONFIG_OPROFILE_IBS

static inline int
@@ -535,6 +542,6 @@ struct op_x86_model_spec const op_amd_spec = {
.stop = &op_amd_stop,
.shutdown = &op_amd_shutdown,
#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
- .switch_ctrl = &op_amd_switch_ctrl,
+ .switch_ctrl = &op_mux_switch_ctrl,
#endif
};
--
1.6.3.3

2009-07-28 18:45:10

by Andi Kleen

Subject: Re: [PATCH 0/26] oprofile: Performance counter multiplexing

Robert Richter <[email protected]> writes:
> If unsure, say N.
>
> +config OPROFILE_EVENT_MULTIPLEX
> + bool "OProfile multiplexing support (EXPERIMENTAL)"
> + default n
> + depends on OPROFILE && X86
> + help
> + The number of hardware counters is limited. The multiplexing
> + feature enables OProfile to gather more events than counters
> + are provided by the hardware. This is realized by switching
> + between events at an user specified time interval.
> +

I would suggest dropping that CONFIG and making the code
unconditional. It's not a lot of code and should always be available.

I will do a more detailed review later and will also look at Intel P6
support.

Where's the userland support for this?

-Andi

--
[email protected] -- Speaking for myself only.

2009-07-28 22:18:41

by Suthikulpanit, Suravee

Subject: RE: [PATCH 0/26] oprofile: Performance counter multiplexing

Hi,

The user-space support is currently in the AMD CodeAnalyst utility. I
am working on more generic (non-AMD-specific) user-space support and
will send the patches to the OProfile list for review.

Suravee


2010-02-26 14:51:12

by Andi Kleen

Subject: Re: [PATCH 0/26] oprofile: Performance counter multiplexing

Ingo Molnar <[email protected]> writes:
>
> As you note it below, in terms of development it's quite a
> distraction to have active development in both facilities, when
> oprofile is arguably on the to-be-obsoleted side of the equation.

I don't see how it is at all distracting (for whom, anyway?) to
develop both. It's just like two ethernet/SCSI/... drivers being
developed in parallel by different people. Happens all the time. The
Linux development model can handle it.

That said, the biggest problem with oprofile right now is that the
new buffer it's using is quite a lot less reliable and drops events
left and right on any non-trivial load. That makes oprofile very
unreliable, especially in call-graph mode.

So if there were a "legacy oprofile", it would be better to have one
that does not use the ring buffer but the old buffer, which at least
worked. This would unfortunately drop IBS support, but perhaps that
could be done in perf only (it depends on how big the IBS user base
for the current oprofile is).

> And yes, AFAIK oprofile user-space is pretty much the only
> user-space app that relies on the oprofile ABI - at least in the OSS
> space. Robert, is there perhaps some bin-only oprofile based tool
> that you implied before? Which one is it?

There are multiple, e.g. Zoom or the old HP tool whose name escapes
me now. I also think there are a couple of minor forks around in
various distro trees.

-Andi

--
[email protected] -- Speaking for myself only.