2014-02-11 15:20:32

by Stephane Eranian

Subject: [PATCH v2 0/8] perf/x86/uncore: add support for SNB/IVB/HSW integrated memory controller PMU

This patch series adds support for the PCI-based integrated memory
controller (IMC) PMU on SandyBridge, IvyBridge, and Haswell client
(desktop/mobile) processors. This PMU provides a few free-running
32-bit counters which can be used to determine memory bandwidth
utilization.

The code is based on the documentation at:
http://software.intel.com/en-us/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel

The patches implement a new uncore PMU called uncore_imc.
It exports its format and events in sysfs as usual.

The following events are currently defined:
- name: uncore_imc/data_reads/
- code: 0x1
- unit: 64 bytes
- number of full cacheline (64 bytes) read requests to the IMC

- name: uncore_imc/data_writes/
- code: 0x2
- unit: 64 bytes
- number of full cacheline (64 bytes) write requests to the IMC

The unit and scale of each event are also exposed in sysfs and
are picked up by perf stat (needs v3.13).
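
Concretely, the event descriptors from patch 6/8 show up under the
standard perf event_source sysfs layout, roughly as follows (the
paths are assumed from that layout; the contents are the descriptor
strings from the patch):

  /sys/bus/event_source/devices/uncore_imc/events/data_reads        event=0x01
  /sys/bus/event_source/devices/uncore_imc/events/data_reads.scale  64
  /sys/bus/event_source/devices/uncore_imc/events/data_reads.unit   bytes
  /sys/bus/event_source/devices/uncore_imc/events/data_writes       event=0x02
  /sys/bus/event_source/devices/uncore_imc/events/data_writes.scale 64
  /sys/bus/event_source/devices/uncore_imc/events/data_writes.unit  bytes

With the optional patch 8/8 applied, the scale becomes 6.103515625e-5
and the unit MiB.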

The uncore_imc PMU is, by construction, system-wide and counting
mode only, with no privilege-level filtering; the kernel enforces
those restrictions. The counters are 32-bit and do not generate
overflow interrupts, so the kernel uses a hrtimer to poll the
counters often enough to avoid missing an overflow.
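
To illustrate the polling scheme, here is a minimal user-space sketch
(not the kernel code) of how a 32-bit free-running counter is widened
into a 64-bit count without losing overflows: re-read the raw value
more often than the counter can wrap, and accumulate the 32-bit
delta, which wraps correctly in unsigned arithmetic:

#include <stdint.h>

static uint64_t sw_count;  /* 64-bit software accumulator */
static uint32_t prev_raw;  /* last raw 32-bit hardware value */

/* called periodically, more often than the counter can wrap */
static void poll_counter(uint32_t raw_now)
{
	uint32_t delta = raw_now - prev_raw; /* modulo-2^32 difference */

	prev_raw = raw_now;
	sw_count += delta;
}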

The series includes an optional patch to change the unit of the
events to mebibytes (MiB) in perf stat. The kernel still exports
counts in 64-byte increments (the raw value of the counter).

To use the PMU with perf stat:
# perf stat -a -e uncore_imc/data_reads/,uncore_imc/data_writes/ -I 1000 sleep 100
# time counts unit events
1.000169151 180.62 MiB uncore_imc/data_reads/
1.000169151 0.14 MiB uncore_imc/data_writes/
2.000506913 180.37 MiB uncore_imc/data_reads/
2.000506913 0.02 MiB uncore_imc/data_writes/
3.000748105 180.32 MiB uncore_imc/data_reads/
3.000748105 0.02 MiB uncore_imc/data_writes/
4.000991441 180.30 MiB uncore_imc/data_reads/

In v2, we dropped a couple of patches because they were not needed,
and rebased onto 3.14-rc2 from tip.git.

Signed-off-by: Stephane Eranian <[email protected]>


Stephane Eranian (8):
perf/x86/uncore: fix initialization of cpumask
perf/x86/uncore: add ability to customize pmu callbacks
perf/x86/uncore: add PCI ids for SNB/IVB/HSW IMC
perf/x86/uncore: make hrtimer timeout configurable per box
perf/x86/uncore: move uncore_event_to_box() and uncore_pmu_to_box()
perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller
support
perf/x86/uncore: add hrtimer to SNB uncore IMC PMU
perf/x86/uncore: use MiB unit for events for SNB/IVB/HSW IMC

arch/x86/kernel/cpu/perf_event_intel_uncore.c | 540 +++++++++++++++++++++----
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 5 +
include/linux/pci_ids.h | 3 +
3 files changed, 477 insertions(+), 71 deletions(-)

--
1.7.9.5


2014-02-11 15:20:38

by Stephane Eranian

Subject: [PATCH v2 1/8] perf/x86/uncore: fix initialization of cpumask

On certain processors, the uncore PMU boxes may be only
MSR-based or only PCI-based. In both cases, however, the cpumask
indicating which CPUs to monitor to get full coverage of the
particular PMU must be created.

With the current code base, the cpumask was only created on
processors which had at least one MSR-based uncore PMU. This
patch removes that restriction and ensures the cpumask is
created even when there is no MSR-based PMU, for instance on
SNB client, where only a PCI-based memory controller PMU is
supported.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 61 +++++++++++++++----------
1 file changed, 37 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 29c2487..fe4255b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3764,7 +3764,7 @@ static void __init uncore_cpu_setup(void *dummy)

static int __init uncore_cpu_init(void)
{
- int ret, cpu, max_cores;
+ int ret, max_cores;

max_cores = boot_cpu_data.x86_max_cores;
switch (boot_cpu_data.x86_model) {
@@ -3808,29 +3808,6 @@ static int __init uncore_cpu_init(void)
if (ret)
return ret;

- get_online_cpus();
-
- for_each_online_cpu(cpu) {
- int i, phys_id = topology_physical_package_id(cpu);
-
- for_each_cpu(i, &uncore_cpu_mask) {
- if (phys_id == topology_physical_package_id(i)) {
- phys_id = -1;
- break;
- }
- }
- if (phys_id < 0)
- continue;
-
- uncore_cpu_prepare(cpu, phys_id);
- uncore_event_init_cpu(cpu);
- }
- on_each_cpu(uncore_cpu_setup, NULL, 1);
-
- register_cpu_notifier(&uncore_cpu_nb);
-
- put_online_cpus();
-
return 0;
}

@@ -3859,6 +3836,41 @@ static int __init uncore_pmus_register(void)
return 0;
}

+static void uncore_cpumask_init(void)
+{
+ int cpu;
+
+ /*
+ * only invoke once from msr or pci init code
+ */
+ if (!cpumask_empty(&uncore_cpu_mask))
+ return;
+
+ get_online_cpus();
+
+ for_each_online_cpu(cpu) {
+ int i, phys_id = topology_physical_package_id(cpu);
+
+ for_each_cpu(i, &uncore_cpu_mask) {
+ if (phys_id == topology_physical_package_id(i)) {
+ phys_id = -1;
+ break;
+ }
+ }
+ if (phys_id < 0)
+ continue;
+
+ uncore_cpu_prepare(cpu, phys_id);
+ uncore_event_init_cpu(cpu);
+ }
+ on_each_cpu(uncore_cpu_setup, NULL, 1);
+
+ register_cpu_notifier(&uncore_cpu_nb);
+
+ put_online_cpus();
+}
+
+
static int __init intel_uncore_init(void)
{
int ret;
@@ -3877,6 +3889,7 @@ static int __init intel_uncore_init(void)
uncore_pci_exit();
goto fail;
}
+ uncore_cpumask_init();

uncore_pmus_register();
return 0;
--
1.7.9.5

2014-02-11 15:20:49

by Stephane Eranian

Subject: [PATCH v2 5/8] perf/x86/uncore: move uncore_event_to_box() and uncore_pmu_to_box()

Move a couple of functions around to avoid forward declarations
when we add code later on.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 72 ++++++++++++-------------
1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index ea823b8..acbbdde 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -66,6 +66,42 @@ DEFINE_UNCORE_FORMAT_ATTR(mask_vnw, mask_vnw, "config2:3-4");
DEFINE_UNCORE_FORMAT_ATTR(mask0, mask0, "config2:0-31");
DEFINE_UNCORE_FORMAT_ATTR(mask1, mask1, "config2:32-63");

+static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
+{
+ return container_of(event->pmu, struct intel_uncore_pmu, pmu);
+}
+
+static struct intel_uncore_box *
+uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
+{
+ struct intel_uncore_box *box;
+
+ box = *per_cpu_ptr(pmu->box, cpu);
+ if (box)
+ return box;
+
+ raw_spin_lock(&uncore_box_lock);
+ list_for_each_entry(box, &pmu->box_list, list) {
+ if (box->phys_id == topology_physical_package_id(cpu)) {
+ atomic_inc(&box->refcnt);
+ *per_cpu_ptr(pmu->box, cpu) = box;
+ break;
+ }
+ }
+ raw_spin_unlock(&uncore_box_lock);
+
+ return *per_cpu_ptr(pmu->box, cpu);
+}
+
+static struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
+{
+ /*
+ * perf core schedules event on the basis of cpu, uncore events are
+ * collected by one of the cpus inside a physical package.
+ */
+ return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
+}
+
static u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
{
u64 count;
@@ -2845,42 +2881,6 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
return box;
}

-static struct intel_uncore_box *
-uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
-{
- struct intel_uncore_box *box;
-
- box = *per_cpu_ptr(pmu->box, cpu);
- if (box)
- return box;
-
- raw_spin_lock(&uncore_box_lock);
- list_for_each_entry(box, &pmu->box_list, list) {
- if (box->phys_id == topology_physical_package_id(cpu)) {
- atomic_inc(&box->refcnt);
- *per_cpu_ptr(pmu->box, cpu) = box;
- break;
- }
- }
- raw_spin_unlock(&uncore_box_lock);
-
- return *per_cpu_ptr(pmu->box, cpu);
-}
-
-static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
-{
- return container_of(event->pmu, struct intel_uncore_pmu, pmu);
-}
-
-static struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
-{
- /*
- * perf core schedules event on the basis of cpu, uncore events are
- * collected by one of the cpus inside a physical package.
- */
- return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
-}
-
static int
uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader, bool dogrp)
{
--
1.7.9.5

2014-02-11 15:20:54

by Stephane Eranian

Subject: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

This patch adds a new uncore PMU for Intel SNB/IVB/HSW client
CPUs: the Integrated Memory Controller (IMC) PMU. This new PMU
provides a set of events to measure memory bandwidth utilization.

The IMC on those processors is PCI-space based. This patch
exposes a new uncore PMU on those processors: uncore_imc.

Two new events are defined:
- name: data_reads
- code: 0x1
- unit: 64 bytes
- number of full cacheline read requests to the IMC

- name: data_writes
- code: 0x2
- unit: 64 bytes
- number of full cacheline write requests to the IMC

Documentation available at:
http://software.intel.com/en-us/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 365 +++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 366 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index acbbdde..2c7693a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -66,6 +66,11 @@ DEFINE_UNCORE_FORMAT_ATTR(mask_vnw, mask_vnw, "config2:3-4");
DEFINE_UNCORE_FORMAT_ATTR(mask0, mask0, "config2:0-31");
DEFINE_UNCORE_FORMAT_ATTR(mask1, mask1, "config2:32-63");

+static void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
+static void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
+static void uncore_perf_event_update(struct intel_uncore_box *box, struct perf_event *event);
+static void uncore_pmu_event_read(struct perf_event *event);
+
static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
{
return container_of(event->pmu, struct intel_uncore_pmu, pmu);
@@ -1667,6 +1672,344 @@ static struct intel_uncore_type *snb_msr_uncores[] = {
&snb_uncore_cbox,
NULL,
};
+
+enum {
+ SNB_PCI_UNCORE_IMC,
+};
+
+static struct uncore_event_desc snb_uncore_imc_events[] = {
+ INTEL_UNCORE_EVENT_DESC(data_reads, "event=0x01"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.scale, "64"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.unit, "bytes"),
+
+ INTEL_UNCORE_EVENT_DESC(data_writes, "event=0x02"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.scale, "64"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.unit, "bytes"),
+
+ { /* end: all zeroes */ },
+};
+
+#define SNB_UNCORE_PCI_IMC_EVENT_MASK 0xff
+#define SNB_UNCORE_PCI_IMC_BAR_OFFSET 0x48
+
+/* page size multiple covering all config regs */
+#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x6000
+
+#define SNB_UNCORE_PCI_IMC_DATA_READS 0x1
+#define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
+#define SNB_UNCORE_PCI_IMC_DATA_WRITES 0x2
+#define SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE 0x5054
+#define SNB_UNCORE_PCI_IMC_CTR_BASE SNB_UNCORE_PCI_IMC_DATA_READS_BASE
+
+static struct attribute *snb_uncore_imc_formats_attr[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static struct attribute_group snb_uncore_imc_format_group = {
+ .name = "format",
+ .attrs = snb_uncore_imc_formats_attr,
+};
+
+static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ u32 addr_lo, addr_hi;
+ resource_size_t addr;
+
+ pci_read_config_dword(pdev, SNB_UNCORE_PCI_IMC_BAR_OFFSET, &addr_lo);
+ addr = addr_lo;
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+ pci_read_config_dword(pdev, SNB_UNCORE_PCI_IMC_BAR_OFFSET+4, &addr_hi);
+ addr = ((resource_size_t)addr_hi << 32) | addr_lo;
+#endif
+
+ addr &= ~(PAGE_SIZE - 1);
+
+ box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+}
+
+static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
+{}
+
+static void snb_uncore_imc_disable_box(struct intel_uncore_box *box)
+{}
+
+static void snb_uncore_imc_enable_event(struct intel_uncore_box *box, struct perf_event *event)
+{}
+
+static void snb_uncore_imc_disable_event(struct intel_uncore_box *box, struct perf_event *event)
+{}
+
+static u64 snb_uncore_imc_read_counter(struct intel_uncore_box *box, struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ return (u64)*(unsigned int *)(box->io_addr + hwc->event_base);
+}
+
+/*
+ * custom event_init() function because we define our own fixed, free
+ * running counters, so we do not want to conflict with generic uncore
+ * logic. Also simplifies processing
+ */
+static int snb_uncore_imc_event_init(struct perf_event *event)
+{
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ struct hw_perf_event *hwc = &event->hw;
+ u64 cfg = event->attr.config & SNB_UNCORE_PCI_IMC_EVENT_MASK;
+ int idx, base;
+
+ if (event->attr.type != event->pmu->type)
+ return -ENOENT;
+
+ pmu = uncore_event_to_pmu(event);
+ /* no device found for this pmu */
+ if (pmu->func_id < 0)
+ return -ENOENT;
+
+ /* Sampling not supported yet */
+ if (hwc->sample_period)
+ return -EINVAL;
+
+ /* unsupported modes and filters */
+ if (event->attr.exclude_user ||
+ event->attr.exclude_kernel ||
+ event->attr.exclude_hv ||
+ event->attr.exclude_idle ||
+ event->attr.exclude_host ||
+ event->attr.exclude_guest ||
+ event->attr.sample_period) /* no sampling */
+ return -EINVAL;
+
+ /*
+ * Place all uncore events for a particular physical package
+ * onto a single cpu
+ */
+ if (event->cpu < 0)
+ return -EINVAL;
+
+ /* check only supported bits are set */
+ if (event->attr.config & ~SNB_UNCORE_PCI_IMC_EVENT_MASK)
+ return -EINVAL;
+
+ box = uncore_pmu_to_box(pmu, event->cpu);
+ if (!box || box->cpu < 0)
+ return -EINVAL;
+
+ event->cpu = box->cpu;
+
+ event->hw.idx = -1;
+ event->hw.last_tag = ~0ULL;
+ event->hw.extra_reg.idx = EXTRA_REG_NONE;
+ event->hw.branch_reg.idx = EXTRA_REG_NONE;
+ /*
+ * check event is known (whitelist, determines counter)
+ */
+ switch (cfg) {
+ case SNB_UNCORE_PCI_IMC_DATA_READS:
+ base = SNB_UNCORE_PCI_IMC_DATA_READS_BASE;
+ idx = UNCORE_PMC_IDX_FIXED;
+ break;
+ case SNB_UNCORE_PCI_IMC_DATA_WRITES:
+ base = SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE;
+ idx = UNCORE_PMC_IDX_FIXED + 1;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* must be done before validate_group */
+ event->hw.event_base = base;
+ event->hw.config = cfg;
+ event->hw.idx = idx;
+
+ /* no group validation needed, we have free running counters */
+
+ return 0;
+}
+
+static int snb_uncore_imc_hw_config(struct intel_uncore_box *box, struct perf_event *event)
+{
+ return 0;
+}
+
+static void snb_uncore_imc_event_start(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ u64 count;
+
+ if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+ return;
+
+ event->hw.state = 0;
+ box->n_active++;
+
+ list_add_tail(&event->active_entry, &box->active_list);
+
+ count = snb_uncore_imc_read_counter(box, event);
+ local64_set(&event->hw.prev_count, count);
+
+ if (box->n_active == 1)
+ uncore_pmu_start_hrtimer(box);
+}
+
+static void snb_uncore_imc_event_stop(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (!(hwc->state & PERF_HES_STOPPED)) {
+ box->n_active--;
+
+ WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+ hwc->state |= PERF_HES_STOPPED;
+
+ list_del(&event->active_entry);
+
+ if (box->n_active == 0)
+ uncore_pmu_cancel_hrtimer(box);
+ }
+
+ if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+ /*
+ * Drain the remaining delta count out of an event
+ * that we are disabling:
+ */
+ uncore_perf_event_update(box, event);
+ hwc->state |= PERF_HES_UPTODATE;
+ }
+}
+
+static int snb_uncore_imc_event_add(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (!box)
+ return -ENODEV;
+
+ hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+ if (!(flags & PERF_EF_START))
+ hwc->state |= PERF_HES_ARCH;
+
+ snb_uncore_imc_event_start(event, 0);
+
+ box->n_events++;
+
+ return 0;
+}
+
+static void snb_uncore_imc_event_del(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ int i;
+
+ snb_uncore_imc_event_stop(event, PERF_EF_UPDATE);
+
+ for (i = 0; i < box->n_events; i++) {
+ if (event == box->event_list[i]) {
+ --box->n_events;
+ break;
+ }
+ }
+}
+
+static int snb_pci2phy_map_init(int devid)
+{
+ struct pci_dev *dev = NULL;
+ int bus;
+
+ dev = pci_get_device(PCI_VENDOR_ID_INTEL, devid, dev);
+ if (!dev)
+ return -ENOTTY;
+
+ bus = dev->bus->number;
+
+ pcibus_to_physid[bus] = 0;
+
+ pci_dev_put(dev);
+
+ return 0;
+}
+
+static struct pmu snb_uncore_imc_pmu = {
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = snb_uncore_imc_event_init,
+ .add = snb_uncore_imc_event_add,
+ .del = snb_uncore_imc_event_del,
+ .start = snb_uncore_imc_event_start,
+ .stop = snb_uncore_imc_event_stop,
+ .read = uncore_pmu_event_read,
+};
+
+static struct intel_uncore_ops snb_uncore_imc_ops = {
+ .init_box = snb_uncore_imc_init_box,
+ .enable_box = snb_uncore_imc_enable_box,
+ .disable_box = snb_uncore_imc_disable_box,
+ .disable_event = snb_uncore_imc_disable_event,
+ .enable_event = snb_uncore_imc_enable_event,
+ .hw_config = snb_uncore_imc_hw_config,
+ .read_counter = snb_uncore_imc_read_counter,
+};
+
+static struct intel_uncore_type snb_uncore_imc = {
+ .name = "imc",
+ .num_counters = 2,
+ .num_boxes = 1,
+ .fixed_ctr_bits = 32,
+ .fixed_ctr = SNB_UNCORE_PCI_IMC_CTR_BASE,
+ .event_descs = snb_uncore_imc_events,
+ .format_group = &snb_uncore_imc_format_group,
+ .perf_ctr = SNB_UNCORE_PCI_IMC_DATA_READS_BASE,
+ .event_mask = SNB_UNCORE_PCI_IMC_EVENT_MASK,
+ .ops = &snb_uncore_imc_ops,
+ .pmu = &snb_uncore_imc_pmu,
+};
+
+static struct intel_uncore_type *snb_pci_uncores[] = {
+ [SNB_PCI_UNCORE_IMC] = &snb_uncore_imc,
+ NULL,
+};
+
+static DEFINE_PCI_DEVICE_TABLE(snb_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SNB_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static DEFINE_PCI_DEVICE_TABLE(ivb_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static DEFINE_PCI_DEVICE_TABLE(hsw_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_HSW_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static struct pci_driver snb_uncore_pci_driver = {
+ .name = "snb_uncore",
+ .id_table = snb_uncore_pci_ids,
+};
+
+static struct pci_driver ivb_uncore_pci_driver = {
+ .name = "ivb_uncore",
+ .id_table = ivb_uncore_pci_ids,
+};
+
+static struct pci_driver hsw_uncore_pci_driver = {
+ .name = "hsw_uncore",
+ .id_table = hsw_uncore_pci_ids,
+};
+
/* end of Sandy Bridge uncore support */

/* Nehalem uncore support */
@@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
pci_uncores = ivt_pci_uncores;
uncore_pci_driver = &ivt_uncore_pci_driver;
break;
+ case 42: /* Sandy Bridge */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &snb_uncore_pci_driver;
+ break;
+ case 60: /* Haswell */
+ case 69: /* Haswell Celeron */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &hsw_uncore_pci_driver;
+ break;
+ case 58: /* Ivy Bridge */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &ivb_uncore_pci_driver;
+ break;
default:
return 0;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 7efd298..fbf45a0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -492,6 +492,7 @@ struct intel_uncore_box {
u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
+ void *io_addr;
struct intel_uncore_extra_reg shared_regs[0];
};

--
1.7.9.5

2014-02-11 15:20:44

by Stephane Eranian

Subject: [PATCH v2 3/8] perf/x86/uncore: add PCI ids for SNB/IVB/HSW IMC

This patch adds the PCI IDs for the Intel SandyBridge,
IvyBridge, and Haswell client memory controllers (IMC).

Signed-off-by: Stephane Eranian <[email protected]>
---
include/linux/pci_ids.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 97fbecd..7399e6a 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2531,6 +2531,9 @@

#define PCI_VENDOR_ID_INTEL 0x8086
#define PCI_DEVICE_ID_INTEL_EESSC 0x0008
+#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100
+#define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
+#define PCI_DEVICE_ID_INTEL_HSW_IMC 0x0c00
#define PCI_DEVICE_ID_INTEL_PXHD_0 0x0320
#define PCI_DEVICE_ID_INTEL_PXHD_1 0x0321
#define PCI_DEVICE_ID_INTEL_PXH_0 0x0329
--
1.7.9.5

2014-02-11 15:20:59

by Stephane Eranian

Subject: [PATCH v2 8/8] perf/x86/uncore: use MiB unit for events for SNB/IVB/HSW IMC

This patch makes perf display the counts of uncore_imc/data_reads/
and uncore_imc/data_writes/ in mebibytes.

1 MiB = 1024*1024 bytes.
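
For reference, the new scale follows directly from the counter unit:
each raw count is one 64-byte cacheline, and 64 / (1024*1024) =
6.103515625e-5 MiB per count, which is the value used below.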

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 20a8061..fbaf19a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1679,12 +1679,12 @@ enum {

static struct uncore_event_desc snb_uncore_imc_events[] = {
INTEL_UNCORE_EVENT_DESC(data_reads, "event=0x01"),
- INTEL_UNCORE_EVENT_DESC(data_reads.scale, "64"),
- INTEL_UNCORE_EVENT_DESC(data_reads.unit, "bytes"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.scale, "6.103515625e-5"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.unit, "MiB"),

INTEL_UNCORE_EVENT_DESC(data_writes, "event=0x02"),
- INTEL_UNCORE_EVENT_DESC(data_writes.scale, "64"),
- INTEL_UNCORE_EVENT_DESC(data_writes.unit, "bytes"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.scale, "6.103515625e-5"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.unit, "MiB"),

{ /* end: all zeroes */ },
};
--
1.7.9.5

2014-02-11 15:21:26

by Stephane Eranian

Subject: [PATCH v2 7/8] perf/x86/uncore: add hrtimer to SNB uncore IMC PMU

This patch is needed because the SNB IMC PMU uses 32-bit
free-running counters with no interrupt capability.

On SNB/IVB/HSW, we used the 20 GB/s theoretical peak bandwidth to
calculate the hrtimer timeout necessary to avoid missing an overflow.
That timeout is set to 5s to be on the cautious side.
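
As a rough sanity check (assuming one increment per 64-byte
cacheline): a 32-bit counter wraps after 2^32 * 64 bytes, about
274 GB of traffic, which takes roughly 13.7s at the 20 GB/s peak,
so polling every 5s leaves a comfortable margin.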

The SNB IMC uses free-running counters, which are handled via pseudo
fixed counters. The SNB IMC PMU implementation supports an arbitrary
number of events, because the counters are read-only; it is therefore
not possible to track active counters. Instead, we put active events
on a linked list which the hrtimer handler then walks to update the
software counts.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 12 ++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 2 ++
2 files changed, 14 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 2c7693a..20a8061 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1728,6 +1728,7 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
addr &= ~(PAGE_SIZE - 1);

box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+ box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
}

static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
@@ -3160,6 +3161,7 @@ static void uncore_perf_event_update(struct intel_uncore_box *box, struct perf_e
static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
{
struct intel_uncore_box *box;
+ struct perf_event *event;
unsigned long flags;
int bit;

@@ -3172,6 +3174,14 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
*/
local_irq_save(flags);

+ /*
+ * handle boxes with an active event list as opposed to active
+ * counters
+ */
+ list_for_each_entry(event, &box->active_list, active_entry) {
+ uncore_perf_event_update(box, event);
+ }
+
for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX)
uncore_perf_event_update(box, box->events[bit]);

@@ -3221,6 +3231,8 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
/* set default hrtimer timeout */
box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;

+ INIT_LIST_HEAD(&box->active_list);
+
return box;
}

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index fbf45a0..90236f0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -6,6 +6,7 @@

#define UNCORE_PMU_NAME_LEN 32
#define UNCORE_PMU_HRTIMER_INTERVAL (60LL * NSEC_PER_SEC)
+#define UNCORE_SNB_IMC_HRTIMER_INTERVAL (5ULL * NSEC_PER_SEC)

#define UNCORE_FIXED_EVENT 0xff
#define UNCORE_PMC_IDX_MAX_GENERIC 8
@@ -492,6 +493,7 @@ struct intel_uncore_box {
u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
+ struct list_head active_list;
void *io_addr;
struct intel_uncore_extra_reg shared_regs[0];
};
--
1.7.9.5

2014-02-11 15:22:08

by Stephane Eranian

Subject: [PATCH v2 4/8] perf/x86/uncore: make hrtimer timeout configurable per box

This patch makes the hrtimer timeout configurable per PMU box. Not
all counters necessarily have the same width and rate, so the default
timeout of 60s may need to be adjusted.

This patch adds box->hrtimer_duration. It is set to the default when
the box is allocated and can be overridden when the box is
initialized.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 7 +++++--
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index e6f32b3..ea823b8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -2798,14 +2798,14 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)

local_irq_restore(flags);

- hrtimer_forward_now(hrtimer, ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL));
+ hrtimer_forward_now(hrtimer, ns_to_ktime(box->hrtimer_duration));
return HRTIMER_RESTART;
}

static void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
{
__hrtimer_start_range_ns(&box->hrtimer,
- ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL), 0,
+ ns_to_ktime(box->hrtimer_duration), 0,
HRTIMER_MODE_REL_PINNED, 0);
}

@@ -2839,6 +2839,9 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
box->cpu = -1;
box->phys_id = -1;

+ /* set default hrtimer timeout */
+ box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
+
return box;
}

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 77dc9a5..7efd298 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -489,6 +489,7 @@ struct intel_uncore_box {
u64 tags[UNCORE_PMC_IDX_MAX];
struct pci_dev *pci_dev;
struct intel_uncore_pmu *pmu;
+ u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
struct intel_uncore_extra_reg shared_regs[0];
--
1.7.9.5

2014-02-11 15:22:35

by Stephane Eranian

Subject: [PATCH v2 2/8] perf/x86/uncore: add ability to customize pmu callbacks

This patch enables custom struct pmu callbacks per uncore PMU type.
This feature may be used to simplify counter setup for certain uncore
PMUs, for instance those with free-running counters, where it becomes
possible to bypass the event scheduling phase of the configuration.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 25 +++++++++++++++----------
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index fe4255b..e6f32b3 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3271,16 +3271,21 @@ static int __init uncore_pmu_register(struct intel_uncore_pmu *pmu)
{
int ret;

- pmu->pmu = (struct pmu) {
- .attr_groups = pmu->type->attr_groups,
- .task_ctx_nr = perf_invalid_context,
- .event_init = uncore_pmu_event_init,
- .add = uncore_pmu_event_add,
- .del = uncore_pmu_event_del,
- .start = uncore_pmu_event_start,
- .stop = uncore_pmu_event_stop,
- .read = uncore_pmu_event_read,
- };
+ if (!pmu->type->pmu) {
+ pmu->pmu = (struct pmu) {
+ .attr_groups = pmu->type->attr_groups,
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = uncore_pmu_event_init,
+ .add = uncore_pmu_event_add,
+ .del = uncore_pmu_event_del,
+ .start = uncore_pmu_event_start,
+ .stop = uncore_pmu_event_stop,
+ .read = uncore_pmu_event_read,
+ };
+ } else {
+ pmu->pmu = *pmu->type->pmu;
+ pmu->pmu.attr_groups = pmu->type->attr_groups;
+ }

if (pmu->type->num_boxes == 1) {
if (strlen(pmu->type->name) > 0)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index a80ab71..77dc9a5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -440,6 +440,7 @@ struct intel_uncore_type {
struct intel_uncore_ops *ops;
struct uncore_event_desc *event_descs;
const struct attribute_group *attr_groups[4];
+ struct pmu *pmu; /* for custom pmu ops */
};

#define pmu_group attr_groups[0]
--
1.7.9.5

2014-02-11 16:19:18

by Peter Zijlstra

Subject: Re: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

On Tue, Feb 11, 2014 at 04:20:12PM +0100, Stephane Eranian wrote:
> This patch adds a new uncore PMU for Intel SNB/IVB/HSW client


> @@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
> pci_uncores = ivt_pci_uncores;
> uncore_pci_driver = &ivt_uncore_pci_driver;
> break;
> + case 42: /* Sandy Bridge */
> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
> + if (ret)
> + return ret;
> + pci_uncores = snb_pci_uncores;
> + uncore_pci_driver = &snb_uncore_pci_driver;
> + break;
> + case 60: /* Haswell */
> + case 69: /* Haswell Celeron */
> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
> + if (ret)
> + return ret;
> + pci_uncores = snb_pci_uncores;
> + uncore_pci_driver = &hsw_uncore_pci_driver;
> + break;
> + case 58: /* Ivy Bridge */
> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
> + if (ret)
> + return ret;
> + pci_uncores = snb_pci_uncores;
> + uncore_pci_driver = &ivb_uncore_pci_driver;
> + break;
> default:
> return 0;
> }

I reordered that list; but looking at perf_event_intel.c we have a lot
more HSW clients listed there. Please make it consistent.

2014-02-11 16:25:43

by Stephane Eranian

Subject: Re: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

On Tue, Feb 11, 2014 at 5:19 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, Feb 11, 2014 at 04:20:12PM +0100, Stephane Eranian wrote:
>> This patch adds a new uncore PMU for Intel SNB/IVB/HSW client
>
>
>> @@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
>> pci_uncores = ivt_pci_uncores;
>> uncore_pci_driver = &ivt_uncore_pci_driver;
>> break;
>> + case 42: /* Sandy Bridge */
>> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
>> + if (ret)
>> + return ret;
>> + pci_uncores = snb_pci_uncores;
>> + uncore_pci_driver = &snb_uncore_pci_driver;
>> + break;
>> + case 60: /* Haswell */
>> + case 69: /* Haswell Celeron */
>> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
>> + if (ret)
>> + return ret;
>> + pci_uncores = snb_pci_uncores;
>> + uncore_pci_driver = &hsw_uncore_pci_driver;
>> + break;
>> + case 58: /* Ivy Bridge */
>> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
>> + if (ret)
>> + return ret;
>> + pci_uncores = snb_pci_uncores;
>> + uncore_pci_driver = &ivb_uncore_pci_driver;
>> + break;
>> default:
>> return 0;
>> }
>
> I reordered that list; but looking at perf_event_intel.c we have a lot
> more HSW clients listed there. Please make it consistent.

I don't have all of them, so no testing is possible. I doubt they have
that many client model numbers.

2014-02-11 16:50:25

by Peter Zijlstra

Subject: Re: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

On Tue, Feb 11, 2014 at 05:25:39PM +0100, Stephane Eranian wrote:
> On Tue, Feb 11, 2014 at 5:19 PM, Peter Zijlstra <[email protected]> wrote:
> > On Tue, Feb 11, 2014 at 04:20:12PM +0100, Stephane Eranian wrote:
> >> This patch adds a new uncore PMU for Intel SNB/IVB/HSW client
> >
> >
> >> @@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
> >> pci_uncores = ivt_pci_uncores;
> >> uncore_pci_driver = &ivt_uncore_pci_driver;
> >> break;
> >> + case 42: /* Sandy Bridge */
> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
> >> + if (ret)
> >> + return ret;
> >> + pci_uncores = snb_pci_uncores;
> >> + uncore_pci_driver = &snb_uncore_pci_driver;
> >> + break;
> >> + case 60: /* Haswell */
> >> + case 69: /* Haswell Celeron */
> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
> >> + if (ret)
> >> + return ret;
> >> + pci_uncores = snb_pci_uncores;
> >> + uncore_pci_driver = &hsw_uncore_pci_driver;
> >> + break;
> >> + case 58: /* Ivy Bridge */
> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
> >> + if (ret)
> >> + return ret;
> >> + pci_uncores = snb_pci_uncores;
> >> + uncore_pci_driver = &ivb_uncore_pci_driver;
> >> + break;
> >> default:
> >> return 0;
> >> }
> >
> > I reordered that list; but looking at perf_event_intel.c we have a lot
> > more HSW clients listed there. Please make it consistent.
>
> I don't have all of them, so no testing is possible. I doubt they have
> that many client model numbers.

Yeah, I don't have any of those chips.. last I have is WSM-EP.

Anyway, perf_event_intel.c lists 60, 63, 69, 70 and 71 as being Haswell
clients. Andi did all that, so if it's wrong it's on Intel anyway.

I'm still thinking we ought to make a big Intel classification function;
something that returns something like:

struct intel_part {
	enum { client, ep, ex } type;
	enum { core, core2, nhm, wsm, snb, ivb, hsw } gen;
};

And we can do things like:

if (ip.type == client && ip.gen >= snb)

And then we only have to fix up the one classification function with all
those stupid model numbers, instead of having them duplicated all over
the stinking place.
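
A minimal sketch of what such a classification function could look
like (purely illustrative: the identifier names are invented, and the
model-number mapping below only covers numbers mentioned in this
thread):

struct intel_part {
	enum { INTEL_TYPE_CLIENT, INTEL_TYPE_EP, INTEL_TYPE_EX } type;
	enum { INTEL_GEN_CORE, INTEL_GEN_CORE2, INTEL_GEN_NHM,
	       INTEL_GEN_WSM, INTEL_GEN_SNB, INTEL_GEN_IVB,
	       INTEL_GEN_HSW } gen;
};

static struct intel_part intel_classify(unsigned int model)
{
	struct intel_part p = { INTEL_TYPE_CLIENT, INTEL_GEN_CORE };

	switch (model) {
	case 42:				/* SNB client */
		p.gen = INTEL_GEN_SNB;
		break;
	case 58:				/* IVB client */
		p.gen = INTEL_GEN_IVB;
		break;
	case 62:				/* IVB-EP */
		p.gen = INTEL_GEN_IVB;
		p.type = INTEL_TYPE_EP;
		break;
	case 60: case 69: case 70: case 71:	/* HSW clients */
		p.gen = INTEL_GEN_HSW;
		break;
	/* ... all other model numbers maintained here, in one place ... */
	}
	return p;
}

Callers could then test, e.g., p.type == INTEL_TYPE_CLIENT &&
p.gen >= INTEL_GEN_SNB instead of duplicating model-number switches.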

2014-02-11 18:32:02

by Stephane Eranian

Subject: Re: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

On Tue, Feb 11, 2014 at 5:50 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, Feb 11, 2014 at 05:25:39PM +0100, Stephane Eranian wrote:
>> On Tue, Feb 11, 2014 at 5:19 PM, Peter Zijlstra <[email protected]> wrote:
>> > On Tue, Feb 11, 2014 at 04:20:12PM +0100, Stephane Eranian wrote:
>> >> This patch adds a new uncore PMU for Intel SNB/IVB/HSW client
>> >
>> >
>> >> @@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
>> >> pci_uncores = ivt_pci_uncores;
>> >> uncore_pci_driver = &ivt_uncore_pci_driver;
>> >> break;
>> >> + case 42: /* Sandy Bridge */
>> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
>> >> + if (ret)
>> >> + return ret;
>> >> + pci_uncores = snb_pci_uncores;
>> >> + uncore_pci_driver = &snb_uncore_pci_driver;
>> >> + break;
>> >> + case 60: /* Haswell */
>> >> + case 69: /* Haswell Celeron */
>> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
>> >> + if (ret)
>> >> + return ret;
>> >> + pci_uncores = snb_pci_uncores;
>> >> + uncore_pci_driver = &hsw_uncore_pci_driver;
>> >> + break;
>> >> + case 58: /* Ivy Bridge */
>> >> + ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
>> >> + if (ret)
>> >> + return ret;
>> >> + pci_uncores = snb_pci_uncores;
>> >> + uncore_pci_driver = &ivb_uncore_pci_driver;
>> >> + break;
>> >> default:
>> >> return 0;
>> >> }
>> >
>> > I reordered that list; but looking at perf_event_intel.c we have a lot
>> > more HSW clients listed there. Please make it consistent.
>>
>> I don't have all of them, so no testing is possible. I doubt they have
>> that many client model numbers.
>
> Yeah, I don't have any of those chips.. last I have is WSM-EP.
>
> Anyway, perf_event_intel.c lists 60, 63, 69, 70 and 71 as being Haswell
> clients. Andi did all that, so if it's wrong it's on Intel anyway.
>
What he has is okay for core; they all have the same core PMU, 99% of it.
But the uncore can be radically different: look at IVB (58) vs IVB-EP (62).
So it is riskier there, which is why I did not add the other model numbers.

> I'm still thinking we ought to make a big Intel classification function;
> something that returns something like:
>
> struct intel_part {
> enum { client, ep, ex } type;
> enum { core, core2, nhm, wsm, snb, ivb, hsw } gen;
> };
>
> And we can do things like:
>
> if (ip.type == client && ip.gen >= snb)
>
> And then we only have to fix up the one classification function with all
> those stupid model numbers, instead of having them duplicated all over
> the stinking place.

Yeah, I am not opposed to that idea for PMU purposes.

2014-02-11 20:34:50

by Peter Zijlstra

Subject: Re: [PATCH v2 6/8] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

On Tue, Feb 11, 2014 at 07:31:59PM +0100, Stephane Eranian wrote:
> > Anyway, perf_event_intel.c lists 60, 63, 69, 70 and 71 as being Haswell
> > clients. Andi did all that, so if it's wrong it's on Intel anyway.
> >
> What he has is okay for core; they all have the same core PMU, 99% of it.
> But the uncore can be radically different: look at IVB (58) vs IVB-EP (62).
> So it is riskier there, which is why I did not add the other model numbers.

Well, if any of those is a !client part, then it's still his fault for
not putting in a comment.

What's the worst that can happen? Failing to find the hardware, I
suppose, seeing as this is all discoverable PCI space?

Subject: [tip:perf/core] perf/x86/uncore: fix initialization of cpumask

Commit-ID: 411cf180fa00521f9bfb1d022e3ebf059a2d299f
Gitweb: http://git.kernel.org/tip/411cf180fa00521f9bfb1d022e3ebf059a2d299f
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:07 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:07 +0100

perf/x86/uncore: fix initialization of cpumask

On certain processors, the uncore PMU boxes may be only
MSR-based or only PCI-based. In both cases, however, the cpumask
indicating which CPUs to monitor to get full coverage of the
particular PMU must be created.

With the current code base, the cpumask was only created on
processors which had at least one MSR-based uncore PMU. This
patch removes that restriction and ensures the cpumask is
created even when there is no MSR-based PMU, for instance on
SNB client, where only a PCI-based memory controller PMU is
supported.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 61 ++++++++++++++++-----------
1 file changed, 37 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 29c2487..fe4255b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3764,7 +3764,7 @@ static void __init uncore_cpu_setup(void *dummy)

static int __init uncore_cpu_init(void)
{
- int ret, cpu, max_cores;
+ int ret, max_cores;

max_cores = boot_cpu_data.x86_max_cores;
switch (boot_cpu_data.x86_model) {
@@ -3808,29 +3808,6 @@ static int __init uncore_cpu_init(void)
if (ret)
return ret;

- get_online_cpus();
-
- for_each_online_cpu(cpu) {
- int i, phys_id = topology_physical_package_id(cpu);
-
- for_each_cpu(i, &uncore_cpu_mask) {
- if (phys_id == topology_physical_package_id(i)) {
- phys_id = -1;
- break;
- }
- }
- if (phys_id < 0)
- continue;
-
- uncore_cpu_prepare(cpu, phys_id);
- uncore_event_init_cpu(cpu);
- }
- on_each_cpu(uncore_cpu_setup, NULL, 1);
-
- register_cpu_notifier(&uncore_cpu_nb);
-
- put_online_cpus();
-
return 0;
}

@@ -3859,6 +3836,41 @@ static int __init uncore_pmus_register(void)
return 0;
}

+static void uncore_cpumask_init(void)
+{
+ int cpu;
+
+ /*
+ * only invoke once from msr or pci init code
+ */
+ if (!cpumask_empty(&uncore_cpu_mask))
+ return;
+
+ get_online_cpus();
+
+ for_each_online_cpu(cpu) {
+ int i, phys_id = topology_physical_package_id(cpu);
+
+ for_each_cpu(i, &uncore_cpu_mask) {
+ if (phys_id == topology_physical_package_id(i)) {
+ phys_id = -1;
+ break;
+ }
+ }
+ if (phys_id < 0)
+ continue;
+
+ uncore_cpu_prepare(cpu, phys_id);
+ uncore_event_init_cpu(cpu);
+ }
+ on_each_cpu(uncore_cpu_setup, NULL, 1);
+
+ register_cpu_notifier(&uncore_cpu_nb);
+
+ put_online_cpus();
+}
+
+
static int __init intel_uncore_init(void)
{
int ret;
@@ -3877,6 +3889,7 @@ static int __init intel_uncore_init(void)
uncore_pci_exit();
goto fail;
}
+ uncore_cpumask_init();

uncore_pmus_register();
return 0;

Subject: [tip:perf/core] perf/x86/uncore: add ability to customize pmu callbacks

Commit-ID: d64b25b6a08d95e973160689320504faed62be4a
Gitweb: http://git.kernel.org/tip/d64b25b6a08d95e973160689320504faed62be4a
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:08 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:07 +0100

perf/x86/uncore: add ability to customize pmu callbacks

This patch enables custom struct pmu callbacks per uncore PMU type.
This feature may be used to simplify counter setup for certain uncore
PMUs, for instance those with free-running counters, where it becomes
possible to bypass the event scheduling phase of the configuration.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 25 +++++++++++++++----------
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index fe4255b..e6f32b3 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3271,16 +3271,21 @@ static int __init uncore_pmu_register(struct intel_uncore_pmu *pmu)
{
int ret;

- pmu->pmu = (struct pmu) {
- .attr_groups = pmu->type->attr_groups,
- .task_ctx_nr = perf_invalid_context,
- .event_init = uncore_pmu_event_init,
- .add = uncore_pmu_event_add,
- .del = uncore_pmu_event_del,
- .start = uncore_pmu_event_start,
- .stop = uncore_pmu_event_stop,
- .read = uncore_pmu_event_read,
- };
+ if (!pmu->type->pmu) {
+ pmu->pmu = (struct pmu) {
+ .attr_groups = pmu->type->attr_groups,
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = uncore_pmu_event_init,
+ .add = uncore_pmu_event_add,
+ .del = uncore_pmu_event_del,
+ .start = uncore_pmu_event_start,
+ .stop = uncore_pmu_event_stop,
+ .read = uncore_pmu_event_read,
+ };
+ } else {
+ pmu->pmu = *pmu->type->pmu;
+ pmu->pmu.attr_groups = pmu->type->attr_groups;
+ }

if (pmu->type->num_boxes == 1) {
if (strlen(pmu->type->name) > 0)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index a80ab71..77dc9a5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -440,6 +440,7 @@ struct intel_uncore_type {
struct intel_uncore_ops *ops;
struct uncore_event_desc *event_descs;
const struct attribute_group *attr_groups[4];
+ struct pmu *pmu; /* for custom pmu ops */
};

#define pmu_group attr_groups[0]

Subject: [tip:perf/core] perf/x86/uncore: add PCI ids for SNB/IVB/HSW IMC

Commit-ID: ae58faf90050ea72c289f62c4bc9a5bb8ae7f0bd
Gitweb: http://git.kernel.org/tip/ae58faf90050ea72c289f62c4bc9a5bb8ae7f0bd
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:09 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:07 +0100

perf/x86/uncore: add PCI ids for SNB/IVB/HSW IMC

This patch adds the PCI IDs for the Intel SandyBridge,
IvyBridge, and Haswell client memory controllers (IMC).

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
include/linux/pci_ids.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 97fbecd..7399e6a 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2531,6 +2531,9 @@

#define PCI_VENDOR_ID_INTEL 0x8086
#define PCI_DEVICE_ID_INTEL_EESSC 0x0008
+#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100
+#define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
+#define PCI_DEVICE_ID_INTEL_HSW_IMC 0x0c00
#define PCI_DEVICE_ID_INTEL_PXHD_0 0x0320
#define PCI_DEVICE_ID_INTEL_PXHD_1 0x0321
#define PCI_DEVICE_ID_INTEL_PXH_0 0x0329

Subject: [tip:perf/core] perf/x86/uncore: make hrtimer timeout configurable per box

Commit-ID: 79859cce5a551c3bcd8d36a6287f9d83568ccedb
Gitweb: http://git.kernel.org/tip/79859cce5a551c3bcd8d36a6287f9d83568ccedb
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:10 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:07 +0100

perf/x86/uncore: make hrtimer timeout configurable per box

This patch makes the hrtimer timeout configurable per PMU box. Not
all counters necessarily have the same width and rate, so the default
timeout of 60s may need to be adjusted.

This patch adds box->hrtimer_duration. It is set to the default when
the box is allocated and can be overridden when the box is
initialized.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 7 +++++--
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index e6f32b3..ea823b8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -2798,14 +2798,14 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)

local_irq_restore(flags);

- hrtimer_forward_now(hrtimer, ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL));
+ hrtimer_forward_now(hrtimer, ns_to_ktime(box->hrtimer_duration));
return HRTIMER_RESTART;
}

static void uncore_pmu_start_hrtimer(struct intel_uncore_box *box)
{
__hrtimer_start_range_ns(&box->hrtimer,
- ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL), 0,
+ ns_to_ktime(box->hrtimer_duration), 0,
HRTIMER_MODE_REL_PINNED, 0);
}

@@ -2839,6 +2839,9 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
box->cpu = -1;
box->phys_id = -1;

+ /* set default hrtimer timeout */
+ box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
+
return box;
}

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 77dc9a5..7efd298 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -489,6 +489,7 @@ struct intel_uncore_box {
u64 tags[UNCORE_PMC_IDX_MAX];
struct pci_dev *pci_dev;
struct intel_uncore_pmu *pmu;
+ u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
struct intel_uncore_extra_reg shared_regs[0];

Subject: [tip:perf/core] perf/x86/uncore: move uncore_event_to_box() and uncore_pmu_to_box()

Commit-ID: 001e413f7e7a4a68dc1c3231f72b5be173939c8f
Gitweb: http://git.kernel.org/tip/001e413f7e7a4a68dc1c3231f72b5be173939c8f
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:11 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:07 +0100

perf/x86/uncore: move uncore_event_to_box() and uncore_pmu_to_box()

Move a couple of functions around to avoid forward declarations
when we add code later on.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 72 +++++++++++++--------------
1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index ea823b8..acbbdde 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -66,6 +66,42 @@ DEFINE_UNCORE_FORMAT_ATTR(mask_vnw, mask_vnw, "config2:3-4");
DEFINE_UNCORE_FORMAT_ATTR(mask0, mask0, "config2:0-31");
DEFINE_UNCORE_FORMAT_ATTR(mask1, mask1, "config2:32-63");

+static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
+{
+ return container_of(event->pmu, struct intel_uncore_pmu, pmu);
+}
+
+static struct intel_uncore_box *
+uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
+{
+ struct intel_uncore_box *box;
+
+ box = *per_cpu_ptr(pmu->box, cpu);
+ if (box)
+ return box;
+
+ raw_spin_lock(&uncore_box_lock);
+ list_for_each_entry(box, &pmu->box_list, list) {
+ if (box->phys_id == topology_physical_package_id(cpu)) {
+ atomic_inc(&box->refcnt);
+ *per_cpu_ptr(pmu->box, cpu) = box;
+ break;
+ }
+ }
+ raw_spin_unlock(&uncore_box_lock);
+
+ return *per_cpu_ptr(pmu->box, cpu);
+}
+
+static struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
+{
+ /*
+ * perf core schedules event on the basis of cpu, uncore events are
+ * collected by one of the cpus inside a physical package.
+ */
+ return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
+}
+
static u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
{
u64 count;
@@ -2845,42 +2881,6 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
return box;
}

-static struct intel_uncore_box *
-uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
-{
- struct intel_uncore_box *box;
-
- box = *per_cpu_ptr(pmu->box, cpu);
- if (box)
- return box;
-
- raw_spin_lock(&uncore_box_lock);
- list_for_each_entry(box, &pmu->box_list, list) {
- if (box->phys_id == topology_physical_package_id(cpu)) {
- atomic_inc(&box->refcnt);
- *per_cpu_ptr(pmu->box, cpu) = box;
- break;
- }
- }
- raw_spin_unlock(&uncore_box_lock);
-
- return *per_cpu_ptr(pmu->box, cpu);
-}
-
-static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
-{
- return container_of(event->pmu, struct intel_uncore_pmu, pmu);
-}
-
-static struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
-{
- /*
- * perf core schedules event on the basis of cpu, uncore events are
- * collected by one of the cpus inside a physical package.
- */
- return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
-}
-
static int
uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader, bool dogrp)
{

Subject: [tip:perf/core] perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

Commit-ID: b9e1ab6d4c0582cad97699285a6b3cf992251b00
Gitweb: http://git.kernel.org/tip/b9e1ab6d4c0582cad97699285a6b3cf992251b00
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:12 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:08 +0100

perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

This patch adds a new uncore PMU for Intel SNB/IVB/HSW client
CPUs: the Integrated Memory Controller (IMC) PMU. This new PMU
provides a set of events to measure memory bandwidth utilization.

The IMC on those processors is PCI-space based. This patch
exposes a new uncore PMU on those processors: uncore_imc.

Two new events are defined:
- name: data_reads
- code: 0x1
- unit: 64 bytes
- number of full cacheline read requests to the IMC

- name: data_writes
- code: 0x2
- unit: 64 bytes
- number of full cacheline write requests to the IMC

Documentation available at:
http://software.intel.com/en-us/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 365 ++++++++++++++++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 1 +
2 files changed, 366 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index acbbdde..66ce5e5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -66,6 +66,11 @@ DEFINE_UNCORE_FORMAT_ATTR(mask_vnw, mask_vnw, "config2:3-4");
DEFINE_UNCORE_FORMAT_ATTR(mask0, mask0, "config2:0-31");
DEFINE_UNCORE_FORMAT_ATTR(mask1, mask1, "config2:32-63");

+static void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
+static void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
+static void uncore_perf_event_update(struct intel_uncore_box *box, struct perf_event *event);
+static void uncore_pmu_event_read(struct perf_event *event);
+
static struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
{
return container_of(event->pmu, struct intel_uncore_pmu, pmu);
@@ -1667,6 +1672,344 @@ static struct intel_uncore_type *snb_msr_uncores[] = {
&snb_uncore_cbox,
NULL,
};
+
+enum {
+ SNB_PCI_UNCORE_IMC,
+};
+
+static struct uncore_event_desc snb_uncore_imc_events[] = {
+ INTEL_UNCORE_EVENT_DESC(data_reads, "event=0x01"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.scale, "64"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.unit, "bytes"),
+
+ INTEL_UNCORE_EVENT_DESC(data_writes, "event=0x02"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.scale, "64"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.unit, "bytes"),
+
+ { /* end: all zeroes */ },
+};
+
+#define SNB_UNCORE_PCI_IMC_EVENT_MASK 0xff
+#define SNB_UNCORE_PCI_IMC_BAR_OFFSET 0x48
+
+/* page size multiple covering all config regs */
+#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x6000
+
+#define SNB_UNCORE_PCI_IMC_DATA_READS 0x1
+#define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
+#define SNB_UNCORE_PCI_IMC_DATA_WRITES 0x2
+#define SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE 0x5054
+#define SNB_UNCORE_PCI_IMC_CTR_BASE SNB_UNCORE_PCI_IMC_DATA_READS_BASE
+
+static struct attribute *snb_uncore_imc_formats_attr[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static struct attribute_group snb_uncore_imc_format_group = {
+ .name = "format",
+ .attrs = snb_uncore_imc_formats_attr,
+};
+
+static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ u32 addr_lo, addr_hi;
+ resource_size_t addr;
+
+ pci_read_config_dword(pdev, SNB_UNCORE_PCI_IMC_BAR_OFFSET, &addr_lo);
+ addr = addr_lo;
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+ pci_read_config_dword(pdev, SNB_UNCORE_PCI_IMC_BAR_OFFSET+4, &addr_hi);
+ addr = ((resource_size_t)addr_hi << 32) | addr_lo;
+#endif
+
+ addr &= ~(PAGE_SIZE - 1);
+
+ box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+}
+
+static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
+{}
+
+static void snb_uncore_imc_disable_box(struct intel_uncore_box *box)
+{}
+
+static void snb_uncore_imc_enable_event(struct intel_uncore_box *box, struct perf_event *event)
+{}
+
+static void snb_uncore_imc_disable_event(struct intel_uncore_box *box, struct perf_event *event)
+{}
+
+static u64 snb_uncore_imc_read_counter(struct intel_uncore_box *box, struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ return (u64)*(unsigned int *)(box->io_addr + hwc->event_base);
+}
+
+/*
+ * Custom event_init() function: we define our own fixed, free
+ * running counters, so we do not want to conflict with the generic
+ * uncore logic. This also simplifies processing.
+ */
+static int snb_uncore_imc_event_init(struct perf_event *event)
+{
+ struct intel_uncore_pmu *pmu;
+ struct intel_uncore_box *box;
+ struct hw_perf_event *hwc = &event->hw;
+ u64 cfg = event->attr.config & SNB_UNCORE_PCI_IMC_EVENT_MASK;
+ int idx, base;
+
+ if (event->attr.type != event->pmu->type)
+ return -ENOENT;
+
+ pmu = uncore_event_to_pmu(event);
+ /* no device found for this pmu */
+ if (pmu->func_id < 0)
+ return -ENOENT;
+
+ /* Sampling not supported yet */
+ if (hwc->sample_period)
+ return -EINVAL;
+
+ /* unsupported modes and filters */
+ if (event->attr.exclude_user ||
+ event->attr.exclude_kernel ||
+ event->attr.exclude_hv ||
+ event->attr.exclude_idle ||
+ event->attr.exclude_host ||
+ event->attr.exclude_guest ||
+ event->attr.sample_period) /* no sampling */
+ return -EINVAL;
+
+ /*
+ * Place all uncore events for a particular physical package
+ * onto a single cpu
+ */
+ if (event->cpu < 0)
+ return -EINVAL;
+
+ /* check only supported bits are set */
+ if (event->attr.config & ~SNB_UNCORE_PCI_IMC_EVENT_MASK)
+ return -EINVAL;
+
+ box = uncore_pmu_to_box(pmu, event->cpu);
+ if (!box || box->cpu < 0)
+ return -EINVAL;
+
+ event->cpu = box->cpu;
+
+ event->hw.idx = -1;
+ event->hw.last_tag = ~0ULL;
+ event->hw.extra_reg.idx = EXTRA_REG_NONE;
+ event->hw.branch_reg.idx = EXTRA_REG_NONE;
+ /*
+ * check event is known (whitelist, determines counter)
+ */
+ switch (cfg) {
+ case SNB_UNCORE_PCI_IMC_DATA_READS:
+ base = SNB_UNCORE_PCI_IMC_DATA_READS_BASE;
+ idx = UNCORE_PMC_IDX_FIXED;
+ break;
+ case SNB_UNCORE_PCI_IMC_DATA_WRITES:
+ base = SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE;
+ idx = UNCORE_PMC_IDX_FIXED + 1;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* must be done before validate_group */
+ event->hw.event_base = base;
+ event->hw.config = cfg;
+ event->hw.idx = idx;
+
+ /* no group validation needed, we have free running counters */
+
+ return 0;
+}
+
+static int snb_uncore_imc_hw_config(struct intel_uncore_box *box, struct perf_event *event)
+{
+ return 0;
+}
+
+static void snb_uncore_imc_event_start(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ u64 count;
+
+ if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+ return;
+
+ event->hw.state = 0;
+ box->n_active++;
+
+ list_add_tail(&event->active_entry, &box->active_list);
+
+ count = snb_uncore_imc_read_counter(box, event);
+ local64_set(&event->hw.prev_count, count);
+
+ if (box->n_active == 1)
+ uncore_pmu_start_hrtimer(box);
+}
+
+static void snb_uncore_imc_event_stop(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (!(hwc->state & PERF_HES_STOPPED)) {
+ box->n_active--;
+
+ WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+ hwc->state |= PERF_HES_STOPPED;
+
+ list_del(&event->active_entry);
+
+ if (box->n_active == 0)
+ uncore_pmu_cancel_hrtimer(box);
+ }
+
+ if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+ /*
+ * Drain the remaining delta count out of an event
+ * that we are disabling:
+ */
+ uncore_perf_event_update(box, event);
+ hwc->state |= PERF_HES_UPTODATE;
+ }
+}
+
+static int snb_uncore_imc_event_add(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (!box)
+ return -ENODEV;
+
+ hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+ if (!(flags & PERF_EF_START))
+ hwc->state |= PERF_HES_ARCH;
+
+ snb_uncore_imc_event_start(event, 0);
+
+ box->n_events++;
+
+ return 0;
+}
+
+static void snb_uncore_imc_event_del(struct perf_event *event, int flags)
+{
+ struct intel_uncore_box *box = uncore_event_to_box(event);
+ int i;
+
+ snb_uncore_imc_event_stop(event, PERF_EF_UPDATE);
+
+ for (i = 0; i < box->n_events; i++) {
+ if (event == box->event_list[i]) {
+ --box->n_events;
+ break;
+ }
+ }
+}
+
+static int snb_pci2phy_map_init(int devid)
+{
+ struct pci_dev *dev = NULL;
+ int bus;
+
+ dev = pci_get_device(PCI_VENDOR_ID_INTEL, devid, dev);
+ if (!dev)
+ return -ENOTTY;
+
+ bus = dev->bus->number;
+
+ pcibus_to_physid[bus] = 0;
+
+ pci_dev_put(dev);
+
+ return 0;
+}
+
+static struct pmu snb_uncore_imc_pmu = {
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = snb_uncore_imc_event_init,
+ .add = snb_uncore_imc_event_add,
+ .del = snb_uncore_imc_event_del,
+ .start = snb_uncore_imc_event_start,
+ .stop = snb_uncore_imc_event_stop,
+ .read = uncore_pmu_event_read,
+};
+
+static struct intel_uncore_ops snb_uncore_imc_ops = {
+ .init_box = snb_uncore_imc_init_box,
+ .enable_box = snb_uncore_imc_enable_box,
+ .disable_box = snb_uncore_imc_disable_box,
+ .disable_event = snb_uncore_imc_disable_event,
+ .enable_event = snb_uncore_imc_enable_event,
+ .hw_config = snb_uncore_imc_hw_config,
+ .read_counter = snb_uncore_imc_read_counter,
+};
+
+static struct intel_uncore_type snb_uncore_imc = {
+ .name = "imc",
+ .num_counters = 2,
+ .num_boxes = 1,
+ .fixed_ctr_bits = 32,
+ .fixed_ctr = SNB_UNCORE_PCI_IMC_CTR_BASE,
+ .event_descs = snb_uncore_imc_events,
+ .format_group = &snb_uncore_imc_format_group,
+ .perf_ctr = SNB_UNCORE_PCI_IMC_DATA_READS_BASE,
+ .event_mask = SNB_UNCORE_PCI_IMC_EVENT_MASK,
+ .ops = &snb_uncore_imc_ops,
+ .pmu = &snb_uncore_imc_pmu,
+};
+
+static struct intel_uncore_type *snb_pci_uncores[] = {
+ [SNB_PCI_UNCORE_IMC] = &snb_uncore_imc,
+ NULL,
+};
+
+static DEFINE_PCI_DEVICE_TABLE(snb_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SNB_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static DEFINE_PCI_DEVICE_TABLE(ivb_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static DEFINE_PCI_DEVICE_TABLE(hsw_uncore_pci_ids) = {
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_HSW_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+};
+
+static struct pci_driver snb_uncore_pci_driver = {
+ .name = "snb_uncore",
+ .id_table = snb_uncore_pci_ids,
+};
+
+static struct pci_driver ivb_uncore_pci_driver = {
+ .name = "ivb_uncore",
+ .id_table = ivb_uncore_pci_ids,
+};
+
+static struct pci_driver hsw_uncore_pci_driver = {
+ .name = "hsw_uncore",
+ .id_table = hsw_uncore_pci_ids,
+};
+
/* end of Sandy Bridge uncore support */

/* Nehalem uncore support */
@@ -3501,6 +3844,28 @@ static int __init uncore_pci_init(void)
pci_uncores = ivt_pci_uncores;
uncore_pci_driver = &ivt_uncore_pci_driver;
break;
+ case 42: /* Sandy Bridge */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_SNB_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &snb_uncore_pci_driver;
+ break;
+ case 58: /* Ivy Bridge */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &ivb_uncore_pci_driver;
+ break;
+ case 60: /* Haswell */
+ case 69: /* Haswell Celeron */
+ ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);
+ if (ret)
+ return ret;
+ pci_uncores = snb_pci_uncores;
+ uncore_pci_driver = &hsw_uncore_pci_driver;
+ break;
default:
return 0;
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index 7efd298..fbf45a0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -492,6 +492,7 @@ struct intel_uncore_box {
u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
+ void *io_addr;
struct intel_uncore_extra_reg shared_regs[0];
};
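
The read_counter() callback returns the raw 32-bit value; the
generic uncore_perf_event_update() (with fixed_ctr_bits = 32) folds
successive reads into a 64-bit software count. A hypothetical
standalone model of that wraparound-safe accumulation, for
illustration only:

#include <stdio.h>
#include <stdint.h>

/* The delta is computed modulo 2^32, so a single wrap between two
 * reads is handled correctly, then accumulated into 64 bits. */
static void update_sw_count(uint32_t now, uint32_t *prev, uint64_t *sw_count)
{
	uint32_t delta = now - *prev;	/* correct across one wrap */

	*prev = now;
	*sw_count += delta;
}

int main(void)
{
	uint32_t prev = 0xfffffff0;	/* just below the wrap point */
	uint64_t count = 0;

	update_sw_count(0x00000010, &prev, &count);
	printf("delta across wrap: %llu\n", (unsigned long long)count); /* 32 */
	return 0;
}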

Subject: [tip:perf/core] perf/x86/uncore: add hrtimer to SNB uncore IMC PMU

Commit-ID: ced2efb099f4f24ea7030c22b064e5ddd65cd764
Gitweb: http://git.kernel.org/tip/ced2efb099f4f24ea7030c22b064e5ddd65cd764
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:13 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:08 +0100

perf/x86/uncore: add hrtimer to SNB uncore IMC PMU

This patch is needed because the SNB/IVB/HSW IMC PMU uses
32-bit free running counters with no interrupt capabilities.

On SNB/IVB/HSW, we use the 20 GB/s theoretical peak bandwidth to
calculate the hrtimer timeout needed to avoid missing a counter
overflow. The timeout is set to 5s to be on the cautious side.

The SNB IMC uses free running counters, which are exposed as
pseudo fixed counters. Because the counters are read-only, the
PMU implementation supports an arbitrary number of events, and
active counters cannot be tracked via the usual active mask.
Instead, active events are put on a linked list which the hrtimer
handler walks to update the SW counts.
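
As a back-of-the-envelope check (not from the patch itself): one
count is a 64-byte cacheline, so a 32-bit counter wraps after
2^32 * 64 bytes = 256 GiB, which at the 20 GB/s theoretical peak
takes about 13.7 seconds. The 5s timeout therefore leaves ample
margin:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* 32-bit counter, one increment per 64-byte cacheline */
	uint64_t bytes_per_wrap = (1ULL << 32) * 64;	/* 256 GiB */
	double peak_bw = 20e9;				/* 20 GB/s */

	printf("time to wrap at peak: %.1f s\n",
	       bytes_per_wrap / peak_bw);	/* ~13.7 s */
	return 0;
}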

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 12 ++++++++++++
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 2 ++
2 files changed, 14 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 66ce5e5..7760aee 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1728,6 +1728,7 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
addr &= ~(PAGE_SIZE - 1);

box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+ box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
}

static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
@@ -3160,6 +3161,7 @@ again:
static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
{
struct intel_uncore_box *box;
+ struct perf_event *event;
unsigned long flags;
int bit;

@@ -3172,6 +3174,14 @@ static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
*/
local_irq_save(flags);

+ /*
+ * handle boxes with an active event list as opposed to active
+ * counters
+ */
+ list_for_each_entry(event, &box->active_list, active_entry) {
+ uncore_perf_event_update(box, event);
+ }
+
for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX)
uncore_perf_event_update(box, box->events[bit]);

@@ -3221,6 +3231,8 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
/* set default hrtimer timeout */
box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;

+ INIT_LIST_HEAD(&box->active_list);
+
return box;
}

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index fbf45a0..90236f0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -6,6 +6,7 @@

#define UNCORE_PMU_NAME_LEN 32
#define UNCORE_PMU_HRTIMER_INTERVAL (60LL * NSEC_PER_SEC)
+#define UNCORE_SNB_IMC_HRTIMER_INTERVAL (5ULL * NSEC_PER_SEC)

#define UNCORE_FIXED_EVENT 0xff
#define UNCORE_PMC_IDX_MAX_GENERIC 8
@@ -492,6 +493,7 @@ struct intel_uncore_box {
u64 hrtimer_duration; /* hrtimer timeout for this box */
struct hrtimer hrtimer;
struct list_head list;
+ struct list_head active_list;
void *io_addr;
struct intel_uncore_extra_reg shared_regs[0];
};

Subject: [tip:perf/core] perf/x86/uncore: use MiB unit for events for SNB/IVB/HSW IMC

Commit-ID: e9d9768824a18212712ae3afbebd5bfef05176f4
Gitweb: http://git.kernel.org/tip/e9d9768824a18212712ae3afbebd5bfef05176f4
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 11 Feb 2014 16:20:14 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:49:08 +0100

perf/x86/uncore: use MiB unit for events for SNB/IVB/HSW IMC

This patch makes perf use Mebibytes to display the counts of the
uncore_imc/data_reads/ and uncore_imc/data_writes/ events.

1 MiB = 1024*1024 bytes.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 7760aee..de5c568 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1679,12 +1679,12 @@ enum {

static struct uncore_event_desc snb_uncore_imc_events[] = {
INTEL_UNCORE_EVENT_DESC(data_reads, "event=0x01"),
- INTEL_UNCORE_EVENT_DESC(data_reads.scale, "64"),
- INTEL_UNCORE_EVENT_DESC(data_reads.unit, "bytes"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.scale, "6.103515625e-5"),
+ INTEL_UNCORE_EVENT_DESC(data_reads.unit, "MiB"),

INTEL_UNCORE_EVENT_DESC(data_writes, "event=0x02"),
- INTEL_UNCORE_EVENT_DESC(data_writes.scale, "64"),
- INTEL_UNCORE_EVENT_DESC(data_writes.unit, "bytes"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.scale, "6.103515625e-5"),
+ INTEL_UNCORE_EVENT_DESC(data_writes.unit, "MiB"),

{ /* end: all zeroes */ },
};
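
The new scale value follows directly from the counter unit: one
count is 64 bytes and 1 MiB = 2^20 bytes, so the scale is
64 / 1048576 = 6.103515625e-05. A trivial check (not part of the
patch):

#include <stdio.h>

int main(void)
{
	/* one counter increment = 64 bytes; report counts in MiB */
	printf("%.9e\n", 64.0 / (1024.0 * 1024.0));	/* 6.103515625e-05 */
	return 0;
}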