From: roman.sudarikov@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org, linux-kernel@vger.kernel.org,
	eranian@google.com, bgregg@netflix.com, ak@linux.intel.com,
	kan.liang@linux.intel.com, gregkh@linuxfoundation.org
Cc: alexander.antonov@intel.com, roman.sudarikov@linux.intel.com
Subject: [PATCH v3 2/2] perf x86: Exposing an Uncore unit to PMON for Intel Xeon® server platform
Date: Mon, 13 Jan 2020 16:54:44 +0300
Message-Id:
 <20200113135444.12027-3-roman.sudarikov@linux.intel.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20200113135444.12027-1-roman.sudarikov@linux.intel.com>
References: <20200113135444.12027-1-roman.sudarikov@linux.intel.com>

From: Roman Sudarikov

Current version supports a server line starting with Intel® Xeon® Processor
Scalable Family and introduces mapping for IIO Uncore units only.
Other units can be added on demand.

IIO stack to PMON mapping is exposed through:
    /sys/devices/uncore_iio_/platform_mapping
    in the following format: domain:bus

For example, on a 4-die Intel Xeon® server platform:
    $ cat /sys/devices/uncore_iio_0/platform_mapping
    0000:00,0000:40,0000:80,0000:c0

Which means:
IIO PMON block 0 on die 0 belongs to IIO stack on bus 0x00, domain 0x0000
IIO PMON block 0 on die 1 belongs to IIO stack on bus 0x40, domain 0x0000
IIO PMON block 0 on die 2 belongs to IIO stack on bus 0x80, domain 0x0000
IIO PMON block 0 on die 3 belongs to IIO stack on bus 0xc0, domain 0x0000

Co-developed-by: Alexander Antonov
Reviewed-by: Kan Liang
Signed-off-by: Alexander Antonov
Signed-off-by: Roman Sudarikov
---
 arch/x86/events/intel/uncore.c       |   2 +-
 arch/x86/events/intel/uncore.h       |   1 +
 arch/x86/events/intel/uncore_snbep.c | 162 +++++++++++++++++++++++++++
 3 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 2c53ad44b51f..c0d86bc8e786 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -16,7 +16,7 @@ struct pci_driver *uncore_pci_driver;
 DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
 struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
 struct pci_extra_dev *uncore_extra_pci_dev;
-static int max_dies;
+int max_dies;
 
 /* mask of cpus that collect uncore events */
 static cpumask_t uncore_cpu_mask;
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index f52dd3f112a7..94eacca6f485 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -523,6 +523,7 @@ extern raw_spinlock_t pci2phy_map_lock;
 extern struct list_head pci2phy_map_head;
 extern struct pci_extra_dev *uncore_extra_pci_dev;
 extern struct event_constraint uncore_constraint_empty;
+extern int max_dies;
 
 /* uncore_snb.c */
 int snb_uncore_pci_init(void);
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index b10a5ec79e48..2562fde2e5b8 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -273,6 +273,30 @@
 #define SKX_CPUNODEID			0xc0
 #define SKX_GIDNIDMAP			0xd4
 
+/*
+ * The CPU_BUS_NUMBER MSR returns the values of the respective CPUBUSNO CSR
+ * that BIOS programmed. MSR has package scope.
+ * |  Bit  |  Default  |  Description
+ * | [63]  |    00h    | VALID - When set, indicates the CPU bus
+ *                       numbers have been initialized. (RO)
+ * |[62:48]|    ---    | Reserved
+ * |[47:40]|    00h    | BUS_NUM_5 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(5). (RO)
+ * |[39:32]|    00h    | BUS_NUM_4 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(4). (RO)
+ * |[31:24]|    00h    | BUS_NUM_3 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(3). (RO)
+ * |[23:16]|    00h    | BUS_NUM_2 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(2). (RO)
+ * |[15:8] |    00h    | BUS_NUM_1 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(1). (RO)
+ * | [7:0] |    00h    | BUS_NUM_0 - Return the bus number BIOS assigned
+ *                       CPUBUSNO(0). (RO)
+ */
+#define SKX_MSR_CPU_BUS_NUMBER			0x300
+#define SKX_MSR_CPU_BUS_VALID_BIT		(1ULL << 63)
+#define BUS_NUM_STRIDE				8
+
 /* SKX CHA */
 #define SKX_CHA_MSR_PMON_BOX_FILTER_TID		(0x1ffULL << 0)
 #define SKX_CHA_MSR_PMON_BOX_FILTER_LINK	(0xfULL << 9)
@@ -3580,6 +3604,9 @@ static struct intel_uncore_ops skx_uncore_iio_ops = {
 	.read_counter		= uncore_msr_read_counter,
 };
 
+static int skx_iio_get_topology(struct intel_uncore_type *type);
+static int skx_iio_set_mapping(struct intel_uncore_type *type);
+
 static struct intel_uncore_type skx_uncore_iio = {
 	.name			= "iio",
 	.num_counters		= 4,
@@ -3594,6 +3621,8 @@ static struct intel_uncore_type skx_uncore_iio = {
 	.constraints		= skx_uncore_iio_constraints,
 	.ops			= &skx_uncore_iio_ops,
 	.format_group		= &skx_uncore_iio_format_group,
+	.get_topology		= skx_iio_get_topology,
+	.set_mapping		= skx_iio_set_mapping,
 };
 
 enum perf_uncore_iio_freerunning_type_id {
@@ -3780,6 +3809,139 @@ static int skx_count_chabox(void)
 	return hweight32(val);
 }
 
+static inline int skx_msr_cpu_bus_read(int cpu, u64 *topology)
+{
+	u64 msr_value;
+
+	if (rdmsrl_on_cpu(cpu, SKX_MSR_CPU_BUS_NUMBER, &msr_value) ||
+			!(msr_value & SKX_MSR_CPU_BUS_VALID_BIT))
+		return -1;
+
+	*topology = msr_value;
+
+	return 0;
+}
+
+static int skx_iio_get_topology(struct intel_uncore_type *type)
+{
+	int ret, cpu, die, current_die;
+	struct pci_bus *bus = NULL;
+
+	/*
+	 * Verified single-segment environments only; disabled for multiple
+	 * segment topologies for now.
+	 */
+	while ((bus = pci_find_next_bus(bus)) && !pci_domain_nr(bus))
+		;
+	if (bus) {
+		pr_info("I/O stack mapping is not supported for multi-seg\n");
+		return -1;
+	}
+
+	type->topology = kzalloc(max_dies * sizeof(u64), GFP_KERNEL);
+	if (!type->topology)
+		return -ENOMEM;
+
+	/*
+	 * Using cpus_read_lock() to ensure cpu is not going down between
+	 * looking at cpu_online_mask.
+	 */
+	cpus_read_lock();
+	/* Invalid value to start loop.*/
+	current_die = -1;
+	for_each_online_cpu(cpu) {
+		die = topology_logical_die_id(cpu);
+		if (current_die == die)
+			continue;
+		ret = skx_msr_cpu_bus_read(cpu, &type->topology[die]);
+		if (ret) {
+			kfree(type->topology);
+			break;
+		}
+		current_die = die;
+	}
+	cpus_read_unlock();
+
+	return ret;
+}
+
+static inline u8 skx_iio_stack_bus(struct intel_uncore_pmu *pmu, int die)
+{
+	return pmu->type->topology[die] >> (pmu->pmu_idx * BUS_NUM_STRIDE);
+}
+
+static int skx_iio_set_box_mapping(struct intel_uncore_pmu *pmu)
+{
+	char *buf;
+	int die = 0;
+	/* Length of template "%04x:%02x," without null character. */
+	const int template_len = 8;
+
+	/*
+	 * Root bus 0x00 is valid only for die 0 AND pmu_idx = 0.
+	 * Set "0" platform mapping for PMUs which have zero stack bus and
+	 * non-zero index.
+	 */
+	if (!skx_iio_stack_bus(pmu, die) && pmu->pmu_idx) {
+		pmu->mapping = kzalloc(2, GFP_KERNEL);
+		if (!pmu->mapping)
+			return -ENOMEM;
+		sprintf(pmu->mapping, "0");
+		return 0;
+	}
+
+	pmu->mapping = kzalloc(max_dies * template_len + 1, GFP_KERNEL);
+	if (!pmu->mapping)
+		return -ENOMEM;
+
+	buf = pmu->mapping;
+	for (; die < max_dies; die++) {
+		buf += snprintf(buf, template_len + 1, "%04x:%02x,", 0,
+				skx_iio_stack_bus(pmu, die));
+	}
+
+	*(--buf) = '\0';
+
+	return 0;
+}
+
+static int skx_iio_set_mapping(struct intel_uncore_type *type)
+{
+	/*
+	 * Each IIO stack (PCIe root port) has its own IIO PMON block, so each
+	 * platform_mapping holds bus number(s) of PCIe root port(s), which can
+	 * be monitored by that IIO PMON block.
+	 *
+	 * For example, on 4-die Xeon platform with up to 6 IIO stacks per die
+	 * and, therefore, 6 IIO PMON blocks per die, the platform_mapping
+	 * of IIO PMON block 0 holds "0000:00,0000:40,0000:80,0000:c0":
+	 *
+	 * $ cat /sys/devices/uncore_iio_0/platform_mapping
+	 * 0000:00,0000:40,0000:80,0000:c0
+	 *
+	 * Which means:
+	 * IIO PMON 0 on die 0 belongs to PCIe RP on bus 0x00, domain 0x0000
+	 * IIO PMON 0 on die 1 belongs to PCIe RP on bus 0x40, domain 0x0000
+	 * IIO PMON 0 on die 2 belongs to PCIe RP on bus 0x80, domain 0x0000
+	 * IIO PMON 0 on die 3 belongs to PCIe RP on bus 0xc0, domain 0x0000
+	 */
+
+	int ret = 0;
+	struct intel_uncore_pmu *pmu = type->pmus;
+
+	for (; pmu - type->pmus < type->num_boxes; pmu++) {
+		ret = skx_iio_set_box_mapping(pmu);
+		if (ret) {
+			for (; pmu > type->pmus; --pmu)
+				kfree(pmu[-1].mapping);
+			break;
+		}
+	}
+
+	kfree(type->topology);
+	return ret;
+}
+
 void skx_uncore_cpu_init(void)
 {
 	skx_uncore_chabox.num_boxes = skx_count_chabox();
-- 
2.19.1
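
Illustration only, not part of the patch: the bus-number extraction done by
skx_iio_stack_bus() and the "domain:bus" string built by
skx_iio_set_box_mapping() can be sketched in plain userspace C. This is a
minimal sketch under stated assumptions: stack_bus(), format_mapping() and
CPU_BUS_VALID_BIT are hypothetical names introduced here, the per-die MSR
values are passed in by the caller instead of being read with
rdmsrl_on_cpu(), and the domain is fixed to 0000 as in the single-segment
case the patch supports.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BUS_NUM_STRIDE		8		/* bits per BUS_NUM_x field, as in the patch */
#define CPU_BUS_VALID_BIT	(1ULL << 63)	/* hypothetical name for the VALID flag */

/* Bus number of IIO stack `idx` (0..5) taken from one die's MSR value. */
static uint8_t stack_bus(uint64_t msr, int idx)
{
	return (uint8_t)(msr >> (idx * BUS_NUM_STRIDE));
}

/*
 * Build "0000:bb,0000:bb,..." for PMON block `idx`, one entry per die.
 * Returns 0 on success, -1 if a die's value lacks the VALID bit or the
 * output buffer is too small.
 */
static int format_mapping(const uint64_t *msr, int dies, int idx,
			  char *out, size_t len)
{
	size_t used = 0;
	int die, n;

	if (len == 0)
		return -1;
	out[0] = '\0';

	for (die = 0; die < dies; die++) {
		if (!(msr[die] & CPU_BUS_VALID_BIT))
			return -1;
		/* Emit the separator before every entry except the first. */
		n = snprintf(out + used, len - used, "%s%04x:%02x",
			     die ? "," : "", 0u,
			     (unsigned int)stack_bus(msr[die], idx));
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

With four per-die values whose BUS_NUM_0 fields are 0x00, 0x40, 0x80 and
0xc0 and the VALID bit set, format_mapping(msr, 4, 0, buf, sizeof(buf))
fills buf with the same string as the platform_mapping example in the
commit message. Writing the separator in front of each entry avoids the
trailing-comma trim that the kernel code does with *(--buf) = '\0'.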