Date: Fri, 11 Nov 2016 11:30:29 +0100
From: Jan Glauber
To: Will Deacon
CC: Mark Rutland
Subject: Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
Message-ID: <20161111103029.GD16907@hardcore>
References: <73173d6ad2430eead5e9da40564a90a60961b6d9.1477741719.git.jglauber@cavium.com> <20161108235010.GC17771@arm.com>
In-Reply-To: <20161108235010.GC17771@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Will,

thanks for the review!

On Tue, Nov 08, 2016 at 11:50:10PM +0000, Will Deacon wrote:
> Hi Jan,
>
> Thanks for posting an updated series. I have a few minor comments, which
> we can hopefully address in time for 4.10.
>
> Also, have you run the perf fuzzer with this series applied?

No, that's new to me. Will try it.
> https://github.com/deater/perf_event_tests
>
> (build the tests and look under the fuzzer/ directory for the tool)
>
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > Provide "uncore" facilities for different non-CPU performance
> > counter units.
> >
> > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > All counters are exported via sysfs in the corresponding events
> > files under the PMU directory so the perf tool can list the event names.
> >
> > There are some points that are special in this implementation:
> >
> > 1) The PMU detection relies on PCI device detection. If a
> >    matching PCI device is found the PMU is created. The code can deal
> >    with multiple units of the same type, e.g. more than one memory
> >    controller.
> >
> > 2) Counters are summarized across different units of the same type
> >    on one NUMA node but not across NUMA nodes.
> >    For instance L2C TAD 0..7 are presented as a single counter
> >    (adding the values from TAD 0 to 7). Although losing the ability
> >    to read a single value the merged values are easier to use.
> >
> > 3) The counters are not CPU related. A random CPU is picked regardless
> >    of the NUMA node. There is a small performance penalty for accessing
> >    counters on a remote node but reading a performance counter is a
> >    slow operation anyway.
> >
> > Signed-off-by: Jan Glauber
> > ---
> >  drivers/perf/Kconfig                |  13 ++
> >  drivers/perf/Makefile               |   1 +
> >  drivers/perf/uncore/Makefile        |   1 +
> >  drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
> >  drivers/perf/uncore/uncore_cavium.h |  71 ++++++++
>
> We already have "uncore" PMUs under drivers/perf, so I'd prefer that we
> renamed this a bit to reflect better what's going on. How about:
>
>   drivers/perf/cavium/
>
> and then
>
>   drivers/perf/cavium/uncore_thunder.[ch]
>
> ?

OK, agreed.
> >  include/linux/cpuhotplug.h          |   1 +
> >  6 files changed, 438 insertions(+)
> >  create mode 100644 drivers/perf/uncore/Makefile
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.c
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.h
> >
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 4d5c5f9..3266c87 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -19,4 +19,17 @@ config XGENE_PMU
> >  	help
> >  	  Say y if you want to use APM X-Gene SoC performance monitors.
> >
> > +config UNCORE_PMU
> > +	bool
>
> This isn't needed.

I thought about a symbol for all uncore PMUs. But since drivers/perf
already covers that it can be removed.

> > +
> > +config UNCORE_PMU_CAVIUM
> > +	depends on PERF_EVENTS && NUMA && ARM64
> > +	bool "Cavium uncore PMU support"
>
> Please mention Thunder somewhere, since that's the SoC being supported.

OK.

> > +	select UNCORE_PMU
> > +	default y
> > +	help
> > +	  Say y if you want to access performance counters of subsystems
> > +	  on a Cavium SOC like cache controller, memory controller or
> > +	  processor interconnect.
> > +
> >  endmenu
> > diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> > index b116e98..d6c02c9 100644
> > --- a/drivers/perf/Makefile
> > +++ b/drivers/perf/Makefile
> > @@ -1,2 +1,3 @@
> >  obj-$(CONFIG_ARM_PMU) += arm_pmu.o
> >  obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> > +obj-y += uncore/
> > diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
> > new file mode 100644
> > index 0000000..6130e18
> > --- /dev/null
> > +++ b/drivers/perf/uncore/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > @@ -0,0 +1,351 @@
> > +/*
> > + * Cavium Thunder uncore PMU support.
> > + *
> > + * Copyright (C) 2015,2016 Cavium Inc.
> > + * Author: Jan Glauber
> > + */
> > +
> > +#include
> > +#include
> > +#include
> > +
> > +#include "uncore_cavium.h"
> > +
> > +/*
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> > + * Some counters are selected via a control register (L2C TAD) and read by
> > + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> > + * one dedicated counter per event.
> > + * Some counters are not stoppable (L2C CBC & LMC).
> > + * Some counters are read-only (LMC).
> > + * All counters belong to PCI devices, the devices may have additional
> > + * drivers but we assume we are the only user of the counter registers.
> > + * We map the whole PCI BAR so we must be careful to forbid access to
> > + * addresses that contain neither counters nor counter control registers.
> > + */
> > +
> > +void thunder_uncore_read(struct perf_event *event)
> > +{
>
> Rather than have a bunch of global symbols that are called from the
> individual drivers, why don't you pass a struct of function pointers to
> their respective init functions and keep the internals private?

Most of these functions are already in struct pmu. What I can do is set the
shared functions in thunder_uncore_setup as default, and only override them
as needed (like thunder_uncore_read_ocx_tlk) or the other way around (use
default if not set already). Then I can get rid of them.
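For illustration, a minimal userspace sketch of that default/override idea. The names here (`uncore_ops`, `apply_default_ops`, `ocx_tlk_read`) are made up for the example and are not the actual driver symbols:

```c
#include <assert.h>
#include <stddef.h>

struct perf_event;	/* opaque stand-in for the kernel type */

/* Per-driver operations; shared implementations act as defaults. */
struct uncore_ops {
	void (*read)(struct perf_event *event);
	void (*start)(struct perf_event *event, int flags);
};

/* Shared ("default") implementations, normally static in the core file. */
static void default_read(struct perf_event *event) { (void)event; }
static void default_start(struct perf_event *event, int flags)
{
	(void)event; (void)flags;
}

/* Hypothetical device-specific override, like thunder_uncore_read_ocx_tlk. */
static void ocx_tlk_read(struct perf_event *event) { (void)event; }

/*
 * Called from the setup path: fill in a default only where the
 * device-specific driver did not install its own handler.
 */
static void apply_default_ops(struct uncore_ops *ops)
{
	if (!ops->read)
		ops->read = default_read;
	if (!ops->start)
		ops->start = default_start;
}
```

This keeps the shared implementations private to the core file while still letting a driver such as OCX TLK swap in its own read callback.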
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 prev, delta, new = 0;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +
> > +	/* read counter values from all units on the node */
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +
> > +	prev = local64_read(&hwc->prev_count);
> > +	local64_set(&hwc->prev_count, new);
> > +	delta = new - prev;
> > +	local64_add(delta, &event->count);
> > +}
> > +
> > +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> > +		       u64 event_base)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int id;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +	id = get_id(hwc->config);
> > +
> > +	if (!cmpxchg(&node->events[id], NULL, event))
> > +		hwc->idx = id;
>
> Does this need to be a full-fat cmpxchg? Who are you racing with?

Just copy'n'paste from the existing drivers. I guess it can be relaxed.

> > +
> > +	if (hwc->idx == -1)
> > +		return -EBUSY;
>
> This would be much clearer as an else statement after the cmpxchg.

Agreed.

> > +
> > +	hwc->config_base = config_base;
> > +	hwc->event_base = event_base;
> > +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> > +
> > +	if (flags & PERF_EF_START)
> > +		uncore->pmu.start(event, PERF_EF_RELOAD);
> > +
> > +	return 0;
> > +}
> > +
> > +void thunder_uncore_del(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int i;
> > +
> > +	event->pmu->stop(event, PERF_EF_UPDATE);
> > +
> > +	/*
> > +	 * For programmable counters we need to check where we installed it.
> > +	 * To keep this function generic always test the more complicated
> > +	 * case (free running counters won't need the loop).
> > +	 */
> > +	node = get_node(hwc->config, uncore);
> > +	for (i = 0; i < node->num_counters; i++) {
> > +		if (cmpxchg(&node->events[i], event, NULL) == event)
> > +			break;
> > +	}
> > +	hwc->idx = -1;
> > +}
> > +
> > +void thunder_uncore_start(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 new = 0;
> > +
> > +	/* read counter values from all units on the node */
> > +	node = get_node(hwc->config, uncore);
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +	local64_set(&hwc->prev_count, new);
> > +
> > +	hwc->state = 0;
> > +	perf_event_update_userpage(event);
> > +}
> > +
> > +void thunder_uncore_stop(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	hwc->state |= PERF_HES_STOPPED;
> > +
> > +	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
> > +		thunder_uncore_read(event);
> > +		hwc->state |= PERF_HES_UPTODATE;
> > +	}
> > +}
> > +
> > +int thunder_uncore_event_init(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore *uncore;
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/* we do not support sampling */
> > +	if (is_sampling_event(event))
> > +		return -EINVAL;
> > +
> > +	/* counters do not have these bits */
> > +	if (event->attr.exclude_user	||
> > +	    event->attr.exclude_kernel	||
> > +	    event->attr.exclude_host	||
> > +	    event->attr.exclude_guest	||
> > +	    event->attr.exclude_hv	||
> > +	    event->attr.exclude_idle)
> > +		return -EINVAL;
> > +
> > +	uncore = to_uncore(event->pmu);
> > +	if (!uncore)
> > +		return -ENODEV;
> > +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> > +		return -EINVAL;
> > +
> > +	/* check NUMA node */
> > +	node = get_node(event->attr.config, uncore);
> > +	if (!node) {
> > +		pr_debug("Invalid NUMA node selected\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	hwc->config = event->attr.config;
> > +	hwc->idx = -1;
> > +	return 0;
> > +}
> > +
> > +static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
> > +						struct device_attribute *attr,
> > +						char *buf)
> > +{
> > +	struct pmu *pmu = dev_get_drvdata(dev);
> > +	struct thunder_uncore *uncore =
> > +		container_of(pmu, struct thunder_uncore, pmu);
> > +
> > +	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
> > +}
> > +static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
> > +
> > +static struct attribute *thunder_uncore_attrs[] = {
> > +	&dev_attr_cpumask.attr,
> > +	NULL,
> > +};
> > +
> > +struct attribute_group thunder_uncore_attr_group = {
> > +	.attrs = thunder_uncore_attrs,
> > +};
> > +
> > +ssize_t thunder_events_sysfs_show(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  char *page)
> > +{
> > +	struct perf_pmu_events_attr *pmu_attr =
> > +		container_of(attr, struct perf_pmu_events_attr, attr);
> > +
> > +	if (pmu_attr->event_str)
> > +		return sprintf(page, "%s", pmu_attr->event_str);
> > +
> > +	return 0;
> > +}
> > +
> > +/* node attribute depending on number of NUMA nodes */
> > +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> > +			 char *page)
> > +{
> > +	if (NODES_SHIFT)
> > +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
>
> If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
> userspace.

So should I use "config:16" in that case? Is it OK to use this also for
NODES_SHIFT=0?
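Something like this, perhaps (userspace sketch of the format-string logic only; `node_format` is a made-up model, not the driver function, and takes NODES_SHIFT as a parameter instead of the kernel macro; treating both 0 and 1 as the single-bit case is the assumption being asked about):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of node_show(): emit a bit-range spec for the node field,
 * which starts at config bit 16. For NODES_SHIFT of 0 or 1 the field
 * is a single bit, so print "config:16" rather than "config:16-16".
 */
static int node_format(char *page, size_t len, int nodes_shift)
{
	if (nodes_shift > 1)
		return snprintf(page, len, "config:16-%d\n",
				16 + nodes_shift - 1);
	return snprintf(page, len, "config:16\n");
}
```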
> > +	else
> > +		return sprintf(page, "config:16\n");
> > +}
> > +
> > +struct device_attribute format_attr_node = __ATTR_RO(node);
> > +
> > +/*
> > + * Thunder uncore events are independent from CPUs. Provide a cpumask
> > + * nevertheless to prevent perf from adding the event per-cpu and just
> > + * set the mask to one online CPU. Use the same cpumask for all uncore
> > + * devices.
> > + *
> > + * There is a performance penalty for accessing a device from a CPU on
> > + * another socket, but we do not care (yet).
> > + */
> > +static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
> > +{
> > +	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);
>
> Why _safe?

Not required, will remove.

> > +	int new_cpu;
> > +
> > +	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
> > +		return 0;
> > +	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
> > +	if (new_cpu >= nr_cpu_ids)
> > +		return 0;
> > +	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
> > +	cpumask_set_cpu(new_cpu, &uncore->active_mask);
> > +	return 0;
> > +}
> > +
> > +static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
> > +						      int node_id, int counters)
> > +{
> > +	struct thunder_uncore_node *node;
> > +
> > +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> > +	if (!node)
> > +		return NULL;
> > +	node->num_counters = counters;
> > +	INIT_LIST_HEAD(&node->unit_list);
> > +	return node;
> > +}
> > +
> > +int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
> > +				struct pmu *pmu, int counters)
> > +{
> > +	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
> > +	struct thunder_uncore_unit *unit, *tmp;
> > +	struct thunder_uncore_node *node;
> > +	struct pci_dev *pdev = NULL;
> > +	int ret, node_id, found = 0;
> > +
> > +	/* detect PCI devices */
> > +	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {
>
> Redundant brackets?
OK

> > +		if (!pdev)
> > +			break;
>
> Redundant check?

OK

> > +		node_id = dev_to_node(&pdev->dev);
> > +
> > +		/* allocate node if necessary */
> > +		if (!uncore->nodes[node_id])
> > +			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
> > +
> > +		node = uncore->nodes[node_id];
> > +		if (!node) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
> > +		if (!unit) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit->pdev = pdev;
> > +		unit->map = ioremap(pci_resource_start(pdev, 0),
> > +				    pci_resource_len(pdev, 0));
> > +		list_add(&unit->entry, &node->unit_list);
> > +		node->nr_units++;
> > +		found++;
> > +	}
> > +
> > +	if (!found)
> > +		return -ENODEV;
> > +
> > +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> > +					 &uncore->node);
> > +
> > +	/*
> > +	 * perf PMU is CPU dependent in difference to our uncore devices.
> > +	 * Just pick a CPU and migrate away if it goes offline.
> > +	 */
> > +	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
> > +
> > +	uncore->pmu = *pmu;
> > +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> > +	if (ret)
> > +		goto fail;
> > +
> > +	return 0;
> > +
> > +fail:
> > +	node_id = 0;
> > +	while (uncore->nodes[node_id]) {
> > +		node = uncore->nodes[node_id];
> > +
> > +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
>
> Why do you need the _safe variant?

OK, not needed

> Will