Date: Sat, 27 May 2023 19:38:54 +0100
Message-ID: <86h6rxd3gh.wl-maz@kernel.org>
From: Marc Zyngier
To: Ian Rogers
Cc: Oliver Upton, Peter Zijlstra, Ravi Bangoria, Nathan Chancellor,
    namhyung@kernel.org, eranian@google.com, acme@kernel.org,
    mark.rutland@arm.com, jolsa@kernel.org, bp@alien8.de,
    kan.liang@linux.intel.com, adrian.hunter@intel.com, maddy@linux.ibm.com,
    x86@kernel.org, linux-perf-users@vger.kernel.org,
    linux-kernel@vger.kernel.org, sandipan.das@amd.com,
    ananth.narayan@amd.com, santosh.shukla@amd.com, kvmarm@lists.linux.dev
Subject: Re: [PATCH v4 3/4] perf/core: Remove pmu linear searching code
References: <20230504110003.2548-1-ravi.bangoria@amd.com>
    <20230504110003.2548-4-ravi.bangoria@amd.com>
    <20230524214133.GA2359762@dev-arch.thelio-3990X>
    <20230525142031.GU83892@hirez.programming.kicks-ass.net>
    <86jzwtdhmk.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 27 May 2023 18:00:13 +0100,
Ian Rogers wrote:
>
> On Sat, May 27, 2023 at 6:32 AM Marc Zyngier wrote:
> >
> > On Sat, 27 May 2023 00:00:47 +0100,
> > Ian Rogers wrote:
> > >
> > > On Thu, May 25, 2023 at 8:56 AM Oliver Upton wrote:
> > > >
> > > > On Thu, May 25, 2023 at 04:20:31PM +0200, Peter Zijlstra wrote:
> > > > > On Thu, May 25, 2023 at 07:11:41AM +0000, Oliver Upton wrote:
> > > > >
> > > > > > The PMUv3 driver does pass a name, but it relies on getting back an
> > > > > > allocated pmu id as @type is -1 in the call to perf_pmu_register().
> > > > > >
> > > > > > What actually broke is how KVM probes for a default core PMU to use for
> > > > > > a guest. kvm_pmu_probe_armpmu() creates a counter w/ PERF_TYPE_RAW and
> > > > > > reads the pmu from the returned perf_event. The linear search had the
> > > > > > effect of eventually stumbling on the correct core PMU and succeeding.
> > > > > >
> > > > > > Perf folks: is this WAI for heterogenous systems?
> > > > >
> > > > > TBH, I'm not sure. hetero and virt don't mix very well AFAIK and I'm not
> > > > > sure what ARM64 does here.
> > > > >
> > > > > IIRC the only way is to hard affine things; that is, force vCPU of
> > > > > 'type' to the pCPU mask of 'type' CPUs.
> > > >
> > > > We provide absolutely no illusion of consistency across implementations.
> > > > Userspace can select the PMU type, and then it is a userspace problem
> > > > affining vCPUs to the right pCPUs.
> > > >
> > > > And if they get that wrong, we just bail and refuse to run the vCPU.
> > > >
> > > > > If you don't do that; or let userspace 'override' that, things go
> > > > > sideways *real* fast.
> > > >
> > > > Oh yeah, and I wish PMUs were the only problem with these hetero
> > > > systems...
> > >
> > > Just to add some context from what I understand. There are inbuilt
> > > type numbers for PMUs:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/include/uapi/linux/perf_event.h?h=perf-tools-next#n34
> > > so the PMU generally called /sys/devices/cpu should have type 4 (ARM
> > > give it another name). For heterogeneous ARM there is a single PMU and
> > > the same events are programmed regardless of whether it is a big or a
> > > little core - the cpumask lists all CPUs.
> >
> > I think you misunderstood the way heterogeneous arm64 systems are
> > described. Each CPU type gets its own PMU type, and its own event
> > list. Case in point:
> >
> > $ grep . /sys/devices/*pmu/{type,cpus}
> > /sys/devices/apple_avalanche_pmu/type:9
> > /sys/devices/apple_blizzard_pmu/type:8
> > /sys/devices/apple_avalanche_pmu/cpus:4-9
> > /sys/devices/apple_blizzard_pmu/cpus:0-3
> >
> > Type 4 (aka PERF_EVENT_RAW) is AFAICT just a way to encode the raw
> > event number, nothing else.
>
> Which PMU will a raw event open on?

On the PMU that matches the current CPU.

> Note, the raw events don't support
> the extended type that is present in PERF_TYPE_HARDWARE and
> PERF_TYPE_HW_CACHE:
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/tree/include/uapi/linux/perf_event.h#n41
> as the bits are already in use for being just plain config values.

I'm not sure how relevant this is to the numbering of PMUs on arm64.

> I suspect not being type 4 is a bug on apple ARM here.

If that's a bug on this machine, it's a bug on all machines, at which
point it is the de-facto API:

$ grep . /sys/devices/armv8*/{type,cpus}
/sys/devices/armv8_cortex_a53/type:8
/sys/devices/armv8_cortex_a72/type:9
/sys/devices/armv8_cortex_a53/cpus:0-3
/sys/devices/armv8_cortex_a72/cpus:4-5

See, non-Apple HW. And now for a system with homogeneous CPUs:

$ grep . /sys/devices/armv8*/{type,cpus}
/sys/devices/armv8_pmuv3_0/type:8
/sys/devices/armv8_pmuv3_0/cpus:0-159

Still no type 4. I could go on for hours, I have plenty of HW around
me!

So whatever your source of information is, it doesn't match
reality. Our PMUs are numbered arbitrarily, and have been so for... a
very long time. At least since perf_pmu_register has supported dynamic
registration (see 2e80a82a49c4c).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
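
Editorial sketch, not part of the original message: since the PMU type
numbers above are allocated dynamically, a tool that wants the core PMU
for a given CPU would presumably look it up rather than assume type 4.
The Python below is a minimal illustration of that lookup. It assumes
only the sysfs layout visible in the grep output (a "type" and a "cpus"
file per PMU directory); the helper names parse_cpu_list, list_pmus and
pmu_for_cpu are hypothetical, and the demo runs against a temporary
directory that mimics the Cortex-A53/A72 listing instead of a real
/sys/devices.

```python
import os
import tempfile


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def list_pmus(root="/sys/devices"):
    """Return {pmu_name: (type_id, cpus)} for entries exposing type and cpus."""
    pmus = {}
    for name in sorted(os.listdir(root)):
        type_path = os.path.join(root, name, "type")
        cpus_path = os.path.join(root, name, "cpus")
        if not (os.path.isfile(type_path) and os.path.isfile(cpus_path)):
            continue  # not a CPU PMU in the layout assumed here
        with open(type_path) as f:
            type_id = int(f.read())
        with open(cpus_path) as f:
            cpus = parse_cpu_list(f.read())
        pmus[name] = (type_id, cpus)
    return pmus


def pmu_for_cpu(pmus, cpu):
    """Return (name, type_id) of the PMU whose cpus file covers `cpu`."""
    for name, (type_id, cpus) in pmus.items():
        if cpu in cpus:
            return name, type_id
    return None


# Demo on a fake tree mirroring the big.LITTLE listing quoted above.
fake_root = tempfile.mkdtemp()
for pmu_name, type_id, cpu_list in [
    ("armv8_cortex_a53", 8, "0-3"),
    ("armv8_cortex_a72", 9, "4-5"),
]:
    os.makedirs(os.path.join(fake_root, pmu_name))
    with open(os.path.join(fake_root, pmu_name, "type"), "w") as f:
        f.write(f"{type_id}\n")
    with open(os.path.join(fake_root, pmu_name, "cpus"), "w") as f:
        f.write(f"{cpu_list}\n")

pmus = list_pmus(fake_root)
print(pmu_for_cpu(pmus, 4))
```

On a real machine one would call list_pmus() with its default root. The
type value found this way is what would go into perf_event_attr.type
when a specific PMU is wanted, which is the point the message makes
about not relying on a fixed number.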