From: Vineet Gupta
To: Eugeniy Paltsev, "linux-arm-kernel@lists.infradead.org", "linux-snps-arc@lists.infradead.org"
CC: "linux-kernel@vger.kernel.org", "peterz@infradead.org", "jolsa@redhat.com", "acme@kernel.org", Alexey Brodkin, "namhyung@kernel.org", "mark.rutland@arm.com", "will.deacon@arm.com", "mingo@redhat.com", "alexander.shishkin@linux.intel.com"
Subject: Re: 'branches' perf event mapping differs on ARC and ARM
Date: Tue, 27 Nov 2018 17:06:10 +0000
References: <1543329386.13651.13.camel@synopsys.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/27/18 6:36 AM, Eugeniy Paltsev wrote:
> Hi,
>
> While playing with perf tool on ARMv7 and ARCv2 processors and profiling the
> same application I got interesting results. Even if we got pretty similar
> total execution time and instructions number the number of branches on ARC is
> about three times more than on ARM.
>
> I dug into architecture specific perf sources and found that we map different
> HW counters into generic 'branches' event on ARC and ARM.
> - We use "ijmp" event on ARC which counts all jump and branch instructions
>   (regardless of real execution flow - even if no real jump happens)

That doesn't seem correct IMO. A NOT taken conditional branch doesn't change
control flow, so semantically it doesn't qualify as a branch.
On ARC, the generic branches event should be mapped to the "actually taken
branches" condition, i.e. ijmptak

> - We use "pc_write_retired" event on ARM which counts only taken branches
>   (Instruction architecturally executed, condition check pass - software
>   change of the PC)

That seems correct.

> I guess counting all jump and branch instructions is correct because we use
> 'branches' event value to calculate relative value of 'branch-misses' using
> following formula:
> ----------------------------8----------------------------
> branch-misses-ratio = 'branch-misses' / 'branches' * 100.0
> ----------------------------8----------------------------
> And using only taken branches here is incorrect IMHO.

Why? branch-misses is a CPU specific micro-arch state where a changed control
flow was NOT predicted. If an implementation mispredicts NOT taken branches,
those should actually get counted and be fed to hardware folks to improve the
micro-arch.

> So I guess we should map 'br_immed_retired' instead of "pc_write_retired"
> into generic 'branches' event on ARM.
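To make the disagreement over the denominator concrete, here is a minimal
sketch of the ratio formula quoted above, evaluated once against all
jump/branch instructions and once against taken branches only. The counter
values are made up purely for illustration (they are not measurements from
either platform), and the function and variable names are hypothetical.

```python
def branch_miss_ratio(branch_misses: int, branches: int) -> float:
    """Return 'branch-misses' as a percentage of 'branches'."""
    if branches == 0:
        # Avoid dividing by zero when the counter did not tick at all.
        return 0.0
    return branch_misses / branches * 100.0


# Hypothetical counter values, chosen only to illustrate the effect.
all_branches = 3000    # every jump/branch instruction, taken or not
taken_branches = 1000  # only branches that actually changed the PC
misses = 150           # mispredicted branches

ratio_all = branch_miss_ratio(misses, all_branches)
ratio_taken = branch_miss_ratio(misses, taken_branches)

print(f"misses / all branches:   {ratio_all:.1f}%")
print(f"misses / taken branches: {ratio_taken:.1f}%")
```

With the same miss count, the smaller taken-only denominator yields a higher
percentage, so ratios from two architectures are only comparable when both
map 'branches' to the same kind of counter.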