Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751291AbdIEM0m (ORCPT ); Tue, 5 Sep 2017 08:26:42 -0400 Received: from mail.kdab.com ([176.9.126.58]:39838 "EHLO mail.kdab.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750808AbdIEM0k (ORCPT ); Tue, 5 Sep 2017 08:26:40 -0400 From: Milian Wolff To: Andi Kleen Cc: linux-perf-users@vger.kernel.org, Arnaldo Carvalho de Melo , linux-kernel@vger.kernel.org, peterz@infradead.org Subject: Re: broken cycle counts from perf record in frequency mode [Was: Re: deducing CPU clock rate over time from cycle samples] Date: Tue, 05 Sep 2017 14:26:38 +0200 Message-ID: <2027973.spD7XaVQdP@milian-kdab2> Organization: KDAB In-Reply-To: <20170905034058.GV2482@two.firstfloor.org> References: <2900948.aeNJFYEL58@agathebauer> <1788258.9pJ1GNok8V@milian-kdab2> <20170905034058.GV2482@two.firstfloor.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart5715753.GKeFb1RmlZ"; micalg="sha256"; protocol="application/pkcs7-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8878 Lines: 144 --nextPart5715753.GKeFb1RmlZ Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" On Tuesday, September 5, 2017 5:40:58 AM CEST Andi Kleen wrote: > > The cycle value gets associated with a sample via it's period value, which > > is used by `perf report` in the analysis. If I get a single "broken" > > sample with > > I always thought it just used the number of samples? No, that is not the case. It uses the cycle period as "weight" by default. Note also the recent patch set for `perf annotate` by Taeung Song which adds the functionality to switch between sample or period fractions. I'm actually not sure whether that exists for `perf report` yet. In some situations where the cycle period values associated with the samples are correct, I actually also saw how this is useful: I have seen perf data files, where tons of samples got recorded around syscall entry/exit, but most samples only had tiny cycle values associated with them. If one only looks at number of samples, then it would look like the syscalls are expensive. While in reality, way more cycles are executed in userspace. This issue was/is apparent when looking at `perf report` values vs. the FlameGraph visualization created by the normal `stackcollapse-perf.pl` script. The latter does only look at the number of samples, the former takes the sample period value. Hotspot also used the former approach, but then I adapted perf's behavior. > > a cycle count of, say 1E14 and then a million other samples, each with > > "sane" cycle counts of let's say 1E5, then the one broken sample will > > hold 50% of the total amount of measured cycles. So perf report will show > > that the function where the broken sample points to will have a cost of > > 50%. > > I don't think I've seen such a situation. Did you? I have seen this situation. This is what this current revival of this thread is all about. Without such issues, I wouldn't claim it's such a serious issue. > BTW I'm not arguing against fixing it, but typically I just > recommend to avoid frequency mode. The fast sampling at the beginning > has caused a range of low level PMU bugs and it is hard to reason about > because of its complex behavior. Also it has no protection against > synchronizing with repeated patterns in the execution, which > can cause bad shadowing effects. If you use the Intel > event aliases they have all sensible periods set by default. I think we both agree that the frequency mode as-is should not be the default. But it is, and this is a serious issue in my opinion. We will need to find a sensible default for the event period and use that mode by default. I nowadays always add something like `-c 3000000` which gives me roughly 1k samples per second on a ~3GHz machine. It's just a ballpark figure and of course gets influenced by frequency scaling, but it's good enough for me. We could use a similar approach to find a period based on the max CPU clock rate automatically. But of course that would only work for cycles, and not for instructions or any of the other fancy event counters. But since the frequency mode is probably similarly broken there, it should not be the default. Better ask the user for explicit values rather than doing something automatically which can lead to broken results. Cheers -- Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer KDAB (Deutschland) GmbH&Co KG, a KDAB Group company Tel: +49-30-521325470 KDAB - The Qt Experts --nextPart5715753.GKeFb1RmlZ Content-Type: application/pkcs7-signature; name="smime.p7s" Content-Disposition: attachment; filename="smime.p7s" Content-Transfer-Encoding: base64 MIAGCSqGSIb3DQEHAqCAMIACAQExDzANBglghkgBZQMEAgEFADCABgkqhkiG9w0BBwEAAKCCDEIw ggXmMIIDzqADAgECAhBqm+E4O/8ra58B1dm4p1JWMA0GCSqGSIb3DQEBDAUAMIGFMQswCQYDVQQG EwJHQjEbMBkGA1UECBMSR3JlYXRlciBNYW5jaGVzdGVyMRAwDgYDVQQHEwdTYWxmb3JkMRowGAYD VQQKExFDT01PRE8gQ0EgTGltaXRlZDErMCkGA1UEAxMiQ09NT0RPIFJTQSBDZXJ0aWZpY2F0aW9u IEF1dGhvcml0eTAeFw0xMzAxMTAwMDAwMDBaFw0yODAxMDkyMzU5NTlaMIGXMQswCQYDVQQGEwJH QjEbMBkGA1UECBMSR3JlYXRlciBNYW5jaGVzdGVyMRAwDgYDVQQHEwdTYWxmb3JkMRowGAYDVQQK ExFDT01PRE8gQ0EgTGltaXRlZDE9MDsGA1UEAxM0Q09NT0RPIFJTQSBDbGllbnQgQXV0aGVudGlj YXRpb24gYW5kIFNlY3VyZSBFbWFpbCBDQTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB AL6znlesKHZ1QBbHOAOY08YYdiFQ8yV5C0y1oNF9Olg+nKcxLqf2NHbZhGra0D00SOTq9bus3/mx gUsg/Wh/eXQ0pnp8tZ8XZWAnlyKMpjL+qUByRjXCA6RQyDMqVaVUkbIr5SU0RDX/kSsKwer3H1pT /HUrBN0X8sKtPTdGX8XAWt/VdMLBrZBlgvnkCos+KQWWCo63OTTqRvaq8aWccm+KOMjTcE6s2mj6 RkalweyDI7X+7U5lNo6jzC8RTXtVV4/Vwdax720YpMPJQaDaElmOupyTf1Qib+cpukNJnQmwygjD 8m046DQkLnpXNCAGjuJy1F5NATksUsbfJAr7FLUCAwEAAaOCATwwggE4MB8GA1UdIwQYMBaAFLuv fgI9+qbxPISOre44mOzZMjLUMB0GA1UdDgQWBBSCr2yM+MX+lmF86B89K3FIXsSLwDAOBgNVHQ8B Af8EBAMCAYYwEgYDVR0TAQH/BAgwBgEB/wIBADARBgNVHSAECjAIMAYGBFUdIAAwTAYDVR0fBEUw QzBBoD+gPYY7aHR0cDovL2NybC5jb21vZG9jYS5jb20vQ09NT0RPUlNBQ2VydGlmaWNhdGlvbkF1 dGhvcml0eS5jcmwwcQYIKwYBBQUHAQEEZTBjMDsGCCsGAQUFBzAChi9odHRwOi8vY3J0LmNvbW9k b2NhLmNvbS9DT01PRE9SU0FBZGRUcnVzdENBLmNydDAkBggrBgEFBQcwAYYYaHR0cDovL29jc3Au Y29tb2RvY2EuY29tMA0GCSqGSIb3DQEBDAUAA4ICAQB4XLKBKDRPPO5fVs6fl1bsj6JrF/bz9kkI BtTYLzXN30D+03Hj6OxCDBEaIeNmsBhrJmuubvyE7HtoSmR809AgcYboW+rcTNZ/8u/Hv+GTrNI/ AhqX2/kiQNxmgUPt/eJPs92Qclj0HnVyy9TnSvGkSDU7I5Px+TbO+88G4zipA2psZaWeEykgzClZ lPz1FjTCkk77ZXp5cQYYexE6zeeN4/0OqqoAloFrjAF4o50YJafX8mnahjp3I2Y2mkjhk0xQfhNq bzlLWPoT3m7j7U26u7zg6swjOq8hITYc3/np5tM5aVyu6t99p17bTbY7+1RTWBviN9YJzK8HxzOb XYWBf/L+VGOYNsQDTxAk0Hbvb1j6KjUhg7fO294F29QIhhmiNOr84JHoy+fNLpfvYc/Q9EtFOI5I SYgOxLk3nD/whbUe9rmEQXLp8MB933Ij474gwwCPUpwv9mj2PMnXoc7mbrS22XUSeTwxCTP9bcmU dp4jmIoWfhQm7X9w/Zgddg+JZ/YnIHOwsGsaTUgj7fIvxqith7DoJC91WJ8Lce3CVJqb1XWeKIJ8 4F7YLXZN0oa7TktYgDdmQVxYkZo1c5noaDKH9Oq9cbm/vOYRUM1cWcef20Wkyk5S/GFyyPJwG0fR 1nRas3DqAf4cXxMiEKcff7PNa4M3RGTqH0pWR8p6EjCCBlQwggU8oAMCAQICEAf6KCF9+1doL2oE OTPysLwwDQYJKoZIhvcNAQELBQAwgZcxCzAJBgNVBAYTAkdCMRswGQYDVQQIExJHcmVhdGVyIE1h bmNoZXN0ZXIxEDAOBgNVBAcTB1NhbGZvcmQxGjAYBgNVBAoTEUNPTU9ETyBDQSBMaW1pdGVkMT0w OwYDVQQDEzRDT01PRE8gUlNBIENsaWVudCBBdXRoZW50aWNhdGlvbiBhbmQgU2VjdXJlIEVtYWls IENBMB4XDTE3MDUyMzAwMDAwMFoXDTIwMDUyMjIzNTk1OVowggFZMQswCQYDVQQGEwJTRTEPMA0G A1UEERMGNjgzIDMxMRIwEAYDVQQIEwlWYWVybWxhbmQxEDAOBgNVBAcTB0hhZ2ZvcnMxGDAWBgNV BAkTD05vcnJpbmdzIHZhZWcgMjEPMA0GA1UEEhMGQm94IDMwMSYwJAYDVQQKDB1LbGFyw6RsdmRh bGVucyBEYXRha29uc3VsdCBBQjEdMBsGA1UECxMUQSBLREFCIEdyb3VwIENvbXBhbnkxQzBBBgNV BAsMOklzc3VlZCB0aHJvdWdoIEtsYXLDpGx2ZGFsZW5zIERhdGFrb25zdWx0IEFCIEUtUEtJIE1h bmFnZXIxHzAdBgNVBAsTFkNvcnBvcmF0ZSBTZWN1cmUgRW1haWwxFTATBgNVBAMTDE1pbGlhbiBX b2xmZjEkMCIGCSqGSIb3DQEJARYVbWlsaWFuLndvbGZmQGtkYWIuY29tMIIBIjANBgkqhkiG9w0B AQEFAAOCAQ8AMIIBCgKCAQEAxrzfNBVvRbiAknuTBXuQnNm9sLIFLo0vbPB6kswk78A3tA++Zn5c lQUHhGlQq1cdYxagnUpqwvG3Sod15mPSOLkAPf/mabLN7p+lFbRaUP+97ZkTZtvb4BCC3osIEFI4 G393OSFWqc2qmIPE/SwSASbAA20Fcaa2M6P1lhOk/ttUh2jIurTPF0wUycIA7lBddrOgaOA8e2m6 iLTNHtlrfRbBaUX91D5ebY+UWmIjXSQ9+CtutMzBkwnF0rZKririvOkklg9VzEGNQVHrQfDF2s/U pOtmtuVSwElauGT/KALyCFuIrYC1pmaKH8S1xODJqiRaf6jH8E+KQzKjyM/ErwIDAQABo4IB1TCC AdEwHwYDVR0jBBgwFoAUgq9sjPjF/pZhfOgfPStxSF7Ei8AwHQYDVR0OBBYEFN+m99RtIuA1bSdw 6b1brOX7X3AJMA4GA1UdDwEB/wQEAwIFoDAMBgNVHRMBAf8EAjAAMB0GA1UdJQQWMBQGCCsGAQUF BwMEBggrBgEFBQcDAjBGBgNVHSAEPzA9MDsGDCsGAQQBsjEBAgEDBTArMCkGCCsGAQUFBwIBFh1o dHRwczovL3NlY3VyZS5jb21vZG8ubmV0L0NQUzBaBgNVHR8EUzBRME+gTaBLhklodHRwOi8vY3Js LmNvbW9kb2NhLmNvbS9DT01PRE9SU0FDbGllbnRBdXRoZW50aWNhdGlvbmFuZFNlY3VyZUVtYWls Q0EuY3JsMIGLBggrBgEFBQcBAQR/MH0wVQYIKwYBBQUHMAKGSWh0dHA6Ly9jcnQuY29tb2RvY2Eu Y29tL0NPTU9ET1JTQUNsaWVudEF1dGhlbnRpY2F0aW9uYW5kU2VjdXJlRW1haWxDQS5jcnQwJAYI KwYBBQUHMAGGGGh0dHA6Ly9vY3NwLmNvbW9kb2NhLmNvbTAgBgNVHREEGTAXgRVtaWxpYW4ud29s ZmZAa2RhYi5jb20wDQYJKoZIhvcNAQELBQADggEBABf47LSJADqH+ow9INv3QM1NC/qq2bjxGvsZ 68iD11VEUAFlsYfsVTgQqUirwPVTYenXtwVBELHZyywsui1JxL7HKQetLQegDDP/RyfjReVaWxhy 3OpuItsgLVbru9QVgPifnoBFPtfZcwjeJDmeSbLT8oj4Rd0KYBOIve7WKvsfNPsNwfbLwY2zILkE LjxZcVi2AwZHDyab+dzL/3YcLuJj1lSawBGn7ilpcdZydlv4aye51pD/MemLIYLcylt+ImrmjnTV y+QlAHRF3s5FE8yAr+W1MBD/1bKZCSgFt8VQoAlz3hiQh8QqZp4Zl8WuVL4+mP/mT6VDEWgq/0Bo cukxggJuMIICagIBATCBrDCBlzELMAkGA1UEBhMCR0IxGzAZBgNVBAgTEkdyZWF0ZXIgTWFuY2hl c3RlcjEQMA4GA1UEBxMHU2FsZm9yZDEaMBgGA1UEChMRQ09NT0RPIENBIExpbWl0ZWQxPTA7BgNV BAMTNENPTU9ETyBSU0EgQ2xpZW50IEF1dGhlbnRpY2F0aW9uIGFuZCBTZWN1cmUgRW1haWwgQ0EC EAf6KCF9+1doL2oEOTPysLwwDQYJYIZIAWUDBAIBBQCggZMwGAYJKoZIhvcNAQkDMQsGCSqGSIb3 DQEHATAcBgkqhkiG9w0BCQUxDxcNMTcwOTA1MTIyNjM4WjAoBgkqhkiG9w0BCQ8xGzAZMAsGCWCG SAFlAwQBAjAKBggqhkiG9w0DBzAvBgkqhkiG9w0BCQQxIgQgFXPKwW0r3FEVT1BG4O582M2iewid CzW9MA2jBTe+/5IwDQYJKoZIhvcNAQEBBQAEggEAGJ4Jw2X5aJMxR9nKuRGAdgmGaCJdg1SPBZUF v/udwWkk816Zdd/4uqVU/Tg6KFQo4B4ZzggDKwf8875AguqQVHpHHvcXR7WD3yPybIBZKswkwuFa JxeVC6lSzgPx0RozSNXZrOuP3DP4/hsbRuOIQxKo7rpnjU/EFc3a4IDiKZPWpLh0xEnSVTEqX3CP v3dsALbz1g4UntlrFx5Oflhu1j7zkgzJNxvvFPIMzBMGJQFd0pWlxr2fd7z/Ow13FqjPI8EnYB2p B6DyzNI4xrYrYXGwoFtJZzn5h+alqJY1tNDBL1JAVf24w5noSdGWSSiWA/2hHrOi9QrPKQxf3Uk7 ogAAAAAAAA== --nextPart5715753.GKeFb1RmlZ--