Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752948AbdHJP5v (ORCPT ); Thu, 10 Aug 2017 11:57:51 -0400 Received: from mga11.intel.com ([192.55.52.93]:39313 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752243AbdHJP5t (ORCPT ); Thu, 10 Aug 2017 11:57:49 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.41,353,1498546800"; d="c'?scan'208";a="1002197714" Subject: Re: [PATCH v6 2/3]: perf/core: use context tstamp_data for skipped events on mux interrupt To: Peter Zijlstra Cc: Ingo Molnar , Arnaldo Carvalho de Melo , Alexander Shishkin , Andi Kleen , Kan Liang , Dmitri Prokhorov , Valery Cherepennikov , Mark Rutland , Stephane Eranian , David Carrillo-Cisneros , linux-kernel References: <96c7776f-1f17-a39e-23e9-658596216d6b@linux.intel.com> <20170803150052.za2vofyqfgarukdr@hirez.programming.kicks-ass.net> From: Alexey Budankov Organization: Intel Corp. Message-ID: Date: Thu, 10 Aug 2017 18:57:43 +0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170803150052.za2vofyqfgarukdr@hirez.programming.kicks-ass.net> Content-Type: multipart/mixed; boundary="------------6207A269E8E2EBBB4538F4FD" Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11036 Lines: 270 This is a multi-part message in MIME format. --------------6207A269E8E2EBBB4538F4FD Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Well, Ok. I re-implemented this patch of patch set and also implemented a unit test that IMHO mimics the case mentioned above to check if it solves the issue. The test is a single thread application that creates 272 fds for every of two INSTRUCTIONS_RETIRED events covering 272 cores of Intel Xeon Phi. The test simultaneously executes 1 million of instructions several times when the events are enabled and reads the events' counts, TOTAL_ENABLED and TOTAL_RUNNING over read() system call. The first event is allocated, enabled and disabled at the beginning and the end of the test execution phase where as the second event allocation and measurements are included into the measurement interval of the first event. Below is what I am getting now when running the test on the patched kernel: EID CPU COUNT T_ENABLED T_RUNNING SCALE --- 0 enabled --- --- million instructions --- 0 26 1334 1666671 137366 8.24 0 94 138637 1735872 527098 30.37 0 162 874166 1823695 1083785 59.43 --- million instructions --- 0 26 1334 3328832 137366 4.13 0 94 1164146 3318599 2109825 63.58 0 162 874166 3390716 1083785 31.96 --- million instructions --- 0 26 1334 4835955 137366 2.84 0 94 2189671 4820750 3611976 74.93 0 162 874166 4934513 1083785 21.96 --- 1 enabled --- --- million instructions --- 0 26 1334 22661310 137366 0.61 0 94 3248450 22667165 21458391 94.67 0 162 874166 22742990 1083785 4.77 1 94 1033387 2150307 2150307 100.00 --- million instructions --- 0 26 1334 24878504 137366 0.55 0 94 4289784 24869644 23660870 95.14 0 162 874166 24943564 1083785 4.34 1 94 2074675 4370708 4370708 100.00 --- 1 disabled --- --- million instructions --- 0 26 1334 27681611 137366 0.50 0 94 5337752 27675968 26467194 95.63 0 162 874166 27749528 1083785 3.91 1 94 2089278 5024218 5024218 100.00 --- 0 disabled --- --- million instructions --- 0 26 1334 29004868 137366 0.47 0 94 5372734 28964075 27755301 95.83 0 162 874166 29005751 1083785 3.74 1 94 2089278 5024218 5024218 100.00 The output demonstrates that test thread migrated two time during execution thus several fds were employed for measuring amount of executed instructions. Also the output shows that T_RUNNING values of events updated and maintained so that sums of SCALE values for every event are near 100% (no multiplexing) after every million instructions execution. Unit test code is attached for convenience. The key thing in the patch is explicit updating of tstamp fields for INACTIVE events in update_event_times(). Signed-off-by: Alexey Budankov --- kernel/events/core.c | 52 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 31 insertions(+), 21 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 0a4f619..d195fdc 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1391,6 +1391,27 @@ static u64 perf_event_time(struct perf_event *event) return ctx ? ctx->time : 0; } +void perf_event_tstamp_init(struct perf_event *event) +{ + u64 tstamp = perf_event_time(event); + + event->tstamp_enabled = tstamp; + event->tstamp_running = tstamp; + event->tstamp_stopped = tstamp; +} + +void perf_event_tstamp_update(struct perf_event *event) +{ + u64 tstamp, delta; + + tstamp = perf_event_time(event); + + delta = tstamp - event->tstamp_stopped; + + event->tstamp_running += delta; + event->tstamp_stopped = tstamp; +} + /* * Update the total_time_enabled and total_time_running fields for a event. */ @@ -1405,6 +1426,9 @@ static void update_event_times(struct perf_event *event) event->group_leader->state < PERF_EVENT_STATE_INACTIVE) return; + if (event->state == PERF_EVENT_STATE_INACTIVE) + perf_event_tstamp_update(event); + /* * in cgroup mode, time_enabled represents * the time the event was enabled AND active @@ -1430,7 +1454,6 @@ static void update_event_times(struct perf_event *event) run_end = perf_event_time(event); event->total_time_running = run_end - event->tstamp_running; - } /* @@ -1954,9 +1977,6 @@ event_sched_out(struct perf_event *event, struct perf_cpu_context *cpuctx, struct perf_event_context *ctx) { - u64 tstamp = perf_event_time(event); - u64 delta; - WARN_ON_ONCE(event->ctx != ctx); lockdep_assert_held(&ctx->lock); @@ -1967,18 +1987,15 @@ event_sched_out(struct perf_event *event, * via read() for time_enabled, time_running: */ if (event->state == PERF_EVENT_STATE_INACTIVE && - !event_filter_match(event)) { - delta = tstamp - event->tstamp_stopped; - event->tstamp_running += delta; - event->tstamp_stopped = tstamp; - } + !event_filter_match(event)) + perf_event_tstamp_update(event); if (event->state != PERF_EVENT_STATE_ACTIVE) return; perf_pmu_disable(event->pmu); - event->tstamp_stopped = tstamp; + event->tstamp_stopped = perf_event_time(event); event->pmu->del(event, 0); event->oncpu = -1; event->state = PERF_EVENT_STATE_INACTIVE; @@ -2294,7 +2311,6 @@ group_sched_in(struct perf_event *group_event, { struct perf_event *event, *partial_group = NULL; struct pmu *pmu = ctx->pmu; - u64 now = ctx->time; bool simulate = false; if (group_event->state == PERF_EVENT_STATE_OFF) @@ -2340,12 +2356,10 @@ group_sched_in(struct perf_event *group_event, if (event == partial_group) simulate = true; - if (simulate) { - event->tstamp_running += now - event->tstamp_stopped; - event->tstamp_stopped = now; - } else { + if (simulate) + perf_event_tstamp_update(event); + else event_sched_out(event, cpuctx, ctx); - } } event_sched_out(group_event, cpuctx, ctx); @@ -2390,13 +2404,9 @@ static int group_can_go_on(struct perf_event *event, static void add_event_to_ctx(struct perf_event *event, struct perf_event_context *ctx) { - u64 tstamp = perf_event_time(event); - list_add_event(event, ctx); perf_group_attach(event); - event->tstamp_enabled = tstamp; - event->tstamp_running = tstamp; - event->tstamp_stopped = tstamp; + perf_event_tstamp_init(event); } static void ctx_sched_out(struct perf_event_context *ctx, --------------6207A269E8E2EBBB4538F4FD Content-Type: text/plain; charset=UTF-8; name="check_multiplexing_read.c" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="check_multiplexing_read.c" LyogY2hlY2tfbXVsdGlwbGV4aW5nX3JlYWQuYwkJCQkJCSovCgojaW5jbHVkZSA8c3RkaW8u aD4KI2luY2x1ZGUgPHN0ZGxpYi5oPgojaW5jbHVkZSA8c3RyaW5nLmg+CgojaW5jbHVkZSA8 dW5pc3RkLmg+CiNpbmNsdWRlIDxlcnJuby5oPgoKI2luY2x1ZGUgPHN5cy9tbWFuLmg+Cgoj aW5jbHVkZSA8c3lzL2lvY3RsLmg+CiNpbmNsdWRlIDxhc20vdW5pc3RkLmg+CgojaW5jbHVk ZSAicGVyZl9ldmVudC5oIgojaW5jbHVkZSAidGVzdF91dGlscy5oIgojaW5jbHVkZSAicGVy Zl9oZWxwZXJzLmgiCiNpbmNsdWRlICJpbnN0cnVjdGlvbnNfdGVzdGNvZGUuaCIKCiNkZWZp bmUgTlVNX0NQVVMgMjcyCiNkZWZpbmUgTlVNX0VWRU5UUyAyCgppbnQgZmRbTlVNX0VWRU5U U11bTlVNX0NQVVNdOwpsb25nIGxvbmcgZXZlbnRzW05VTV9FVkVOVFNdPXsKCVBFUkZfQ09V TlRfSFdfQ1BVX0NZQ0xFUywKCVBFUkZfQ09VTlRfSFdfSU5TVFJVQ1RJT05TCn07CgpzdGF0 aWMgbG9uZyBsb25nIGJhc2VfcmVzdWx0c1tOVU1fQ1BVU11bM107CgojZGVmaW5lIFRJTUVf RU5BQkxFRAkxCiNkZWZpbmUgVElNRV9SVU5OSU5HCTIKCnN0YXRpYyBpbnQgdGVzdF9yb3V0 aW5lKHZvaWQpIHsKCglpbnQgaSxyZXN1bHQ7CgoJcHJpbnRmKCItLS0gbWlsbGlvbiBpbnN0 cnVjdGlvbnMgLS0tXG4iKTsKCglmb3IoaT0wO2k8MTtpKyspIHsKCQlyZXN1bHQ9aW5zdHJ1 Y3Rpb25zX21pbGxpb24oKTsKCX0KCglyZXR1cm4gcmVzdWx0Owp9Cgp2b2lkIGFsbG9jX2V2 ZW50cyhsb25nIGxvbmcgY29uZmlnLCBpbnQgZmRbTlVNX0NQVVNdKQp7CglpbnQgcmV0LGks ajsKCglzdHJ1Y3QgcGVyZl9ldmVudF9hdHRyIHBlOwoKCWZvcihqPTA7ajxOVU1fQ1BVUztq KyspIHsKCQltZW1zZXQoJnBlLDAsc2l6ZW9mKHN0cnVjdCBwZXJmX2V2ZW50X2F0dHIpKTsK CQlwZS50eXBlPVBFUkZfVFlQRV9IQVJEV0FSRTsKCQlwZS5zaXplPXNpemVvZihzdHJ1Y3Qg cGVyZl9ldmVudF9hdHRyKTsKCQlwZS5jb25maWc9Y29uZmlnOwoJCXBlLmRpc2FibGVkPTE7 CgkJcGUuZXhjbHVkZV9rZXJuZWw9MTsKCQlwZS5leGNsdWRlX2h2PTE7CgkJcGUucmVhZF9m b3JtYXQ9UEVSRl9GT1JNQVRfVE9UQUxfVElNRV9FTkFCTEVEIHwKCQkJCVBFUkZfRk9STUFU X1RPVEFMX1RJTUVfUlVOTklORzsKCgkJZmRbal09cGVyZl9ldmVudF9vcGVuKCZwZSwwLGos LTEsMCk7CgkJaWYgKGZkW2pdPDApIHsKCQkJZnByaW50ZihzdGRlcnIsIkZhaWxlZCBhZGRp bmcgbXB4IGV2ZW50IDAgJXNcbiIsCgkJCQlzdHJlcnJvcihlcnJubykpOwoJCQlyZXR1cm4g LTE7CgkJfQoJfQp9Cgp2b2lkIGZyZWVfZXZlbnRzKGludCBmZFtOVU1fQ1BVU10pCnsKCWlu dCBqOwoJZm9yKGo9MDtqPE5VTV9DUFVTO2orKykgewoJCWNsb3NlKGZkW2pdKTsKCX0KfQoK dm9pZCBlbmFibGVfZXZlbnRzKGludCBmZFtOVU1fQ1BVU10pCnsKCWludCBqLHJldDsKCWZv cihqPTA7ajxOVU1fQ1BVUztqKyspIHsKCQlyZXQ9aW9jdGwoZmRbal0sIFBFUkZfRVZFTlRf SU9DX0VOQUJMRSwwKTsKCQlpZiAocmV0PDApIHsKCQkJZnByaW50ZihzdGRlcnIsIkVycm9y IHN0YXJ0aW5nIGV2ZW50IGZkWyVkXVxuIixqKTsKCQl9Cgl9Cn0KCnZvaWQgZGlzYWJsZV9l dmVudHMoaW50IGZkW05VTV9DUFVTXSkKewoJaW50IGoscmV0OwoJZm9yKGo9MDtqPE5VTV9D UFVTO2orKykgewoJCXJldD1pb2N0bChmZFtqXSwgUEVSRl9FVkVOVF9JT0NfRElTQUJMRSww KTsKCQlpZiAocmV0PDApIHsKCQkJZnByaW50ZihzdGRlcnIsIkVycm9yIHN0b3BwaW5nIGV2 ZW50IGZkWyVkXVxuIixqKTsKCQl9Cgl9Cn0KCnZvaWQgcmVhZF9ldmVudHMoaW50IGksIGlu dCBmZFtOVU1fQ1BVU10pCnsKCWludCBqLCByZXQ7Cglmb3Ioaj0wO2o8TlVNX0NQVVM7aisr KSB7CgkJcmV0PXJlYWQoZmRbal0sJmJhc2VfcmVzdWx0c1tqXSwzKnNpemVvZihsb25nIGxv bmcpKTsKCQlpZiAocmV0PDMqc2l6ZW9mKGxvbmcgbG9uZykpIHsKCQkJZnByaW50ZihzdGRl cnIsIkV2ZW50IGZkWzBdWyVkXSB1bmV4cGVjdGVkIHJlYWQgc2l6ZSAlZFxuIixqLHJldCk7 CgkJCXJldHVybjsKCQl9CgkJaWYgKGJhc2VfcmVzdWx0c1tqXVswXSkKCQkJcHJpbnRmKCIl ZFx0JWRcdCVsbGRcdFx0XHQlbGxkXHRcdFx0JWxsZFx0XHRcdCUuMmZcbiIsIGksIGosCgkJ CQliYXNlX3Jlc3VsdHNbal1bMF0sCgkJCQliYXNlX3Jlc3VsdHNbal1bVElNRV9FTkFCTEVE XSwKCQkJCWJhc2VfcmVzdWx0c1tqXVtUSU1FX1JVTk5JTkddLAoJCQkJKGRvdWJsZSliYXNl X3Jlc3VsdHNbal1bVElNRV9SVU5OSU5HXS8KCQkJCShkb3VibGUpYmFzZV9yZXN1bHRzW2pd W1RJTUVfRU5BQkxFRF0gKiAxMDAuKTsKCX0KfQoKaW50IG1haW4oaW50IGFyZ2MsIGNoYXIq KiBhcmd2KSB7CgoJaW50IHJldCxxdWlldCxpLGo7CgoJc3RydWN0IHBlcmZfZXZlbnRfYXR0 ciBwZTsKCQoJY2hhciB0ZXN0X3N0cmluZ1tdPSJUZXN0aW5nIC4uLiI7CgoJcHJpbnRmKCJc bkVJRFx0Q1BVXHRDT1VOVFx0XHRcdFRfRU5BQkxFRFx0XHRcdFRfUlVOTklOR1x0XHRcdFND QUxFXG4iKTsKCglhbGxvY19ldmVudHMoZXZlbnRzWzFdLCBmZFswXSk7CgllbmFibGVfZXZl bnRzKGZkWzBdKTsKCXByaW50ZigiLS0tIDEgZW5hYmxlZFxuIik7CgoJdGVzdF9yb3V0aW5l KCk7CglyZWFkX2V2ZW50cygwLCBmZFswXSk7CgoJdGVzdF9yb3V0aW5lKCk7CglyZWFkX2V2 ZW50cygwLCBmZFswXSk7CgoJdGVzdF9yb3V0aW5lKCk7CglyZWFkX2V2ZW50cygwLCBmZFsw XSk7CgoJYWxsb2NfZXZlbnRzKGV2ZW50c1sxXSwgZmRbMV0pOwoJZW5hYmxlX2V2ZW50cyhm ZFsxXSk7CglwcmludGYoIi0tLSAxIGVuYWJsZWRcbiIpOwoKCXRlc3Rfcm91dGluZSgpOwoJ cmVhZF9ldmVudHMoMCwgZmRbMF0pOwoJcmVhZF9ldmVudHMoMSwgZmRbMV0pOwoKCXRlc3Rf cm91dGluZSgpOwoJcmVhZF9ldmVudHMoMCwgZmRbMF0pOwoJcmVhZF9ldmVudHMoMSwgZmRb MV0pOwoKCWRpc2FibGVfZXZlbnRzKGZkWzFdKTsKCXByaW50ZigiLS0tIDEgZGlzYWJsZWRc biIpOwoKCXRlc3Rfcm91dGluZSgpOwoJcmVhZF9ldmVudHMoMCxmZFswXSk7CglyZWFkX2V2 ZW50cygxLGZkWzFdKTsKCglkaXNhYmxlX2V2ZW50cyhmZFswXSk7CglwcmludGYoIi0tLSAw IGRpc2FibGVkXG4iKTsKCXRlc3Rfcm91dGluZSgpOwoKCXJlYWRfZXZlbnRzKDAsZmRbMF0p OwoJcmVhZF9ldmVudHMoMSxmZFsxXSk7CgoJZnJlZV9ldmVudHMoZmRbMV0pOwoJZnJlZV9l dmVudHMoZmRbMF0pOwoKCXRlc3RfcGFzcyh0ZXN0X3N0cmluZyk7CgoJcmV0dXJuIDA7Cn0K --------------6207A269E8E2EBBB4538F4FD--