From: Stephane Eranian
Date: Mon, 15 Oct 2018 11:31:24 -0700
Subject: Re: [RFC][PATCH] perf: Rewrite core context handling
To: Alexey Budankov
Cc: Peter Zijlstra, Ingo Molnar, LKML, Arnaldo Carvalho de Melo,
    Alexander Shishkin, Jiri Olsa, songliubraving@fb.com,
    Thomas Gleixner, Mark Rutland, megha.dey@intel.com,
    frederic@kernel.org
References: <20181010104559.GO5728@hirez.programming.kicks-ass.net>
    <3a738a08-2295-a4e9-dce7-a3e2b2ad794e@linux.intel.com>
    <20181015083448.GN9867@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Mon, Oct 15, 2018 at 10:29 AM Alexey Budankov wrote:
>
>
> Hi,
> On 15.10.2018 11:34, Peter Zijlstra wrote:
> > On Mon, Oct 15, 2018 at 10:26:06AM +0300, Alexey Budankov wrote:
> >> Hi,
> >>
> >> On 10.10.2018 13:45, Peter Zijlstra wrote:
> >>> Hi all,
> >>>
> >>> There have been various issues and limitations with the way perf uses
> >>> (task) contexts to track events. Most notable is the single hardware PMU
> >>> task context, which has resulted in a number of yucky things (both
> >>> proposed and merged).
> >>>
> >>> Notably:
> >>>
> >>>  - HW breakpoint PMU
> >>>  - ARM big.little PMU
> >>>  - Intel Branch Monitoring PMU
> >>>
> >>> Since we now track the events in RB trees, we can 'simply' add a pmu
> >>> order to them and have them grouped that way, reducing to a single
> >>> context. Of course, reality never quite works out that simple, and below
> >>> ends up adding an intermediate data structure to bridge the context ->
> >>> pmu mapping.
> >>>
> >>> Something a little like:
> >>>
> >>>       ,------------------------[1:n]---------------------.
> >>>       V                                                  V
> >>>     perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
> >>>       ^                      ^     |                     |
> >>>       `--------[1:n]---------'     `-[n:1]-> pmu <-[1:n]-'
> >>>
> >>> This patch builds (provided you disable CGROUP_PERF), boots and survives
> >>> perf-top without the machine catching fire.
> >>>
> >>> There's still a fair bit of loose ends (look for XXX), but I think this
> >>> is the direction we should be going.
> >>>
> >>> Comments?
> >>>
> >>> Not-Quite-Signed-off-by: Peter Zijlstra (Intel)
> >>> ---
> >>>  arch/powerpc/perf/core-book3s.c |    4
> >>>  arch/x86/events/core.c          |    4
> >>>  arch/x86/events/intel/core.c    |    6
> >>>  arch/x86/events/intel/ds.c      |    6
> >>>  arch/x86/events/intel/lbr.c     |   16
> >>>  arch/x86/events/perf_event.h    |    6
> >>>  include/linux/perf_event.h      |   80 +-
> >>>  include/linux/sched.h           |    2
> >>>  kernel/events/core.c            | 1412 ++++++++++++++++++++--------------------
> >>>  9 files changed, 815 insertions(+), 721 deletions(-)
> >>
> >> Rewrite is impressive however it doesn't result in code base reduction as it is.
> >
> > Yeah.. that seems to be nature of these things ..
> >
> >> Nonetheless there is a clear demand for per pmu events groups tracking and rotation
> >> in single cpu context (HW breakpoints, ARM big.little, Intel LBRs) and there is
> >> a supply thru groups ordering on RB-tree.
> >>
> >> This might be driven into the kernel by some new Perf features that would base on
> >> that RB-tree groups ordering or by refactoring of existing code but in the way it
> >> would result in overall code base reduction thus lowering support cost.
> >
> > If you have a concrete suggestion on how to reduce complexity? I tried,
> > but couldn't find any (without breaking something).
>
> Could some of those PMUs (HW breakpoints, ARM big.little, Intel LBRs)
> or other Perf related code be adjusted now so that overall subsystem
> code base would reduce?
>
I have always had a hard time understanding the role of all these
structs in the generic code. This is still very confusing and very
hard to follow.

In my mind, you have per-task and per-cpu perf_event contexts. And for
each you can have multiple PMUs, some hw, some sw. Each PMU has its own
list of events maintained in an RB tree. There are never any
interactions between PMUs. Maybe this is how this is done or proposed
by your patches, but it certainly is not obvious.

Also, the Intel LBR is not a PMU on its own. Maybe you are talking
about the BTS in arch/x86/events/intel/bts.c.

> >
> > The active lists and pmu_ctx_list could arguably be replaced with
> > (slower) iterations over the RB tree, but you'll still need the per pmu
> > nr_events/nr_active counts to determine if rotation is required at all.
> >
> > And like you know, performance is quite important here too. I'd love to
> > reduce complexity while maintaining or improve performance, but that
> > rarely if ever happens :/
> >
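
To make the layout discussed above concrete, here is a minimal
user-space sketch of a context that holds one bookkeeping entry per
PMU, with the nr_events/nr_active counts used to decide whether
rotation is needed. This is not code from the patch. The names
ctx_sketch, pmu_ctx_sketch, SKETCH_MAX_PMUS and both helpers are
hypothetical, the fixed-size array stands in for whatever list the real
code uses, and the rule "rotate when not every event is active" is an
assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PMUS 4	/* hypothetical bound, for the sketch only */

/* Hypothetical per-(context, pmu) bookkeeping entry. */
struct pmu_ctx_sketch {
	const char *pmu_name;
	int nr_events;	/* events attached to this pmu in this context */
	int nr_active;	/* events currently scheduled on the pmu */
};

/* Hypothetical context: one entry per pmu, no sharing between pmus. */
struct ctx_sketch {
	struct pmu_ctx_sketch pmu_ctx[SKETCH_MAX_PMUS];
	size_t nr_pmus;
};

/* Assumed rule: rotate only if some attached events are not active. */
static int pmu_ctx_needs_rotation(const struct pmu_ctx_sketch *pc)
{
	return pc->nr_events > pc->nr_active;
}

/* Count the pmu entries in a context that would need rotation. */
static int ctx_count_rotations(const struct ctx_sketch *ctx)
{
	int n = 0;
	size_t i;

	assert(ctx->nr_pmus <= SKETCH_MAX_PMUS);
	for (i = 0; i < ctx->nr_pmus; i++)
		n += pmu_ctx_needs_rotation(&ctx->pmu_ctx[i]);
	return n;
}
```

With a "cpu" entry holding 6 events of which 4 are active, and a
"breakpoint" entry with 2 of 2 active, only the first entry would be
rotated. The point of keeping the counts per PMU is that this decision
does not require walking the RB tree.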