Subject: Re: [RFC][PATCH] perf: Rewrite core context handling
To: Peter Zijlstra
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, acme@kernel.org,
 alexander.shishkin@linux.intel.com, jolsa@redhat.com, songliubraving@fb.com,
 eranian@google.com, tglx@linutronix.de, mark.rutland@arm.com,
 megha.dey@intel.com, frederic@kernel.org
References: <20181010104559.GO5728@hirez.programming.kicks-ass.net>
 <3a738a08-2295-a4e9-dce7-a3e2b2ad794e@linux.intel.com>
 <20181015083448.GN9867@hirez.programming.kicks-ass.net>
From: Alexey Budankov
Organization: Intel Corp.
Date: Mon, 15 Oct 2018 20:29:32 +0300
In-Reply-To: <20181015083448.GN9867@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 15.10.2018 11:34, Peter Zijlstra wrote:
> On Mon, Oct 15, 2018 at 10:26:06AM +0300, Alexey Budankov wrote:
>> Hi,
>>
>> On 10.10.2018 13:45, Peter Zijlstra wrote:
>>> Hi all,
>>>
>>> There have been various issues and limitations with the way perf uses
>>> (task) contexts to track events. Most notable is the single hardware PMU
>>> task context, which has resulted in a number of yucky things (both
>>> proposed and merged).
>>>
>>> Notably:
>>>
>>>  - HW breakpoint PMU
>>>  - ARM big.little PMU
>>>  - Intel Branch Monitoring PMU
>>>
>>> Since we now track the events in RB trees, we can 'simply' add a pmu
>>> order to them and have them grouped that way, reducing to a single
>>> context. Of course, reality never quite works out that simple, and below
>>> ends up adding an intermediate data structure to bridge the context ->
>>> pmu mapping.
>>>
>>> Something a little like:
>>>
>>>           ,------------------------[1:n]---------------------.
>>>           V                                                  V
>>>      perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
>>>           ^                      ^     |                     |
>>>           `--------[1:n]---------'     `-[n:1]-> pmu <-[1:n]-'
>>>
>>> This patch builds (provided you disable CGROUP_PERF), boots and survives
>>> perf-top without the machine catching fire.
>>>
>>> There's still a fair bit of loose ends (look for XXX), but I think this
>>> is the direction we should be going.
>>>
>>> Comments?
>>>
>>> Not-Quite-Signed-off-by: Peter Zijlstra (Intel)
>>> ---
>>>  arch/powerpc/perf/core-book3s.c |    4
>>>  arch/x86/events/core.c          |    4
>>>  arch/x86/events/intel/core.c    |    6
>>>  arch/x86/events/intel/ds.c      |    6
>>>  arch/x86/events/intel/lbr.c     |   16
>>>  arch/x86/events/perf_event.h    |    6
>>>  include/linux/perf_event.h      |   80 +-
>>>  include/linux/sched.h           |    2
>>>  kernel/events/core.c            | 1412 ++++++++++++++++++++--------------------
>>>  9 files changed, 815 insertions(+), 721 deletions(-)
>>
>> Rewrite is impressive however it doesn't result in code base reduction as it is.
>
> Yeah.. that seems to be nature of these things ..
>
>> Nonetheless there is a clear demand for per pmu events groups tracking and rotation
>> in single cpu context (HW breakpoints, ARM big.little, Intel LBRs) and there is
>> a supply thru groups ordering on RB-tree.
>>
>> This might be driven into the kernel by some new Perf features that would base on
>> that RB-tree groups ordering or by refactoring of existing code but in the way it
>> would result in overall code base reduction thus lowering support cost.
>
> If you have a concrete suggestion on how to reduce complexity? I tried,
> but couldn't find any (without breaking something).

Could some of those PMUs (HW breakpoints, ARM big.little, Intel LBRs) or
other Perf related code be adjusted now so that the overall subsystem code
base would shrink?

Thanks,
Alexey

>
> The active lists and pmu_ctx_list could arguably be replaced with
> (slower) iterations over the RB tree, but you'll still need the per pmu
> nr_events/nr_active counts to determine if rotation is required at all.
>
> And like you know, performance is quite important here too. I'd love to
> reduce complexity while maintaining or improving performance, but that
> rarely if ever happens :/
>
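
To make the point about the per pmu nr_events/nr_active counts concrete,
here is a minimal user-space sketch. It is not code from the patch: the
struct pmu_ctx_sketch type, its fields and both helper functions are
hypothetical names chosen for illustration, and the rule "rotate only when
more events are attached than are active" is an assumption about how such
counts would be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-PMU context.
 * This is NOT the kernel's perf_event_pmu_context. */
struct pmu_ctx_sketch {
	int pmu_id;     /* which PMU this sub-context belongs to */
	int nr_events;  /* events attached for this PMU */
	int nr_active;  /* events currently scheduled on the hardware */
};

/* Assumed rule: rotation only matters when some attached events
 * could not be scheduled, i.e. nr_events exceeds nr_active. */
static bool pmu_ctx_needs_rotation(const struct pmu_ctx_sketch *pc)
{
	assert(pc != NULL);
	return pc->nr_events > pc->nr_active;
}

/* Count how many per-PMU contexts of one context would need rotating.
 * A plain array stands in for whatever list or tree links them. */
static size_t ctx_count_rotations(const struct pmu_ctx_sketch *pcs, size_t n)
{
	size_t needed = 0;

	for (size_t i = 0; i < n; i++) {
		if (pmu_ctx_needs_rotation(&pcs[i]))
			needed++;
	}
	return needed;
}
```

With counts kept per PMU, the check is a constant-time comparison for each
PMU and does not require walking the events themselves, which is the reason
given above for keeping the counts even if the lists were dropped.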