Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp198976pxu; Wed, 25 Nov 2020 00:14:50 -0800 (PST) X-Google-Smtp-Source: ABdhPJzs4T1iI0yP0hwufBZtPy4PTtG6ffCUZrV2VRXHzn1UjeDbdD9WCoKJ1+/krzd2AGb61V+8 X-Received: by 2002:aa7:c049:: with SMTP id k9mr2307646edo.49.1606292089800; Wed, 25 Nov 2020 00:14:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1606292089; cv=none; d=google.com; s=arc-20160816; b=PRGl3cqI5SLKmtSG0PNPpOTemr6Mhlq8qi7WiFRfLBnm1G2OacyDMD9D1tX/n6KSa5 Zn0IFw3Ad8uYr3YhYkhxK/5IWk5sct0rO/dpGZryVsjlS5WRvr1GMaOUna80BM38GFCF mS96QEbS7W1SKuFtk/uEOfkmssOty9OvgaUR3Y8zYFVkyyU2zy7nou/UfAONY8IE5h6K cI8QhoGdzP2kJTHbh29uCxZ7B23sdZQiyrLdAFpTRzp+RA1jbJseP+nxfetG4/BbLOAl iXutqkSk38rXR2A4vxXh/VWR2fL13jRpKo2G5NBJ6355F9zUkPFWiWIRJNb3HdMEiCNO hSGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:dkim-signature; bh=qar4o6Yqq/4MIANHQrtpm7N8ALhTKB7BFOUVePHpLlA=; b=Ji10P75gP4oZtEP0jAoB3eqOgiWAkbDZP5CbVpJhCY0vrfV7uX31Tk62cGmzJ7eyQP tWDT36aozKazlQNW5sgZiBiTuGJ5Z88yzthX7ecWkPPDO2H6ADnbVcpHO2ua3L9bs1+i gdBg1RMGv6UMvYqGls+H3qcvIvXWa3oNVtPZtc0d0n+cXCP7bjUQD0vxw8VqmcQljaZv wjh08n08WNrk+z43Bfko4VJJJa9teispL2OPEHxvAX7SvFdG6ZNLYTty/3H/VXJLuiRK uUdKlQQcGFL/uIPkO/r3YJI5yzEroGCmn3uxuB8EodUJeapiggzT5NVaGxoenGYsLDvf /xRA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=O8z1GklQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id f22si762218edx.179.2020.11.25.00.14.26; Wed, 25 Nov 2020 00:14:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=O8z1GklQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726315AbgKYIMX (ORCPT + 99 others); Wed, 25 Nov 2020 03:12:23 -0500 Received: from ozlabs.org ([203.11.71.1]:47637 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725776AbgKYIMX (ORCPT ); Wed, 25 Nov 2020 03:12:23 -0500 Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4CgtsY1DkGz9s0b; Wed, 25 Nov 2020 19:12:17 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ellerman.id.au; s=201909; t=1606291941; bh=MkLk/nwJ58DfXoaI4w1ykli0O5C4KGioSAd70M9vhcM=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=O8z1GklQYS+BFRrXpCb7vCEGKtfYXRnzjqgjQSQzy9NeMqjmUdSgVOXJy7ZkI8cu9 XPZiKZuIM38JsXjbsCYd/sO7l/SWUG1Q9yqgVl5oWI0ajWUDXuiVfU0UtyUUWBSyqw Kx0rI9kNHDcc1mgtgbMJS/2Ko8YwcU+ElIruL2OoDYFjiDM7ND+XBgzWbVOIL4M179 CwzEc2E4FQwac2J/84eldEOhgvW4pgW1d00jg4EJIBkNMZtmXViyDU6101eLcRR5iI gJrdI49inRUanyb7EzpvzcOsk5LrNx1rYuZd/kpxAeMkT9yoJ+t3Y47dYy1wzGsQMd ejKRGZhDs/y9Q== From: Michael Ellerman To: Namhyung Kim Cc: "Liang\, Kan" , Peter Zijlstra , Ingo Molnar , linux-kernel , Stephane Eranian , Ian Rogers , Gabriel Marin , Arnaldo Carvalho de Melo , Jiri Olsa , Andi Kleen , Benjamin Herrenschmidt , Paul Mackerras , linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events In-Reply-To: References: <20201106212935.28943-1-kan.liang@linux.intel.com> <20201109095235.GC2594@hirez.programming.kicks-ass.net> <20201109110405.GN2651@hirez.programming.kicks-ass.net> <0a1db246-c34a-22a3-160c-3e0c0a38119d@linux.intel.com> <20201111162509.GW2611@hirez.programming.kicks-ass.net> <2dc483f6-7b29-c42b-13a4-4c549d720aa2@linux.intel.com> <87a6v81gou.fsf@mpe.ellerman.id.au> Date: Wed, 25 Nov 2020 19:12:13 +1100 Message-ID: <87lfepzwgy.fsf@mpe.ellerman.id.au> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Namhyung Kim writes: > Hello, > > On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman wrote: >> >> Namhyung Kim writes: >> > Hi Peter and Kan, >> > >> > (Adding PPC folks) >> > >> > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim wrote: >> >> >> >> Hello, >> >> >> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan wrote: >> >> > >> >> > >> >> > >> >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote: >> >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote: >> >> > > >> >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should >> >> > >> be invoked to flush the PEBS buffer in each context switch. However, The >> >> > >> perf_sched_events in account_event() is not updated accordingly. The >> >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only >> >> > >> per-task event works. >> >> > >> At that time, the perf_pmu_sched_task() is outside of >> >> > >> perf_event_context_sched_in/out. It means that perf has to double >> >> > >> perf_pmu_disable() for per-task event. >> >> > > >> >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be >> >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the >> >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it >> >> > >> is actually the new code for per-CPU events. The optimization for per-task >> >> > >> events is still kept. >> >> > >> For the case, which has both a CPU context and a task context, yes, the >> >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the >> >> > >> sched_task() only need to be invoked once in a context switch. The >> >> > >> sched_task() will be eventually invoked in the task context. >> >> > > >> >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and >> >> > > only set that for large pebs. Are you sure the other users (Intel LBR >> >> > > and PowerPC BHRB) don't need it? >> >> > >> >> > I didn't set it for LBR, because the perf_sched_events is always enabled >> >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB >> >> > for LBR. >> >> > >> >> > if (has_branch_stack(event)) >> >> > inc = true; >> >> > >> >> > > >> >> > > If they indeed do not require the pmu::sched_task() callback for CPU >> >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface >> >> > >> >> > No, LBR requires the pmu::sched_task() callback for CPU events. >> >> > >> >> > Now, The LBR registers have to be reset in sched in even for CPU events. >> >> > >> >> > To fix the shorter LBR callstack issue for CPU events, we also need to >> >> > save/restore LBRs in pmu::sched_task(). >> >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/ >> >> > >> >> > > is confusing at best. >> >> > > >> >> > > Can't we do something like this instead? >> >> > > >> >> > I think the below patch may have two issues. >> >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now. >> >> > - We may disable the large PEBS later if not all PEBS events support >> >> > large PEBS. The PMU need a way to notify the generic code to decrease >> >> > the nr_sched_task. >> >> >> >> Any updates on this? I've reviewed and tested Kan's patches >> >> and they all look good. >> >> >> >> Maybe we can talk to PPC folks to confirm the BHRB case? >> > >> > Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB >> > for PowerPC too. But it'd be nice if ppc folks can confirm the change. >> >> Sorry I've read the whole thread, but I'm still not entirely sure I >> understand the question. > > Thanks for your time and sorry about not being clear enough. > > We found per-cpu events are not calling pmu::sched_task() > on context switches. So PERF_ATTACH_SCHED_CB was > added to indicate the core logic that it needs to invoke the > callback. OK. TBH I've never thought of using branch stack with a per-cpu event, but I guess you can do it. I think the same logic applies as LBR, we need to read the BHRB entries in the context of the task that they were recorded for. > The patch 3/3 added the flag to PPC (for BHRB) with other > changes (I think it should be split like in the patch 2/3) and > want to get ACKs from the PPC folks. If you post a new version with Maddy's comments addressed then he or I can ack it. cheers