From: gregkh@linuxfoundation.org
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Gabriel Marin,
 Kan Liang, "Peter Zijlstra (Intel)", Ingo Molnar, Sasha Levin,
 Namhyung Kim
Subject: [PATCH 5.10 253/290] perf/core: Flush PMU internal buffers for per-CPU events
Date: Mon, 15 Mar 2021 14:55:46 +0100
Message-Id: <20210315135550.565470611@linuxfoundation.org>
In-Reply-To: <20210315135541.921894249@linuxfoundation.org>
References: <20210315135541.921894249@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Greg Kroah-Hartman

From: Kan Liang

[ Upstream commit a5398bffc01fe044848c5024e5e867e407f239b8 ]

Sometimes the PMU internal buffers have to be flushed for per-CPU events
during a context switch, e.g., large PEBS. Otherwise, the perf tool may
report samples in locations that do not belong to the process in which
the samples were taken, because PEBS does not tag samples with PID/TID.

The current code only flushes the buffers for per-task events; it does
not check per-CPU events.

Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the
PMU internal buffers have to be flushed for this event during a context
switch.

Add sched_cb_entry and perf_sched_cb_usages back to track which
PMU/cpuctx needs to be flushed.

This patch only needs to invoke sched_task() for per-CPU events;
per-task events are already handled in perf_event_context_sched_in/out.
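
As an illustration of the intended use (this sketch is not part of the
change below, and the example_pmu_add/del hooks are hypothetical): a PMU
driver whose hardware buffers records across context switches would mark
its events and register the callback roughly as follows, so that
perf_pmu_sched_task() finds the cpuctx even when no per-task context is
scheduled:

	/* Illustrative driver-side sketch only; these hooks are not
	 * added by this patch. */
	static void example_pmu_add(struct perf_event *event)
	{
		/* flag: this event needs ->sched_task() flushes */
		event->attach_state |= PERF_ATTACH_SCHED_CB;
		/* first such event puts the cpuctx on sched_cb_list */
		perf_sched_cb_inc(event->pmu);
	}

	static void example_pmu_del(struct perf_event *event)
	{
		/* last such event removes the cpuctx again */
		perf_sched_cb_dec(event->pmu);
	}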
Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Reported-by: Gabriel Marin
Originally-by: Namhyung Kim
Signed-off-by: Kan Liang
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin
---
 include/linux/perf_event.h |  2 ++
 kernel/events/core.c       | 42 ++++++++++++++++++++++++++++++++++----
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 96450f6fb1de..22ce0604b448 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -606,6 +606,7 @@ struct swevent_hlist {
 #define PERF_ATTACH_TASK	0x04
 #define PERF_ATTACH_TASK_DATA	0x08
 #define PERF_ATTACH_ITRACE	0x10
+#define PERF_ATTACH_SCHED_CB	0x20
 
 struct perf_cgroup;
 struct perf_buffer;
@@ -872,6 +873,7 @@ struct perf_cpu_context {
 	struct list_head		cgrp_cpuctx_entry;
 #endif
 
+	struct list_head		sched_cb_entry;
 	int				sched_cb_usage;
 
 	int				online;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c3ba29d058b7..4af161b3f322 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -383,6 +383,7 @@ static DEFINE_MUTEX(perf_sched_mutex);
 static atomic_t perf_sched_count;
 
 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
+static DEFINE_PER_CPU(int, perf_sched_cb_usages);
 static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events);
 
 static atomic_t nr_mmap_events __read_mostly;
@@ -3466,11 +3467,16 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	}
 }
 
+static DEFINE_PER_CPU(struct list_head, sched_cb_list);
+
 void perf_sched_cb_dec(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-	--cpuctx->sched_cb_usage;
+	this_cpu_dec(perf_sched_cb_usages);
+
+	if (!--cpuctx->sched_cb_usage)
+		list_del(&cpuctx->sched_cb_entry);
 }
 
@@ -3478,7 +3484,10 @@ void perf_sched_cb_inc(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-	cpuctx->sched_cb_usage++;
+	if (!cpuctx->sched_cb_usage++)
+		list_add(&cpuctx->sched_cb_entry, this_cpu_ptr(&sched_cb_list));
+
+	this_cpu_inc(perf_sched_cb_usages);
 }
 
 /*
@@ -3507,6 +3516,24 @@ static void __perf_pmu_sched_task(struct perf_cpu_context *cpuctx, bool sched_in
 	perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
 }
 
+static void perf_pmu_sched_task(struct task_struct *prev,
+				struct task_struct *next,
+				bool sched_in)
+{
+	struct perf_cpu_context *cpuctx;
+
+	if (prev == next)
+		return;
+
+	list_for_each_entry(cpuctx, this_cpu_ptr(&sched_cb_list), sched_cb_entry) {
+		/* will be handled in perf_event_context_sched_in/out */
+		if (cpuctx->task_ctx)
+			continue;
+
+		__perf_pmu_sched_task(cpuctx, sched_in);
+	}
+}
+
 static void perf_event_switch(struct task_struct *task,
 			      struct task_struct *next_prev, bool sched_in);
 
@@ -3529,6 +3556,9 @@ void __perf_event_task_sched_out(struct task_struct *task,
 {
 	int ctxn;
 
+	if (__this_cpu_read(perf_sched_cb_usages))
+		perf_pmu_sched_task(task, next, false);
+
 	if (atomic_read(&nr_switch_events))
 		perf_event_switch(task, next, false);
 
@@ -3837,6 +3867,9 @@ void __perf_event_task_sched_in(struct task_struct *prev,
 
 	if (atomic_read(&nr_switch_events))
 		perf_event_switch(task, prev, true);
+
+	if (__this_cpu_read(perf_sched_cb_usages))
+		perf_pmu_sched_task(prev, task, true);
 }
 
 static u64 perf_calculate_period(struct perf_event *event, u64 nsec, u64 count)
@@ -4661,7 +4694,7 @@ static void unaccount_event(struct perf_event *event)
 	if (event->parent)
 		return;
 
-	if (event->attach_state & PERF_ATTACH_TASK)
+	if (event->attach_state & (PERF_ATTACH_TASK | PERF_ATTACH_SCHED_CB))
 		dec = true;
 	if (event->attr.mmap || event->attr.mmap_data)
 		atomic_dec(&nr_mmap_events);
@@ -11056,7 +11089,7 @@ static void account_event(struct perf_event *event)
 	if (event->parent)
 		return;
 
-	if (event->attach_state & PERF_ATTACH_TASK)
+	if (event->attach_state & (PERF_ATTACH_TASK | PERF_ATTACH_SCHED_CB))
 		inc = true;
 	if (event->attr.mmap || event->attr.mmap_data)
 		atomic_inc(&nr_mmap_events);
@@ -12848,6 +12881,7 @@ static void __init perf_event_init_all_cpus(void)
 #ifdef CONFIG_CGROUP_PERF
 		INIT_LIST_HEAD(&per_cpu(cgrp_cpuctx_list, cpu));
 #endif
+		INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu));
 	}
 }
-- 
2.30.1
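
For reference, a "per-CPU event" in the sense used above is one opened
for all tasks on a single CPU (pid == -1, cpu >= 0). Below is a minimal
userspace sketch using the raw perf_event_open(2) syscall, not part of
the patch, that creates the kind of PEBS-backed per-CPU sampling event
whose buffers this fix flushes on context switch:

	#include <linux/perf_event.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Needs CAP_PERFMON/root or a permissive perf_event_paranoid. */
	static int open_per_cpu_cycles(int cpu)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_HARDWARE;
		attr.config = PERF_COUNT_HW_CPU_CYCLES;
		attr.sample_period = 100000;	/* fixed period */
		attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID;
		attr.precise_ip = 2;		/* request PEBS on x86 */

		/* pid == -1 with cpu >= 0: per-CPU, not per-task */
		return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
	}

Because PEBS records from such an event carry no PID/TID, the kernel
must drain the hardware buffer at context-switch time to attribute
samples correctly, which is exactly what the sched_task() callback path
added above provides for the per-CPU case.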