Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp323461ybz; Fri, 17 Apr 2020 01:42:27 -0700 (PDT) X-Google-Smtp-Source: APiQypJlnhqZ9GIgzUB3gGwS9Uv7rIhk4KP3dZGh2YYuGRtRWegKxZIRiIdx1NIeLAuTeq9rwujp X-Received: by 2002:a05:6402:b70:: with SMTP id cb16mr1846948edb.48.1587112947258; Fri, 17 Apr 2020 01:42:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587112947; cv=none; d=google.com; s=arc-20160816; b=Zw3oIwCqiqxoW8+N9lyJs0X6perNO4qtuUkHFG52B8Y2eUslbB/Wk4PlxSt2DyTl1S 8vonVSiSVaX79pkK1+LEaCGy1xPsfS2n0f7iJH/U4JeZJi5qTrLbs/NW2RNttrAS6X3m wJJbhatgxXRWHtiu/h8iXEwsofsbLJhElMLb7z3IjtCiyYZBoOGaXH2t9XPU4BbE3QrI PoQXttQSGyBtX3xsu0voVrL3a72Pz+hP7l6g0WL2PK+CE4d0L61NdO1TgtiX4R/JChr8 xNeqnxDTXVW0ohjH3S6HO579HlZS6/LAZW1G3WMzAhxUlVfRwVzoktprCploDG3mxTCb +GmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject:reply-to :ironport-sdr:ironport-sdr; bh=1ET/eyhwUq3wox532gV2ZI69V32UcdvO0cUAJZCj99Y=; b=TzzNAzTe3WBGeKO/JF1wvZiNr/QMpPXDs2/i4BOsw5bGNqw/kSALwpC8EVVcjI5g1T YKUVritwKBcdJDv0tb0asSs+8JLnayWlpbL+oCpB96BZuHidBAxIT2piI4GIJF6MCDBe XUkilbqD0O0FmoH3BSiqv3Frgk9jq7Y5rwLIkCKhW3JZ9+AvsbkamXCotM7S7h2FJRy2 K4aLXWP8tCm+QtdwQeb+TcFFRn5qrG1Pocu5BavVkfdf5JjSyAudUgKBE3Yr6N/GPaNJ sopPdm9EzG46M1NwEF/DQ/JORQAf2vVBnImYX0HtI0K9ni+wNL+OY1phL+hEvnuLWEHm d1qA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id oh24si13450501ejb.531.2020.04.17.01.42.03; Fri, 17 Apr 2020 01:42:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729890AbgDQIkQ (ORCPT + 99 others); Fri, 17 Apr 2020 04:40:16 -0400 Received: from mga03.intel.com ([134.134.136.65]:64716 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729796AbgDQIkP (ORCPT ); Fri, 17 Apr 2020 04:40:15 -0400 IronPort-SDR: NbpRTZLCPYpktC/mUs7G+2qu3svogeO+H1RdLpTNp88BrqBsWmeLM+Q97PikQZ/QkmTOFxhCIv Dz1H7M2pSgnA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Apr 2020 01:40:15 -0700 IronPort-SDR: Qw/6VtXAAp+Z178rQ5jifBekM2FbH8hLF3U16DemPpFbboc0ntAnVJiYH/FveK9KEZeMhJbaCW 1yaWy2lCMPNg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,394,1580803200"; d="scan'208";a="454658961" Received: from likexu-mobl1.ccr.corp.intel.com (HELO [10.238.4.236]) ([10.238.4.236]) by fmsmga005.fm.intel.com with ESMTP; 17 Apr 2020 01:40:05 -0700 Reply-To: like.xu@intel.com Subject: Re: [PATCH v9 03/10] perf/x86: Add constraint to create guest LBR event without hw counter To: Peter Zijlstra , Like Xu Cc: Paolo Bonzini , kvm@vger.kernel.org, Andi Kleen , Jim Mattson , Wanpeng Li , Sean Christopherson , Joerg Roedel , Liran Alon , Thomas Gleixner , Ingo Molnar , Arnaldo Carvalho de Melo , Liang Kan , Wei Wang , linux-kernel@vger.kernel.org, Ingo Molnar References: <20200313021616.112322-1-like.xu@linux.intel.com> <20200313021616.112322-4-like.xu@linux.intel.com> <20200409163717.GD20713@hirez.programming.kicks-ass.net> <0b89963d-33d8-3b0f-fc56-eff3ccce648d@intel.com> From: "Xu, Like" Organization: Intel OTC Message-ID: Date: Fri, 17 Apr 2020 16:40:04 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: <0b89963d-33d8-3b0f-fc56-eff3ccce648d@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Peter, On 2020/4/10 11:03, Xu, Like wrote: > Hi Peter, > > First of all, thanks for your comments! > > On 2020/4/10 0:37, Peter Zijlstra wrote: >>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c >>> index 3bb738f5a472..e919187a0751 100644 >>> --- a/arch/x86/events/core.c >>> +++ b/arch/x86/events/core.c >>> @@ -74,7 +74,8 @@ u64 x86_perf_event_update(struct perf_event *event) >>>       int idx = hwc->idx; >>>       u64 delta; >>>   -    if (idx == INTEL_PMC_IDX_FIXED_BTS) >>> +    if ((idx == INTEL_PMC_IDX_FIXED_BTS) || >>> +        (idx == INTEL_PMC_IDX_FIXED_VLBR)) >>>           return 0; >>>         /* >>> @@ -1102,7 +1103,8 @@ static inline void x86_assign_hw_event(struct >>> perf_event *event, >>>       hwc->last_cpu = smp_processor_id(); >>>       hwc->last_tag = ++cpuc->tags[i]; >>>   -    if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) { >>> +    if ((hwc->idx == INTEL_PMC_IDX_FIXED_BTS) || >>> +        (hwc->idx == INTEL_PMC_IDX_FIXED_VLBR)) { >>>           hwc->config_base = 0; >>>           hwc->event_base    = 0; >>>       } else if (hwc->idx >= INTEL_PMC_IDX_FIXED) { >>> @@ -1233,7 +1235,8 @@ int x86_perf_event_set_period(struct perf_event >>> *event) >>>       s64 period = hwc->sample_period; >>>       int ret = 0, idx = hwc->idx; >>>   -    if (idx == INTEL_PMC_IDX_FIXED_BTS) >>> +    if ((idx == INTEL_PMC_IDX_FIXED_BTS) || >>> +        (idx == INTEL_PMC_IDX_FIXED_VLBR)) >>>           return 0; >>>         /* >> That seems unfortunate; can that be >= INTEL_PMC_IDX_FIXED_BTS ? If so, >> that probably wants a comment with the definitions. >> >> Or otherwise check for !hwc->event_base. That should be 0 for both these >> things. > Yes, the !hwc->event_base looks good to me. >> >>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c >>> index 3be51aa06e67..901c82032f4a 100644 >>> --- a/arch/x86/events/intel/core.c >>> +++ b/arch/x86/events/intel/core.c >>> @@ -2157,6 +2157,9 @@ static void intel_pmu_disable_event(struct >>> perf_event *event) >>>           return; >>>       } >>>   +    if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_VLBR)) >>> +        return; >>> + >> Please check code-gen to see if you can cut down on brancher here; >> there's 4 cases: >> >>   - vlbr >>   - bts >>   - fixed >>   - gp >> >> perhaps you can write it like so: >> >> (also see >> https://lkml.kernel.org/r/20190828090217.GN2386@hirez.programming.kicks-ass.net >> ) >> >> static void intel_pmu_enable_event(struct perf_event *event) >> { >>     ... >>     int idx = hwx->idx; >> >>     if (idx < INTEL_PMC_IDX_FIXED) { >>         intel_set_masks(event, idx); >>         __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE); >>     } else if (idx < INTEL_PMC_IDX_FIXED_BTS) { >>         intel_set_masks(event, idx); >>         intel_pmu_enable_fixed(event); >>     } else if (idx == INTEL_PMC_IDX_FIXED_BTS) { >>         intel_pmu_enable_bts(hwc->config); >>     } >> >>     /* nothing for INTEL_PMC_IDX_FIXED_VLBR */ >> } >> >> That should sort the branches in order of: gp,fixed,bts,vlbr > > Note the current order is: bts, pebs, fixed, gp. > > Sure, let me try to refactor it in this way. >> >>>       cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx); >>>       cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); >>>       cpuc->intel_cp_status &= ~(1ull << hwc->idx); >>> @@ -2241,6 +2244,9 @@ static void intel_pmu_enable_event(struct >>> perf_event *event) >>>           return; >>>       } >>>   +    if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_VLBR)) >>> +        return; >>> + >>>       if (event->attr.exclude_host) >>>           cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx); >>>       if (event->attr.exclude_guest) >> idem. > idem. >> >>> @@ -2595,6 +2601,15 @@ intel_bts_constraints(struct perf_event *event) >>>       return NULL; >>>   } >>>   +static struct event_constraint * >>> +intel_guest_event_constraints(struct perf_event *event) >>> +{ >>> +    if (unlikely(is_guest_lbr_event(event))) >>> +        return &guest_lbr_constraint; >>> + >>> +    return NULL; >>> +} >> This is a mis-nomer, it isn't just any guest_event > > Sure,  I'll rename it to intel_guest_lbr_event_constraints() > instead of using it as a unified interface to get all of guest event > constraints. > >> >>> + >>>   static int intel_alt_er(int idx, u64 config) >>>   { >>>       int alt_idx = idx; >>> @@ -2785,6 +2800,10 @@ __intel_get_event_constraints(struct >>> cpu_hw_events *cpuc, int idx, >>>   { >>>       struct event_constraint *c; >>>   +    c = intel_guest_event_constraints(event); >>> +    if (c) >>> +        return c; >>> + >>>       c = intel_bts_constraints(event); >>>       if (c) >>>           return c; >>> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h >>> index 1025bc6eb04f..9a62264a3068 100644 >>> --- a/arch/x86/events/perf_event.h >>> +++ b/arch/x86/events/perf_event.h >>> @@ -969,6 +969,20 @@ static inline bool intel_pmu_has_bts(struct >>> perf_event *event) >>>       return intel_pmu_has_bts_period(event, hwc->sample_period); >>>   } >>>   +static inline bool is_guest_event(struct perf_event *event) >>> +{ >>> +    if (event->attr.exclude_host && is_kernel_event(event)) >>> +        return true; >>> +    return false; >>> +} >> I don't like this one, what if another in-kernel users generates an >> event with exclude_host set ? > Thanks for the clear attitude. > > How about: > - remove the is_guest_event() to avoid potential misuse; > - move all checks into is_guest_lbr_event() and make it dedicated: > > static inline bool is_guest_lbr_event(struct perf_event *event) > { >     if (is_kernel_event(event) && >         event->attr.exclude_host && needs_branch_stack(event)) >         return true; >     return false; > } > > In this case, it's safe to generate an event with exclude_host set > and also use LBR to count guest or nothing for other in-kernel users > because the intel_guest_lbr_event_constraints() makes LBR exclusive. > > For this generic usage, I may rename: > - is_guest_lbr_event() to is_lbr_no_counter_event(); > - intel_guest_lbr_event_constraints() to > intel_lbr_no_counter_event_constraints(); > > Is this acceptable to you? > If there is anything needs to be improved, please let me know. Do you have any preference for this ? If you have more comments for the general idea or code details, please let me know. For example, you may take a look at the interface named intel_pmu_create_lbr_event() in the "[PATCH v9 07/10] KVM: x86/pmu: Add LBR feature emulation via guest LBR event". If not, I'll spin the next version based on your current feedback. Thanks, Like Xu > >>> @@ -989,6 +1003,7 @@ void release_ds_buffers(void); >>>   void reserve_ds_buffers(void); >>>     extern struct event_constraint bts_constraint; >>> +extern struct event_constraint guest_lbr_constraint; >>>     void intel_pmu_enable_bts(u64 config); >>>   diff --git a/arch/x86/include/asm/perf_event.h >>> b/arch/x86/include/asm/perf_event.h >>> index e018a1cf604c..674130aca75a 100644 >>> --- a/arch/x86/include/asm/perf_event.h >>> +++ b/arch/x86/include/asm/perf_event.h >>> @@ -181,9 +181,19 @@ struct x86_pmu_capability { >>>   #define GLOBAL_STATUS_UNC_OVF                BIT_ULL(61) >>>   #define GLOBAL_STATUS_ASIF                BIT_ULL(60) >>>   #define GLOBAL_STATUS_COUNTERS_FROZEN            BIT_ULL(59) >>> -#define GLOBAL_STATUS_LBRS_FROZEN            BIT_ULL(58) >>> +#define GLOBAL_STATUS_LBRS_FROZEN_BIT            58 >>> +#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT) >>>   #define GLOBAL_STATUS_TRACE_TOPAPMI            BIT_ULL(55) >>>   +/* >>> + * We model guest LBR event tracing as another fixed-mode PMC like BTS. >>> + * >>> + * We choose bit 58 (LBRS_FROZEN_BIT) which is used to indicate that >>> the LBR >>> + * stack is frozen on a hardware PMI request in the PERF_GLOBAL_STATUS >>> msr, >>> + * and the 59th PMC counter (if any) is not supposed to use it as well. >> Is this saying that STATUS.58 should never be set? I don't really >> understand the language. > My fault, and let me make it more clearly: > > We choose bit 58 because it's used to indicate LBR stack frozen state > not like other overflow conditions in the PERF_GLOBAL_STATUS msr, > and it will not be used for any actual fixed events. > >> >>> + */ >>> +#define INTEL_PMC_IDX_FIXED_VLBR GLOBAL_STATUS_LBRS_FROZEN_BIT >>> + >>>   /* >>>    * Adaptive PEBS v4 >>>    */ >