Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751805AbdFHTlI (ORCPT ); Thu, 8 Jun 2017 15:41:08 -0400 Received: from mail-it0-f41.google.com ([209.85.214.41]:38904 "EHLO mail-it0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751734AbdFHTlG (ORCPT ); Thu, 8 Jun 2017 15:41:06 -0400 MIME-Version: 1.0 In-Reply-To: <20170608081531.iv27xaghntniwmm6@hirez.programming.kicks-ass.net> References: <20170607232226.26365-1-andi@firstfloor.org> <20170607232226.26365-3-andi@firstfloor.org> <20170608081531.iv27xaghntniwmm6@hirez.programming.kicks-ass.net> From: Stephane Eranian Date: Thu, 8 Jun 2017 12:40:59 -0700 Message-ID: Subject: Re: [PATCH v2 2/4] perf/x86: Fix data source decoding for Skylake To: Peter Zijlstra Cc: Andi Kleen , Arnaldo Carvalho de Melo , LKML , Jiri Olsa , Andi Kleen , Sukadev Bhattiprolu , Madhavan Srinivasan Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3098 Lines: 77 Hi, On Thu, Jun 8, 2017 at 1:15 AM, Peter Zijlstra wrote: > > On Wed, Jun 07, 2017 at 04:22:24PM -0700, Andi Kleen wrote: > > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > > index b1c0b187acfe..95daade294d7 100644 > > --- a/include/uapi/linux/perf_event.h > > +++ b/include/uapi/linux/perf_event.h > > @@ -931,14 +931,18 @@ union perf_mem_data_src { > > mem_snoop:5, /* snoop mode */ > > mem_lock:2, /* lock instr */ > > mem_dtlb:7, /* tlb access */ > > - mem_rsvd:31; > > + mem_lvlx:8, /* memory hierarchy level, ext */ > > + mem_snoopx:2, /* snoop mode, ext */ > > + mem_rsvd:21; > > }; > > }; > > #elif defined(__BIG_ENDIAN_BITFIELD) > > union perf_mem_data_src { > > __u64 val; > > struct { > > - __u64 mem_rsvd:31, > > + __u64 mem_rsvd:21, > > + mem_snoopx:2, /* snoop mode, ext */ > > + mem_lvlx:8, /* memory hierarchy level, ext */ > > mem_dtlb:7, /* tlb access */ > > mem_lock:2, /* lock instr */ > > mem_snoop:5, /* snoop mode */ > > So one thing we could do is add a mem_hops field and always set that, > even for the old stuff. The old stuff will not know about that field and > ignore the bits, but new stuff will then not need as many LVL bits. > That would be better than lvlx I think. I am guessing you're suggesting an integer count here and not a bitmask. Right? Then I wonder why it would need 8 bits or 255 possible levels! > Of course, we then get into the problem of how many bits of hops we > need.. Power guys ? > > > @@ -975,6 +979,16 @@ union perf_mem_data_src { > > #define PERF_MEM_LVL_UNC 0x2000 /* Uncached memory */ > > #define PERF_MEM_LVL_SHIFT 5 > > > > +#define PERF_MEM_LVLX_L4 0x01 /* L4 */ > > +#define PERF_MEM_LVLX_REM_L4 0x02 /* Remote L4 */ > > +#define PERF_MEM_LVLX_REM_RAM 0x04 /* Remote Ram, unknown hops */ > > +#define PERF_MEM_LVLX_PMEM 0x08 /* Persistent Memory */ > > +#define PERF_MEM_LVLX_REM_PMEM 0x10 /* Remote Persistent Memory */ > > +#define PERF_MEM_LVLX_REM_NA 0x20 /* Remote N/A level */ > > Still wondering what the point of REM_NA is.. can you explain? > > > +/* 2 free */ > > + > > +#define PERF_MEM_LVLX_SHIFT 33 > > + > > /* snoop mode */ > > #define PERF_MEM_SNOOP_NA 0x01 /* not available */ > > #define PERF_MEM_SNOOP_NONE 0x02 /* no snoop */ > > @@ -983,6 +997,10 @@ union perf_mem_data_src { > > #define PERF_MEM_SNOOP_HITM 0x10 /* snoop hit modified */ > > #define PERF_MEM_SNOOP_SHIFT 19 > > > > +#define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ > > +/* 1 free */ > > +#define PERF_MEM_SNOOPX_SHIFT 41 > > + > > /* locked instruction */ > > #define PERF_MEM_LOCK_NA 0x01 /* not available */ > > #define PERF_MEM_LOCK_LOCKED 0x02 /* locked transaction */ > > -- > > 2.9.4 > >