Subject: Re: [PATCH v2 2/4] perf/x86: Fix data source decoding for Skylake
To: Peter Zijlstra, Andi Kleen
Cc: acme@kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, eranian@google.com, Andi Kleen, sukadev@linux.vnet.ibm.com, Michael Ellerman
References: <20170607232226.26365-1-andi@firstfloor.org> <20170607232226.26365-3-andi@firstfloor.org> <20170608081531.iv27xaghntniwmm6@hirez.programming.kicks-ass.net>
From: Madhavan Srinivasan
Date: Fri, 9 Jun 2017 13:34:35 +0530
In-Reply-To: <20170608081531.iv27xaghntniwmm6@hirez.programming.kicks-ass.net>
Message-Id: <1d43622e-e1f3-de5a-ae49-9b4926db0ad4@linux.vnet.ibm.com>

On Thursday 08 June 2017 01:45 PM, Peter Zijlstra wrote:
> On Wed, Jun 07, 2017 at 04:22:24PM -0700, Andi Kleen wrote:
>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index b1c0b187acfe..95daade294d7 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -931,14 +931,18 @@ union perf_mem_data_src {
>>  			mem_snoop:5,	/* snoop mode */
>>  			mem_lock:2,	/* lock instr */
>>  			mem_dtlb:7,	/* tlb access */
>> -			mem_rsvd:31;
>> +			mem_lvlx:8,	/* memory hierarchy level, ext */
>> +			mem_snoopx:2,	/* snoop mode, ext */
>> +			mem_rsvd:21;
>>  	};
>>  };
>>  #elif defined(__BIG_ENDIAN_BITFIELD)
>>  union perf_mem_data_src {
>>  	__u64 val;
>>  	struct {
>> -		__u64   mem_rsvd:31,
>> +		__u64   mem_rsvd:21,
>> +			mem_snoopx:2,	/* snoop mode, ext */
>> +			mem_lvlx:8,	/* memory hierarchy level, ext */
>>  			mem_dtlb:7,	/* tlb access */
>>  			mem_lock:2,	/* lock instr */
>>  			mem_snoop:5,	/* snoop mode */
> So one thing we could do is add a mem_hops field and always set that,
> even for the old stuff. The old stuff will not know about that field and
> ignore the bits, but new stuff will then not need as many LVL bits.
>
> Of course, we then get into the problem of how many bits of hops we
> need.. Power guys ?

Currently we support 3 hops (local, remote and distant), and in the
future we may have another for CAPI. So 4 levels of hops might do;
8 would be nicely future-proof.

Maddy

>
>> @@ -975,6 +979,16 @@ union perf_mem_data_src {
>>  #define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
>>  #define PERF_MEM_LVL_SHIFT	5
>>
>> +#define PERF_MEM_LVLX_L4	0x01 /* L4 */
>> +#define PERF_MEM_LVLX_REM_L4	0x02 /* Remote L4 */
>> +#define PERF_MEM_LVLX_REM_RAM	0x04 /* Remote Ram, unknown hops */
>> +#define PERF_MEM_LVLX_PMEM	0x08 /* Persistent Memory */
>> +#define PERF_MEM_LVLX_REM_PMEM	0x10 /* Remote Persistent Memory */
>> +#define PERF_MEM_LVLX_REM_NA	0x20 /* Remote N/A level */
> Still wondering what the point of REM_NA is.. can you explain?
>
>> +/* 2 free */
>> +
>> +#define PERF_MEM_LVLX_SHIFT	33
>> +
>>  /* snoop mode */
>>  #define PERF_MEM_SNOOP_NA	0x01 /* not available */
>>  #define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
>> @@ -983,6 +997,10 @@ union perf_mem_data_src {
>>  #define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
>>  #define PERF_MEM_SNOOP_SHIFT	19
>>
>> +#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
>> +/* 1 free */
>> +#define PERF_MEM_SNOOPX_SHIFT	41
>> +
>>  /* locked instruction */
>>  #define PERF_MEM_LOCK_NA	0x01 /* not available */
>>  #define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
>> --
>> 2.9.4
>>