Message-ID: <05d29733-cfc4-42e1-bbb1-a496d9522d0e@linux.intel.com>
Date: Thu, 22 Feb 2024 15:05:08 -0500
Subject: Re: [BUG] perf/x86/intel: HitM false-positives on Ice Lake / Tiger Lake (I think?)
From: "Liang, Kan"
To: Arnaldo Carvalho de Melo, Ian Rogers, Jann Horn
Cc: Joe Mario, Jiri Olsa, Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland, Alexander Shishkin, Adrian Hunter, Feng Tang, Andi Kleen, the arch/x86 maintainers, kernel list, linux-perf-users@vger.kernel.org, Stephane Eranian, "Taylor, Perry", "Alt, Samantha", "Biggers, Caleb", "Wang, Weilin"
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jann,

Sorry for the late response.

On 2024-02-20 10:42 a.m., Arnaldo Carvalho de Melo wrote:
> Just adding Joe Mario to the CC list.
>
> On Mon, Feb 19, 2024 at 03:20:00PM -0800, Ian Rogers wrote:
>> On Mon, Feb 19, 2024 at 5:01 AM Jann Horn wrote:
>>>
>>> Hi!
>>>
>>> From what I understand, "perf c2c" shows bogus HitM events on Ice Lake
>>> (and newer) because Intel added some feature where *clean* cachelines
>>> can get snoop-forwarded ("cross-core FWD"), and the PMU apparently
>>> treats this mostly the same as snoop-forwarding of modified cache
>>> lines (HitM)? On a Tiger Lake CPU, I can see addresses from the kernel
>>> rodata section in "perf c2c report".
>>>
>>> This is mentioned in the SDM, Volume 3B, section "20.9.7 Load Latency
>>> Facility", table "Table 20-101. Data Source Encoding for Memory
>>> Accesses (Ice Lake and Later Microarchitectures)", encoding 07H:
>>> "XCORE FWD. This request was satisfied by a sibling core where either
>>> a modified (cross-core HITM) or a non-modified (cross-core FWD)
>>> cache-line copy was found."
>>>
>>> I don't see anything about this in arch/x86/events/intel/ds.c - if I
>>> understand correctly, the kernel's PEBS data source decoding assumes
>>> that 0x07 means "L3 hit, snoop hitm" on these CPUs. I think this needs
>>> to be adjusted somehow - and maybe it just isn't possible to actually
>>> distinguish between HitM and cross-core FWD in PEBS events on these
>>> CPUs (without big-hammer chicken bit trickery)? Maybe someone from
>>> Intel can clarify?
>>>
>>> (The SDM describes that E-cores on the newer 12th Gen have more
>>> precise PEBS encodings that distinguish between "L3 HITM" and "L3
>>> HITF"; but I guess the P-cores there maybe still don't let you
>>> distinguish HITM/HITF?)

Right, there is no way to distinguish HITM/HITF on Tiger Lake.

I think what we can do is to add both HITM and HITF for the 0x07 to
match the SDM description.

How about the below patch (not tested yet)?

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index d49d661ec0a7..8c966b5b23cb 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -84,7 +84,7 @@ static u64 pebs_data_source[] = {
 	OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, NONE),  /* 0x04: L3 hit */
 	OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, MISS),  /* 0x05: L3 hit, snoop miss */
 	OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HIT),   /* 0x06: L3 hit, snoop hit */
-	OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM),  /* 0x07: L3 hit, snoop hitm */
+	OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM) | P(SNOOPX, FWD),  /* 0x07: L3 hit, snoop hitm & fwd */
 	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
 	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
 	OP_LH | P(LVL, LOC_RAM)  | LEVEL(RAM) | P(SNOOP, HIT),       /* 0x0a: L3 miss, shared */

>>>
>>>
>>> I think https://perfmon-events.intel.com/tigerLake.html is also
>>> outdated, or at least it uses ambiguous grammar: The
>>> MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD event (EventSel=D2H UMask=04H) is
>>> documented as "Counts retired load instructions where a cross-core
>>> snoop hit in another cores caches on this socket, the data was
>>> forwarded back to the requesting core as the data was modified
>>> (SNOOP_HITM) or the L3 did not have the data(SNOOP_HIT_WITH_FWD)" -
>>> from what I understand, a "cross-core FWD" should be a case where the
>>> L3 does have the data, unless L3 has become non-inclusive on Ice Lake?
>>>

For the event, the BriefDescription in the event list json file gives a
more accurate description.

"BriefDescription": "Snoop hit a modified(HITM) or clean line(HIT_W_FWD) in another on-pkg core which forwarded the data back due to a retired load instruction.",
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/pmu-events/arch/x86/tigerlake/cache.json#n286

Thanks,
Kan

>>> On a Tiger Lake CPU, I can see this event trigger for the
>>> sys_call_table, which is located in the rodata region and probably
>>> shouldn't be containing Modified cache lines:
>>>
>>> # grep -A1 -w sys_call_table /proc/kallsyms
>>> ffffffff82800280 D sys_call_table
>>> ffffffff82801100 d vdso_mapping
>>> # perf record -e mem_load_l3_hit_retired.xsnp_fwd:ppp --all-kernel -c 100 --data
>>> ^C[ perf record: Woken up 11 times to write data ]
>>> [ perf record: Captured and wrote 22.851 MB perf.data (43176 samples) ]
>>> # perf script -F event,ip,sym,addr | egrep --color 'ffffffff828002[89abcdef]'
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff82800280
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff828002d8
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff82800280
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff828002b8
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff828002b8
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff828002b8
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff82800280
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff82800288
>>> ffffffff82526275 do_syscall_64
>>> mem_load_l3_hit_retired.xsnp_fwd:ppp: ffffffff828002b8
>>> ffffffff82526275 do_syscall_64
>>>
>>>
>>> (For what it's worth, there is a thread on LKML where "cross-core FWD"
>>> got mentioned: )
>>
>> +others better qualified than me to respond.
>>
>> Hi Jann,
>>
>> I'm not overly familiar with the issue, but it appears a similar issue
>> has been reported for Broadwell Xeon here:
>> https://community.intel.com/t5/Software-Tuning-Performance/Broadwell-Xeon-perf-c2c-showing-remote-HITM-but-remote-socket-is/td-p/1172120
>> I'm not sure that thread will be particularly useful, but having the
>> Intel people better qualified than me to answer is probably the better
>> service of this email.
>>
>> Thanks,
>> Ian
>
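
To make the effect of tagging 0x07 with both HITM and FWD concrete,
here is a rough userspace sketch of how a consumer of the sample's
data_src word could tell an ambiguous "HITM or clean forward" sample
apart from a plain HITM one. It only uses union perf_mem_data_src and
the PERF_MEM_SNOOP_HITM / PERF_MEM_SNOOPX_FWD constants from the uapi
header. The helper names are made up for illustration, and this is an
assumption about how a tool could use the extra bit, not a description
of what perf c2c does today.

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/perf_event.h>	/* union perf_mem_data_src, PERF_MEM_* */

/*
 * Hypothetical helper: the snoop-related bits that a table entry of the
 * form P(SNOOP, HITM) | P(SNOOPX, FWD) would contribute to data_src.
 * Level and opcode bits are left out to keep the sketch short.
 */
static uint64_t encoding_07_snoop_bits(void)
{
	union perf_mem_data_src d = { .val = 0 };

	d.mem_snoop  = PERF_MEM_SNOOP_HITM;
	d.mem_snoopx = PERF_MEM_SNOOPX_FWD;
	return d.val;
}

/*
 * Hypothetical helper: true when the sample carries both HITM and FWD,
 * i.e. the hardware could not say whether the forwarded line was
 * modified or clean.
 */
static bool snoop_is_ambiguous_hitm(uint64_t data_src)
{
	union perf_mem_data_src d = { .val = data_src };

	return (d.mem_snoop & PERF_MEM_SNOOP_HITM) &&
	       (d.mem_snoopx & PERF_MEM_SNOOPX_FWD);
}

/* Hypothetical helper: true for HITM without the FWD qualifier. */
static bool snoop_is_plain_hitm(uint64_t data_src)
{
	union perf_mem_data_src d = { .val = data_src };

	return (d.mem_snoop & PERF_MEM_SNOOP_HITM) &&
	       !(d.mem_snoopx & PERF_MEM_SNOOPX_FWD);
}
```

With something like this, a report could count the ambiguous samples
separately instead of attributing all of them to false sharing.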