Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp3431892imw; Mon, 11 Jul 2022 08:31:39 -0700 (PDT) X-Google-Smtp-Source: AGRyM1vAz/CYTY+2vlQiYDGc+7UewOsTOJ4bXDEqd0HxmPXIZ8BA6S9nPBg6NQ12K0JBksZYfZIM X-Received: by 2002:aa7:88c3:0:b0:52a:d6ee:eb5d with SMTP id k3-20020aa788c3000000b0052ad6eeeb5dmr3222198pff.63.1657553499589; Mon, 11 Jul 2022 08:31:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657553499; cv=none; d=google.com; s=arc-20160816; b=ROn1EMW6yr/nLpJXP+EoT5iR+h9zvRnNHjfUw3J0r5/TG6He8s7DXmpe19pl11gDS1 XN6SYgi3QSgqlE3ffO4mN/6JbudzfWQp3hD1b8d1zocUcc6pPcBBcDX94DboMGIXszJ2 7kypvWsM/mwUr0DwB93rpTPdmUvJFhPWE4yb6yuT9SRDbJIfVxNsVhRHlTE4whknloI7 4Tue7gGUJcx6L/PsIlIQaT0QwGksZCeVpEYdUAJPMbZJhkAL4S+wj5BCnz568r1XuZko zAcjsA4ajdUQgrRVD2N9MO1UBfu2zJoEVc7s0pr0MRwkyyGaP1qphUEL61SoaPdexjPN DGyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=HzVzGIosr2vR6HJlsjjoqTv6mcI/+dRnfwx8KfvnDUA=; b=nV0KZFbLcsq5oe9bJV3v7qtBboCbgIUGbKC7Qg/+fRFgOpIrhSOYWuYDyuZOYDjH6N P1Wf4N/RUHDR6/AstIO2Hyg/MSHymy1Nko0XyswONZU6cOqgfgk4BCW2Jo8uk51Vx7Xj JzQ/urbPPeWN4GR41FdDBrJfSfATODzumT5LwRJ60DE9F2lUgPnoNxFATT+19vdDBT5Z 8zcx/44hq+uyEt30n48y02u52PnE/qmFNcwMuAeM+g2YaSkySuPkLmXGIc1NZtN3TPkW e7VmTDgMHIJOEdpm4l8MYGNoBrUZj6mUXxQIYydcbvutUUk/+hVihn9SILQTLkawnJOh 7i0g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=XCHLK7Gk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n1-20020a632701000000b003fd97d4ed97si10901940pgn.21.2022.07.11.08.31.26; Mon, 11 Jul 2022 08:31:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=XCHLK7Gk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232270AbiGKPZk (ORCPT + 99 others); Mon, 11 Jul 2022 11:25:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231572AbiGKPZi (ORCPT ); Mon, 11 Jul 2022 11:25:38 -0400 Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D84D28731 for ; Mon, 11 Jul 2022 08:25:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657553138; x=1689089138; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=PJY0gCVqbX61mhga+/o/dwiJnok4+p1Q97+JBuljLlI=; b=XCHLK7GkI5TFmTWEZgNVOGpBg4n7tvsou3V2XEHFq8UsKdIRReE1QXuZ cdi/MrRNfkVfoqwTCkNofTgcjzfhqHjrZAk6ZZQhBhXEDjf95vs/m08Z7 LSTIbTKYqZgCYSI/tSFbRq/sdsXIiubs0dSs/rRge2ySTeKMSlRN6vE64 GV0jP5dche3hQHVXxGMbQyDNX33dtGTizMrqPp7f8PgV2e6bQD5wz9TPX /Rpaz+Jv41EQLda0dUxmvLK0fAFiKLwZ4AlvGzyatVJ44YqWDMuUbjFvL YouhjzqVp71zmlaXMuzzPpPLP/8shLlM9rCoCrPGVDpWyCXvS0ZagHgx/ A==; X-IronPort-AV: E=McAfee;i="6400,9594,10405"; a="284715463" X-IronPort-AV: E=Sophos;i="5.92,263,1650956400"; d="scan'208";a="284715463" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jul 2022 08:25:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.92,263,1650956400"; d="scan'208";a="569803689" Received: from linux.intel.com ([10.54.29.200]) by orsmga006.jf.intel.com with ESMTP; 11 Jul 2022 08:25:37 -0700 Received: from [10.252.212.229] (kliang2-mobl1.ccr.corp.intel.com [10.252.212.229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by linux.intel.com (Postfix) with ESMTPS id 12E6E580BDA; Mon, 11 Jul 2022 08:25:35 -0700 (PDT) Message-ID: <4b15d3d1-389b-fee4-d1b9-8732859e3696@linux.intel.com> Date: Mon, 11 Jul 2022 11:25:34 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.0.1 Subject: Re: [perf] unchecked MSR access error: WRMSR to 0x689 in intel_pmu_lbr_restore Content-Language: en-US To: Vince Weaver Cc: linux-kernel@vger.kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , pawan.kumar.gupta@linux.intel.com References: <66961a7d-a6d8-2536-9ed3-68c2e7f4e03d@maine.edu> From: "Liang, Kan" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-7.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-07-08 12:13 p.m., Vince Weaver wrote: > On Wed, 6 Jul 2022, Vince Weaver wrote: > >> Let the fuzzer running a long time on 5.19-rc1 and after a few weeks it >> triggered this weird trace. It is repeatable (although I haven't >> narrowed down exactly what's causing it). >> >> It's odd in that it just dumps a , it doesn't provide any info on >> what the actual trigger is. >> >> This is on a Haswell machine. > > I bumped up to current git and managed to trigger this again, this time > it actually managed to print the error message. > > [ 7763.384369] unchecked MSR access error: WRMSR to 0x689 (tried to write 0x1fffffff8101349e) at rIP: 0xffffffff810704a4 (native_write_msr+0x4/0x20) The 0x689 is a valid LBR register, which is MSR_LASTBRANCH_9_FROM_IP. The issue should be caused by the known TSX bug, which is mentioned in the commit 9fc9ddd61e0 ("perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSX"). It looks like the TSX support has been deactivated, however the quirk in the commit isn't applied for some reason. To apply the quirk, perf relies on the boot CPU's flag and LBR format. static inline bool lbr_from_signext_quirk_needed(void) { bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM); return !tsx_support && x86_pmu.lbr_has_tsx; } Could you please share the value of the PERF_CAPABILITIES MSR 0x00000345 of the machine? I'd like to double check whether the LBR fromat is correct. 0x5 is expected. If the LBR format is correct, maybe the boot CPU's flag is not cleared when the TSX support is deactivated. I noticed that Pawan recently had several TSX patches merged which may impact the flags. 400331f8ffa3 ("x86/tsx: Disable TSX development mode at boot") 258f3b8c3210 ("x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits") If you only observe the issue with the latest kernel, you may want to revert the above two patches and see if it helps. Thanks, Kan > [ 7763.397420] Call Trace: > [ 7763.399881] > [ 7763.401994] intel_pmu_lbr_restore+0x9a/0x1f0 > [ 7763.406363] intel_pmu_lbr_sched_task+0x91/0x1c0 > [ 7763.410992] __perf_event_task_sched_in+0x1cd/0x240 > [ 7763.415879] ? __perf_event_task_sched_out+0x75/0x5c0 > [ 7763.420939] finish_task_switch.isra.0+0x15b/0x2a0 > [ 7763.425740] ? __switch_to+0x112/0x430 > [ 7763.429503] __schedule+0x2cf/0x10d0 > [ 7763.433088] ? send_signal_locked+0xc8/0x130 > [ 7763.437364] schedule+0x4e/0xb0 > [ 7763.440518] do_wait+0x15b/0x2f0 > [ 7763.443757] kernel_wait4+0xa6/0x140 > [ 7763.447337] ? thread_group_exited+0x60/0x60 > [ 7763.451608] do_syscall_64+0x3b/0xc0 > [ 7763.455199] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > [ 7763.460259] RIP: 0033:0x7f9063feba26 > [ 7763.463838] Code: ff e9 0e 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 3d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 90 48 83 ec 28 89 54 24 14 48 89 74 24 > [ 7763.482587] RSP: 002b:00007ffca3875c08 EFLAGS: 00000246 ORIG_RAX: 000000000000003d > [ 7763.490161] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9063feba26 > [ 7763.497303] RDX: 0000000000000000 RSI: 00007ffca3875c1c RDI: 0000000000002214 > [ 7763.504441] RBP: 00007ffca3875c20 R08: 00007f90640f321c R09: 00007f90640f3240 > [ 7763.511577] R10: 0000000000000000 R11: 0000000000000246 R12: 000055d6ce93c4f0 > [ 7763.518718] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 7763.525850] >