From: Xin Li
To: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org,
	kvm@vger.kernel.org, xen-devel@lists.xenproject.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	luto@kernel.org, pbonzini@redhat.com, seanjc@google.com,
	peterz@infradead.org, jgross@suse.com, ravi.v.shankar@intel.com,
	mhiramat@kernel.org, andrew.cooper3@citrix.com,
	jiangshanlai@gmail.com, nik.borisov@suse.com, shan.kang@intel.com
Subject: [PATCH v13 23/35] x86/fred: Add a debug fault entry stub for FRED
Date: Tue, 5 Dec 2023 02:50:12 -0800
Message-ID: <20231205105030.8698-24-xin3.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231205105030.8698-1-xin3.li@intel.com>
References: <20231205105030.8698-1-xin3.li@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: "H. Peter Anvin (Intel)"

Depending on the ring level at which it occurs, i.e., in user or kernel
context, #DB must be handled on a different stack: user #DB on the
current task stack, kernel #DB on a dedicated stack.

This is exactly how FRED event delivery invokes an exception handler:
a ring 3 event on the level 0 stack, i.e., the current task stack; a
ring 0 event on the dedicated #DB stack specified in the
IA32_FRED_STKLVLS MSR. So unlike the IDT entry stub, the FRED debug
exception entry stub does not switch stacks.

On a FRED system, the debug trap status information (DR6) is passed on
the stack, to avoid the problem of transient state. Furthermore, FRED
transitions avoid a lot of ugly corner cases the handling of which can,
and should be, skipped.

The FRED debug trap status information saved on the stack differs from
DR6 in both stickiness and polarity; it is exactly in the format which
debug_read_clear_dr6() returns for the IDT entry points.

Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---

Changes since v9:
* Disable #DB to avoid endless recursion and stack overflow when a
  watchpoint/breakpoint is set in the code path which is executed by
  the #DB handler (Thomas Gleixner).

Changes since v1:
* Call irqentry_nmi_{enter,exit}() in both the IDT and the FRED debug
  fault kernel handlers (Peter Zijlstra).
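Illustration only, not part of the patch: the user-space sketch below
models the two points made in the commit message, under stated
assumptions. The DR6_RESERVED and DR_STEP values are believed to match
the kernel's debugreg definitions, and the XOR is believed to mirror the
polarity flip done by debug_read_clear_dr6(). struct sketch_regs and
every sketch_* name are hypothetical stand-ins for pt_regs, user_mode()
and fred_event_data(); none of them exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's debugreg definitions. */
#define DR6_RESERVED UINT64_C(0xFFFF0FF0) /* bits that read as 1 when idle */
#define DR_STEP      UINT64_C(0x4000)     /* single-step (BS) status bit */

/* Hypothetical stand-in for the pt_regs fields this sketch needs. */
struct sketch_regs {
	bool from_user;      /* models user_mode(regs) */
	uint64_t event_data; /* models fred_event_data(regs) */
};

enum sketch_path { SKETCH_PATH_USER, SKETCH_PATH_KERNEL };

/* IDT-style path: raw DR6 is flipped to positive polarity by software. */
static inline uint64_t sketch_idt_dr6(uint64_t raw_dr6)
{
	return raw_dr6 ^ DR6_RESERVED;
}

/*
 * FRED-style path: the event data saved on the stack is already in the
 * positive-polarity format, so it is used unchanged. The handler is then
 * chosen by the originating ring; no stack switch is modelled because
 * FRED event delivery has already selected the stack.
 */
static inline enum sketch_path sketch_fred_debug(const struct sketch_regs *regs,
						 uint64_t *dr6)
{
	*dr6 = regs->event_data;
	return regs->from_user ? SKETCH_PATH_USER : SKETCH_PATH_KERNEL;
}

/* Self-check: both paths report a single-step trap as DR_STEP. */
static inline void sketch_selftest(void)
{
	struct sketch_regs regs = { .from_user = false, .event_data = DR_STEP };
	uint64_t dr6 = 0;

	/* Raw DR6 after a single-step trap: reserved bits plus BS. */
	assert(sketch_idt_dr6(DR6_RESERVED | DR_STEP) == DR_STEP);
	assert(sketch_fred_debug(&regs, &dr6) == SKETCH_PATH_KERNEL);
	assert(dr6 == DR_STEP);
}
```

In this model both entry flavours hand the same DR6 value to the common
handlers, which is why exc_debug_user() and exc_debug_kernel() can be
shared between the IDT and FRED entry points.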
---
 arch/x86/kernel/traps.c | 43 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c876f1d36a81..848c85208a57 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -50,6 +50,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -934,8 +935,7 @@ static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 	return false;
 }
 
-static __always_inline void exc_debug_kernel(struct pt_regs *regs,
-					     unsigned long dr6)
+static noinstr void exc_debug_kernel(struct pt_regs *regs, unsigned long dr6)
 {
 	/*
 	 * Disable breakpoints during exception handling; recursive exceptions
@@ -947,6 +947,11 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	 *
 	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
 	 * includes the entry stack is excluded for everything.
+	 *
+	 * For FRED, nested #DB should just work fine. But when a watchpoint or
+	 * breakpoint is set in the code path which is executed by #DB handler,
+	 * it results in an endless recursion and stack overflow. Thus we stay
+	 * with the IDT approach, i.e., save DR7 and disable #DB.
 	 */
 	unsigned long dr7 = local_db_save();
 	irqentry_state_t irq_state = irqentry_nmi_enter(regs);
@@ -976,7 +981,8 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	 * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
 	 * watchpoint at the same time then that will still be handled.
 	 */
-	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
+	if (!cpu_feature_enabled(X86_FEATURE_FRED) &&
+	    (dr6 & DR_STEP) && is_sysenter_singlestep(regs))
 		dr6 &= ~DR_STEP;
 
 	/*
@@ -1008,8 +1014,7 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	local_db_restore(dr7);
 }
 
-static __always_inline void exc_debug_user(struct pt_regs *regs,
-					   unsigned long dr6)
+static noinstr void exc_debug_user(struct pt_regs *regs, unsigned long dr6)
 {
 	bool icebp;
 
@@ -1093,6 +1098,34 @@ DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
 {
 	exc_debug_user(regs, debug_read_clear_dr6());
 }
+
+#ifdef CONFIG_X86_FRED
+/*
+ * When occurred on different ring level, i.e., from user or kernel
+ * context, #DB needs to be handled on different stack: User #DB on
+ * current task stack, while kernel #DB on a dedicated stack.
+ *
+ * This is exactly how FRED event delivery invokes an exception
+ * handler: ring 3 event on level 0 stack, i.e., current task stack;
+ * ring 0 event on the #DB dedicated stack specified in the
+ * IA32_FRED_STKLVLS MSR. So unlike IDT, the FRED debug exception
+ * entry stub doesn't do stack switch.
+ */
+DEFINE_FREDENTRY_DEBUG(exc_debug)
+{
+	/*
+	 * FRED #DB stores DR6 on the stack in the format which
+	 * debug_read_clear_dr6() returns for the IDT entry points.
+	 */
+	unsigned long dr6 = fred_event_data(regs);
+
+	if (user_mode(regs))
+		exc_debug_user(regs, dr6);
+	else
+		exc_debug_kernel(regs, dr6);
+}
+#endif /* CONFIG_X86_FRED */
+
 #else
 /* 32 bit does not have separate entry points. */
 DEFINE_IDTENTRY_RAW(exc_debug)
-- 
2.43.0