From: "Andrew G. Morgan"
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Cc: "Andrew G. Morgan"
Subject: [PATCH] Debug config option for including cpu thread id in fault and signal info
Date: Sat, 5 Dec 2020 22:35:05 -0800
Message-Id: <20201206063505.531798-1-morgan@kernel.org>

This option defaults to off.

I found this useful when doing some fault isolation on an
out-of-warranty PC. I found one CPU core was consistently segfaulting
and, together with isolcpus=,... I was able to breathe some life back
into that machine.

Since the si_errno value is generally hard-coded as zero for most
signals, userspace generally ignores it. Indeed, where I have tried
it, the userspace programs appear to be remarkably tolerant of this
re-purposing of the si_errno value.

Signed-off-by: Andrew G. Morgan
---
 init/Kconfig    | 16 ++++++++++++++++
 kernel/signal.c | 21 ++++++++++++++-------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 02d13ae27abb..74f1202c449a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -634,6 +634,22 @@ config PSI_DEFAULT_DISABLED
 
 endmenu # "CPU/Task time and stats accounting"
 
+config SIG_ERRNO_HAS_THREAD_INFO
+	bool "Debug thread source for faults and interrupts"
+	default n
+	help
+	  Select this to reveal to userspace, via siginfo_t.si_errno,
+	  the HW thread number associated with a fault or signal
+	  source. This feature has a number of HW debugging and
+	  performance applications. For example, if a core seems
+	  to be unstable, isolcpus= at boot can help avoid using
+	  it.
+
+	  The legacy and default value for this is mostly 0 for faults
+	  and signals, see 'man sigaction' for details. To distinguish
+	  thread 0 from this legacy, when the si_errno value holds a
+	  valid thread number, its uppermost bit is also set to 1.
+
 config CPU_ISOLATION
 	bool "CPU isolation"
 	depends on SMP || COMPILE_TEST
diff --git a/kernel/signal.c b/kernel/signal.c
index ef8f2a28d37c..523b93ec89f9 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1617,13 +1618,19 @@ send_sig(int sig, struct task_struct *p, int priv)
 }
 EXPORT_SYMBOL(send_sig);
 
+#ifdef CONFIG_SIG_ERRNO_HAS_THREAD_INFO
+#define ZERO_OR_THREAD_INDEX ((1<<(8*sizeof(int)-1))|task_cpu(current))
+#else
+#define ZERO_OR_THREAD_INDEX 0
+#endif
+
 void force_sig(int sig)
 {
 	struct kernel_siginfo info;
 
 	clear_siginfo(&info);
 	info.si_signo = sig;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = SI_KERNEL;
 	info.si_pid = 0;
 	info.si_uid = 0;
@@ -1659,7 +1666,7 @@ int force_sig_fault_to_task(int sig, int code, void __user *addr
 
 	clear_siginfo(&info);
 	info.si_signo = sig;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = code;
 	info.si_addr = addr;
 #ifdef __ARCH_SI_TRAPNO
@@ -1691,7 +1698,7 @@ int send_sig_fault(int sig, int code, void __user *addr
 
 	clear_siginfo(&info);
 	info.si_signo = sig;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = code;
 	info.si_addr = addr;
 #ifdef __ARCH_SI_TRAPNO
@@ -1712,7 +1719,7 @@ int force_sig_mceerr(int code, void __user *addr, short lsb)
 	WARN_ON((code != BUS_MCEERR_AO) && (code != BUS_MCEERR_AR));
 	clear_siginfo(&info);
 	info.si_signo = SIGBUS;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = code;
 	info.si_addr = addr;
 	info.si_addr_lsb = lsb;
@@ -1726,7 +1733,7 @@ int send_sig_mceerr(int code, void __user *addr, short lsb, struct task_struct *
 	WARN_ON((code != BUS_MCEERR_AO) && (code != BUS_MCEERR_AR));
 	clear_siginfo(&info);
 	info.si_signo = SIGBUS;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = code;
 	info.si_addr = addr;
 	info.si_addr_lsb = lsb;
@@ -1740,7 +1747,7 @@ int force_sig_bnderr(void __user *addr, void __user *lower, void __user *upper)
 
 	clear_siginfo(&info);
 	info.si_signo = SIGSEGV;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = SEGV_BNDERR;
 	info.si_addr = addr;
 	info.si_lower = lower;
@@ -1755,7 +1762,7 @@ int force_sig_pkuerr(void __user *addr, u32 pkey)
 
 	clear_siginfo(&info);
 	info.si_signo = SIGSEGV;
-	info.si_errno = 0;
+	info.si_errno = ZERO_OR_THREAD_INDEX;
 	info.si_code = SEGV_PKUERR;
 	info.si_addr = addr;
 	info.si_pkey = pkey;
-- 
2.26.2
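
For anyone wanting to try this from userspace, here is a rough sketch of how a program might decode the value. It is not part of the patch. It assumes a kernel built with CONFIG_SIG_ERRNO_HAS_THREAD_INFO=y, so that si_errno carries the top-bit flag ORed with the task_cpu() index as in the ZERO_OR_THREAD_INDEX macro above. The names SI_ERRNO_THREAD_FLAG, si_errno_to_cpu, on_fault, install_fault_handler and last_fault_cpu are made up for illustration. On an unpatched kernel si_errno is normally 0 for these signals and the decoder simply reports "no thread info".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* The patch builds its flag as 1<<(8*sizeof(int)-1); mirror that assumption. */
static_assert(CHAR_BIT == 8, "flag layout assumes 8-bit bytes");

/* Hypothetical name: the uppermost bit of an int-sized value. */
#define SI_ERRNO_THREAD_FLAG (1u << (sizeof(int) * CHAR_BIT - 1))

/*
 * Return true and store the CPU index in *cpu when the flag bit is set;
 * return false for the legacy value (typically 0), leaving *cpu untouched.
 */
static bool si_errno_to_cpu(int si_errno_value, unsigned int *cpu)
{
	unsigned int raw = (unsigned int)si_errno_value;

	if (!(raw & SI_ERRNO_THREAD_FLAG))
		return false;
	*cpu = raw & ~SI_ERRNO_THREAD_FLAG;
	return true;
}

/* CPU index seen by the most recent handled signal, or -1 if none. */
static volatile sig_atomic_t last_fault_cpu = -1;

/* SA_SIGINFO handler: only records the CPU index, nothing else. */
static void on_fault(int sig, siginfo_t *info, void *ucontext)
{
	unsigned int cpu;

	(void)sig;
	(void)ucontext;
	if (si_errno_to_cpu(info->si_errno, &cpu))
		last_fault_cpu = (sig_atomic_t)cpu;
}

/* Install on_fault for one signal number; returns 0 on success. */
static int install_fault_handler(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO;
	sa.sa_sigaction = on_fault;
	return sigaction(sig, &sa, NULL);
}
```

A test program could call install_fault_handler(SIGSEGV) early in main() and log last_fault_cpu before exiting. A real SIGSEGV handler must not simply return to the faulting instruction, so this is only a starting point. The handler stores into a sig_atomic_t and does nothing else to stay async-signal-safe.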