From: Andy Lutomirski
To: x86@kernel.org
Cc: LKML, Dave Hansen, Alexei Starovoitov, Daniel Borkmann,
 Yonghong Song, Masami Hiramatsu, Andy Lutomirski, Peter Zijlstra
Subject: [PATCH 07/11] x86/fault: Split the OOPS code out from no_context()
Date: Sun, 31 Jan 2021 09:24:38 -0800
X-Mailer: git-send-email 2.29.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

Not all callers of no_context() want to run exception fixups.
Separate the OOPS code out from the fixup code in no_context().

Cc: Dave Hansen
Cc: Peter Zijlstra
Signed-off-by: Andy Lutomirski
---
 arch/x86/mm/fault.c | 116 +++++++++++++++++++++++---------------------
 1 file changed, 62 insertions(+), 54 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 1939e546beae..6f43d080e1e8 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -618,53 +618,20 @@ static void set_signal_archinfo(unsigned long address,
 }
 
 static noinline void
-no_context(struct pt_regs *regs, unsigned long error_code,
-	   unsigned long address, int signal, int si_code)
+page_fault_oops(struct pt_regs *regs, unsigned long error_code,
+		unsigned long address)
 {
-	struct task_struct *tsk = current;
 	unsigned long flags;
 	int sig;
 
 	if (user_mode(regs)) {
 		/*
-		 * This is an implicit supervisor-mode access from user
-		 * mode.  Bypass all the kernel-mode recovery code and just
-		 * OOPS.
+		 * Implicit kernel access from user mode?  Skip the stack
+		 * overflow and EFI special cases.
 		 */
 		goto oops;
 	}
 
-	/* Are we prepared to handle this kernel fault? */
-	if (fixup_exception(regs, X86_TRAP_PF, error_code, address)) {
-		/*
-		 * Any interrupt that takes a fault gets the fixup. This makes
-		 * the below recursive fault logic only apply to a faults from
-		 * task context.
-		 */
-		if (in_interrupt())
-			return;
-
-		/*
-		 * Per the above we're !in_interrupt(), aka. task context.
-		 *
-		 * In this case we need to make sure we're not recursively
-		 * faulting through the emulate_vsyscall() logic.
-		 */
-		if (current->thread.sig_on_uaccess_err && signal) {
-			sanitize_error_code(address, &error_code);
-
-			set_signal_archinfo(address, error_code);
-
-			/* XXX: hwpoison faults will set the wrong code. */
-			force_sig_fault(signal, si_code, (void __user *)address);
-		}
-
-		/*
-		 * Barring that, we can do the fixup and be happy.
-		 */
-		return;
-	}
-
 #ifdef CONFIG_VMAP_STACK
 	/*
 	 * Stack overflow? During boot, we can fault near the initial
@@ -672,8 +639,8 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	 * that we're in vmalloc space to avoid this.
 	 */
 	if (is_vmalloc_addr((void *)address) &&
-	    (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) ||
-	     address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) {
+	    (((unsigned long)current->stack - 1 - address < PAGE_SIZE) ||
+	     address - ((unsigned long)current->stack + THREAD_SIZE) < PAGE_SIZE)) {
 		unsigned long stack = __this_cpu_ist_top_va(DF) - sizeof(void *);
 		/*
 		 * We're likely to be running with very little stack space
@@ -696,20 +663,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	}
 #endif
 
-	/*
-	 * 32-bit:
-	 *
-	 *   Valid to do another page fault here, because if this fault
-	 *   had been triggered by is_prefetch fixup_exception would have
-	 *   handled it.
-	 *
-	 * 64-bit:
-	 *
-	 *   Hall of shame of CPU/BIOS bugs.
-	 */
-	if (is_prefetch(regs, error_code, address))
-		return;
-
 	/*
 	 * Buggy firmware could access regions which might page fault, try to
 	 * recover from such faults.
@@ -726,7 +679,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	if (task_stack_end_corrupted(tsk))
+	if (task_stack_end_corrupted(current))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	sig = SIGKILL;
@@ -739,6 +692,61 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	oops_end(flags, regs, sig);
 }
 
+static noinline void
+no_context(struct pt_regs *regs, unsigned long error_code,
+	   unsigned long address, int signal, int si_code)
+{
+	if (user_mode(regs)) {
+		/*
+		 * This is an implicit supervisor-mode access from user
+		 * mode.  Bypass all the kernel-mode recovery code and just
+		 * OOPS.
+		 */
+		goto oops;
+	}
+
+	/* Are we prepared to handle this kernel fault? */
+	if (fixup_exception(regs, X86_TRAP_PF, error_code, address)) {
+		/*
+		 * Any interrupt that takes a fault gets the fixup. This makes
+		 * the below recursive fault logic only apply to a faults from
+		 * task context.
+		 */
+		if (in_interrupt())
+			return;
+
+		/*
+		 * Per the above we're !in_interrupt(), aka. task context.
+		 *
+		 * In this case we need to make sure we're not recursively
+		 * faulting through the emulate_vsyscall() logic.
+		 */
+		if (current->thread.sig_on_uaccess_err && signal) {
+			sanitize_error_code(address, &error_code);
+
+			set_signal_archinfo(address, error_code);
+
+			/* XXX: hwpoison faults will set the wrong code. */
+			force_sig_fault(signal, si_code, (void __user *)address);
+		}
+
+		/*
+		 * Barring that, we can do the fixup and be happy.
+		 */
+		return;
+	}
+
+	/*
+	 * AMD erratum #91 manifests as a spurious page fault on a PREFETCH
+	 * instruction.
+	 */
+	if (is_prefetch(regs, error_code, address))
+		return;
+
+oops:
+	page_fault_oops(regs, error_code, address);
+}
+
 /*
  * Print out info about fatal segfaults, if the show_unhandled_signals
  * sysctl is set:
-- 
2.29.2