From: "H. Peter Anvin"
To: Ingo Molnar, Thomas Gleixner, Borislav Petkov, Andy Lutomirski
Cc: "H.
 Peter Anvin", Linux Kernel Mailing List
Subject: [RFC v2 PATCH 7/7] x86/entry: use int for syscall number; handle all invalid syscall nrs
Date: Mon, 10 May 2021 11:53:16 -0700
Message-Id: <20210510185316.3307264-8-hpa@zytor.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210510185316.3307264-1-hpa@zytor.com>
References: <20210510185316.3307264-1-hpa@zytor.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: "H. Peter Anvin (Intel)"

Redefine the system call number consistently to be "int". The value -1
is a non-system call (which can be poked in by ptrace/seccomp to
indicate that no further processing should be done and that the return
value should be the current value in regs->ax, which defaults to
-ENOSYS). Any other value which does not correspond to a valid system
call unconditionally calls sys_ni_syscall() and returns -ENOSYS, just
like any system call that corresponds to a hole in the system call
table.

These are the defined semantics of syscall_get_nr(), so that is what
all the architecture-independent code already expects. As documented in
(which is simply the documentation file for ):

/**
 * syscall_get_nr - find what system call a task is executing
 * @task:	task of interest, must be blocked
 * @regs:	task_pt_regs() of @task
 *
 * If @task is executing a system call or is at system call
 * tracing about to attempt one, returns the system call number.
 * If @task is not executing a system call, i.e. it's blocked
 * inside the kernel for a fault or signal, returns -1.
 *
 * Note this returns int even on 64-bit machines. Only 32 bits of
 * system call number can be meaningful. If the actual arch value
 * is 64 bits, this truncates to 32 bits so 0xffffffff means -1.
 *
 * It's only valid to call this when @task is known to be blocked.
 */
int syscall_get_nr(struct task_struct *task, struct pt_regs *regs);

Signed-off-by: H. Peter Anvin (Intel)
---
 arch/x86/entry/common.c        | 79 +++++++++++++++++++++++-----------
 arch/x86/entry/entry_64.S      |  2 +-
 arch/x86/include/asm/syscall.h |  2 +-
 3 files changed, 55 insertions(+), 28 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 00da0f5420de..bf1ccaf101d7 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -36,61 +36,89 @@
 #include
 
 #ifdef CONFIG_X86_64
-__visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
+
+static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
+{
+	unsigned long unr = nr;
+
+	if (likely(unr < NR_syscalls)) {
+		unr = array_index_nospec(unr, NR_syscalls);
+		regs->ax = sys_call_table[unr](regs);
+		return true;
+	}
+	return false;
+}
+
+static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
+{
+	unsigned long xnr = nr;
+
+	xnr -= __X32_SYSCALL_BIT;
+
+	if (IS_ENABLED(CONFIG_X86_X32_ABI) &&
+	    likely(xnr < X32_NR_syscalls)) {
+		xnr = array_index_nospec(xnr, X32_NR_syscalls);
+		regs->ax = x32_sys_call_table[xnr](regs);
+		return true;
+	}
+	return false;
+}
+
+__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 {
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
 
 	instrumentation_begin();
-	if (likely(nr < NR_syscalls)) {
-		nr = array_index_nospec(nr, NR_syscalls);
-		regs->ax = sys_call_table[nr](regs);
-#ifdef CONFIG_X86_X32_ABI
-	} else if (likely((nr & __X32_SYSCALL_BIT) &&
-			  (nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) {
-		nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT,
-					X32_NR_syscalls);
-		regs->ax = x32_sys_call_table[nr](regs);
-#endif
+
+	if (!do_syscall_x64(regs, nr) &&
+	    !do_syscall_x32(regs, nr) &&
+	    nr != -1) {
+		/* Invalid system call, but still a system call? */
+		regs->ax = __x64_sys_ni_syscall(regs);
 	}
+
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
 }
 #endif
 
 #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
-static __always_inline unsigned int syscall_32_enter(struct pt_regs *regs)
+static __always_inline int syscall_32_enter(struct pt_regs *regs)
 {
 	if (IS_ENABLED(CONFIG_IA32_EMULATION))
 		current_thread_info()->status |= TS_COMPAT;
 
-	return (unsigned int)regs->orig_ax;
+	return (int)regs->orig_ax;
 }
 
 /*
  * Invoke a 32-bit syscall. Called with IRQs on in CONTEXT_KERNEL.
  */
-static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
-						  unsigned int nr)
+static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
 {
-	if (likely(nr < IA32_NR_syscalls)) {
-		nr = array_index_nospec(nr, IA32_NR_syscalls);
-		regs->ax = ia32_sys_call_table[nr](regs);
+	unsigned long unr = nr;
+
+	if (likely(unr < IA32_NR_syscalls)) {
+		unr = array_index_nospec(unr, IA32_NR_syscalls);
+		regs->ax = ia32_sys_call_table[unr](regs);
+	} else if (nr != -1) {
+		regs->ax = __ia32_sys_ni_syscall(regs);
 	}
 }
 
 /* Handles int $0x80 */
 __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 
 	add_random_kstack_offset();
 	/*
-	 * Subtlety here: if ptrace pokes something larger than 2^32-1 into
-	 * orig_ax, the unsigned int return value truncates it. This may
-	 * or may not be necessary, but it matches the old asm behavior.
+	 * Subtlety here: if ptrace pokes something larger than 2^31-1 into
+	 * orig_ax, the int return value truncates it. This matches
+	 * the semantics of syscall_get_nr().
 	 */
-	nr = (unsigned int)syscall_enter_from_user_mode(regs, nr);
+	nr = syscall_enter_from_user_mode(regs, nr);
 	instrumentation_begin();
 
 	do_syscall_32_irqs_on(regs, nr);
@@ -101,7 +129,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 
 static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 	int res;
 
 	add_random_kstack_offset();
@@ -136,8 +164,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 		return false;
 	}
 
-	/* The case truncates any ptrace induced syscall nr > 2^32 -1 */
-	nr = (unsigned int)syscall_enter_from_user_mode_work(regs, nr);
+	nr = syscall_enter_from_user_mode_work(regs, nr);
 
 	/* Now this is just like a normal syscall. */
 	do_syscall_32_irqs_on(regs, nr);
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1d9db15fdc69..85f04ea0e368 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -108,7 +108,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
 
 	/* IRQs are off. */
 	movq	%rsp, %rdi
-	movq	%rax, %rsi
+	movslq	%eax, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 	/*
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index f6593cafdbd9..f7e2d82d24fb 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -159,7 +159,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 		? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
 }
 
-void do_syscall_64(struct pt_regs *regs, unsigned long nr);
+void do_syscall_64(struct pt_regs *regs, int nr);
 void do_int80_syscall_32(struct pt_regs *regs);
 long do_fast_syscall_32(struct pt_regs *regs);
 
-- 
2.31.1