Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp4918444pxy; Tue, 27 Apr 2021 16:00:37 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzAPyAH3DCLZQEJjTYSb+EhFf8kLQ1pKqjI5smA1cBEEk2QG4/xN6ZINnlGCl9lg28w3iih X-Received: by 2002:a05:6402:17b0:: with SMTP id j16mr248376edy.97.1619564437507; Tue, 27 Apr 2021 16:00:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619564437; cv=none; d=google.com; s=arc-20160816; b=GmzmS5XRCh3CVtZVGlpFKkR3MJ7/5lFb9bJ+JJtyYzlxP63JNyIAf9f5wOeKhBSUfz OOed1E3FxVZILdC6wj5gYcwy2FG0NNnV8rXGyIQWHPaTScyKNC+mfqIl0w7AZAOKJo+J vmZoPZrFD+DjO8k1QmiP1ju9UEatxwbwJc3zeCEaUbTodYcMXG2M14EQFbTy6a7RzmN6 Vc44891MEW9eD1yS+aROdP19V8gzka8brwApxnMFWv8LGJUJqq4RiCbdL+Ja809MSCpm Q+9kuhx8FQL8uKPrDSgybx0gzc+6LViJF5PbE510MhBw0oMH0luHBiSJSCLwYlqVhPwe PDmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature:dkim-filter; bh=BKXnXbtnLAZiK4ZaeAicI6A/g8pOK30QDSs2ImWpeVQ=; b=e4vnI2jZPWOkRW0ML0UvAi77XVk2r4EN8CdffZT9SruZJ7Vf6M5Z/cbufvWJ2GRZjv TdLWS0yoQY+x9yIYexHNwr7bnM9iInG79T6CNh4VozTYS5LD2VDSk9jFtlzCZ9xA7T9k KY8Ca7TA4BKNAVsBrobFuWyP8VZzZHD72K3Ey3h2YVLDbqU9TnwNDUAUD+izf/yFJ7L/ HINFBFVbJvkHETWJACtLNIErxVbhTTiD1FmUdF44W+4ROmJx3M4ZjKWyNCmd368ZiEW5 vCkPY0dWI7426An7zIvJ5r7MH99GvBLYdAJkm1mqhcjWy4IItR++ha++7MilnOzV6e3U ioMw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@zytor.com header.s=2021032801 header.b=M9JZEPiE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zytor.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id w20si1052700eje.81.2021.04.27.16.00.13; Tue, 27 Apr 2021 16:00:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@zytor.com header.s=2021032801 header.b=M9JZEPiE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zytor.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236088AbhD0W7P (ORCPT + 99 others); Tue, 27 Apr 2021 18:59:15 -0400 Received: from terminus.zytor.com ([198.137.202.136]:51513 "EHLO mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236547AbhD0W7L (ORCPT ); Tue, 27 Apr 2021 18:59:11 -0400 Received: from tazenda.hos.anvin.org ([IPv6:2601:646:8602:8be0:7285:c2ff:fefb:fd4]) (authenticated bits=0) by mail.zytor.com (8.16.1/8.15.2) with ESMTPSA id 13RMw8s9941858 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NO); Tue, 27 Apr 2021 15:58:09 -0700 DKIM-Filter: OpenDKIM Filter v2.11.0 mail.zytor.com 13RMw8s9941858 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zytor.com; s=2021032801; t=1619564293; bh=BKXnXbtnLAZiK4ZaeAicI6A/g8pOK30QDSs2ImWpeVQ=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=M9JZEPiEXKQaxG0i128nVah5nnl2dSyxxAIZ2p5b3CY7sHSS3izuaHRWmbN9nscjq bSKxdpCAr7/Kf3Kl0+fQdIiFGkHCc0u9BtKmJeXZNznsJF9bjOKDvzXcxOsMb7XqX2 hOe+uWW/UMOdGcsUIzd/vi/5j3yES46AWtvUlU1jBROXiea4567PNJWpLYoZMFqVgL NMq2J5zJlGm1UXU8D+4P9OvfHXbui5e/dqwEuSS2UJ/GtF2AnUjpobB7awcvvFRRIa ES5ti8RR6LeopN+oxl78/X8W0BVsZoLB2U25rzepimjXABo8RQGHyqoN5+CXbp11qq +xPVahT1CiYuA== Subject: Re: pt_regs->ax == -ENOSYS To: Andy Lutomirski Cc: Linus Torvalds , Ingo Molnar , Thomas Gleixner , Andrew Lutomirski , Borislav Petkov , linux-kernel@vger.kernel.org, oleg@redhat.com, Kees Cook , Will Drewry References: From: "H. Peter Anvin" Message-ID: <06a5e088-b0e6-c65e-73e6-edc740aa4256@zytor.com> Date: Tue, 27 Apr 2021 15:58:03 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/27/21 2:28 PM, Andy Lutomirski wrote: > >> On Apr 27, 2021, at 2:15 PM, H. Peter Anvin wrote: >> >> Trying to stomp out some possible cargo cult programming? >> >> In the process of going through the various entry code paths, I have to admit to being a bit confused why pt_regs->ax is set to -ENOSYS very early in the system call path. >> > > It has to get set to _something_, and copying orig_ax seems perhaps silly. There could also be code that relies on ptrace poking -1 into the nr resulting in -ENOSYS. > Yeah. I obviously ran into this working on the common entry-exit code for FRED; the frame has annoyingly different formats because of this, and I wanted to avoid slowing down the system call path. >> What is perhaps even more confusing is: >> >> __visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr) >> { >> nr = syscall_enter_from_user_mode(regs, nr); >> >> instrumentation_begin(); >> if (likely(nr < NR_syscalls)) { >> nr = array_index_nospec(nr, NR_syscalls); >> regs->ax = sys_call_table[nr](regs); >> #ifdef CONFIG_X86_X32_ABI >> } else if (likely((nr & __X32_SYSCALL_BIT) && >> (nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) { >> nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT, >> X32_NR_syscalls); >> regs->ax = x32_sys_call_table[nr](regs); >> #endif >> } >> instrumentation_end(); >> syscall_exit_to_user_mode(regs); >> } >> #endif >> >> Now, unless I'm completely out to sea, it seems to me that if syscall_enter_from_user_mode() changes the system call number to an invalid number and pt_regs->ax to !-ENOSYS then the system call will return a different value(!) depending on if it is out of range for the table (whatever was poked into pt_regs->ax) or if it corresponds to a hole in the table. This seems to me at least to be The Wrong Thing. > > I think you’re right. > >> >> Calling regs->ax = sys_ni_syscall() in an else clause would arguably be the right thing here, except possibly in the case where nr (or (int)nr, see below) == -1 or < 0. > > I think the check should be -1 for 64 bit but (u32)nr == (u32)-1 for the 32-bit path. Does that seem reasonable? I'm thinking overall that depending on 64-bit %rax is once again a mistake; I realize that the assembly code that did that kept breaking because people messed with it, but we still have: /* * Only the low 32 bits of orig_ax are meaningful, so we return int. * This importantly ignores the high bits on 64-bit, so comparisons * sign-extend the low 32 bits. */ static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { return regs->orig_ax; } "Different interpretation of the same data" is a notorious security trap. Zero-extending orig_ax would cause different behavior on 32 and 64 bits and differ from the above, so I'm thinking that just once and for all defining the system call number as a signed int for all the x86 ABIs would be the sanest. It still doesn't really answer the question if "movq $-1,%rax; syscall" or "movl $-1,%eax; syscall" could somehow cause bad things to happen, though, which makes me a little bit nervous still. -hpa