Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp4937410pxy; Tue, 27 Apr 2021 16:30:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy6a7Oo9P/uKIqOD/EK1IFYaZJpHp0ez9NhJgKvhJ2na5KtG16aNHSp9Muq2y9dtrLtyjpd X-Received: by 2002:a17:902:ecd2:b029:ec:9199:7fef with SMTP id a18-20020a170902ecd2b02900ec91997fefmr26440777plh.22.1619566212714; Tue, 27 Apr 2021 16:30:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619566212; cv=none; d=google.com; s=arc-20160816; b=JQWwOdQQijbytf0D7FR6Jklue4sp0q4ae434P2vKebJfkmrrtQo4fSzcyt0CCyHXWM 68ksq6Z7vWOukY3mTxampmNw9WdUQGirQ3o26n586bT6sJFvBm7C7DNTukHcGIipURCR FcLlMXl40PKbj80SAH/YuJ6iE8ZcrhZp41OVEzjl1GQ7aLNzL1IJXs0p19p2/tAQQt0d trlAvGvdojcppm8dBzMHCYt/VF+Uh25ledHiEwzsq9CRx+r4F0frNxPZrfe4UIIYl/bu BHGPjm4EVLA2DDhZnY4Mz11XYj/xjrWvVl/co3hePH6iX31x+w/UnCdHudB/vD+QPbGB rwjQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=y7HxjP/Oou+y+oWelSFeYT4BTIle82wj6/68sSsfhm8=; b=WPdj+5L8IahoId7cVyNPXMSUJuwST9Ko4Wfp74GiCpZY4ctvDuG9btl/Rq53jiryLr 1C+4edc0QHo91dTrezW13rW/7znvLdg0CDkljU1hONeY5CCZ2yAU65Btq7n6PZO58EDo OgidIu/2YxGmzRLv9Xrcgj77q+bA7/JsDRYb370i6zR0gKdlxF2ls/yhN0NFYLJIu155 gK/XiGzNnyu58yTPGwEouIhS6XrqFZ4466/GY9fQzH4FNE3QccnHmoOnLYz+wgVnPEPu z1qQ8PZMrbtvSDcKOtVv/ydx7WXNANQJVlT5BAnz6Vxfi75D5KR6IZyxlDp7TplIdfim jMTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=F++1DsRO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id z17si5267140pjn.169.2021.04.27.16.29.59; Tue, 27 Apr 2021 16:30:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=F++1DsRO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236792AbhD0X3u (ORCPT + 99 others); Tue, 27 Apr 2021 19:29:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235423AbhD0X3t (ORCPT ); Tue, 27 Apr 2021 19:29:49 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F33C3C061574 for ; Tue, 27 Apr 2021 16:29:05 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id b21so3257878plz.0 for ; Tue, 27 Apr 2021 16:29:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=y7HxjP/Oou+y+oWelSFeYT4BTIle82wj6/68sSsfhm8=; b=F++1DsROrFO4sYCWRzYEt0iAAj/m+YXqYdzFguQ9nz6Q+6IRHSwVLMd2xfurWRlHSC 99zH7zWCpCkB0F7zsnixZX0sqQCwvWGqEDg1Nrh0Jh/54ilnvqmMS/6xpy6Myi5JL5Z4 PSg3j1guF/n2UhcdvlpQfS42P/U1uUkWwj0DU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=y7HxjP/Oou+y+oWelSFeYT4BTIle82wj6/68sSsfhm8=; b=PRQuAnbM/P1P0pBkiOe38Zm39Ku5fDfZco5otuthlqNgKxpgq/0Le3z3PHCIrOoOYr oe88OIWh9xRhp5a+EztbAfLY+nEB0K7WNaQoIHYcydw3GeNWyjMtYmQiyaNbJqGq7PSt qcUe1Zu3+3KkkYlNKU/lWXwxzTwcXtVQO4+VIZ1OomVqec9uAyPV7a5AD680o2SI/uaq olXT8Qy5hZV18BuwStsL7z8e8ybMVoeWXm/7I5gx8yeflvfN+syyjKAAKh/HEu5Wr2d9 6bmc1DApvcF9ols1nyYD9IpIR2+41b/jBbloJX1Ksv85YS6yn05DY0iZ5NEByLm8/vP6 nPaw== X-Gm-Message-State: AOAM533Tj2M448CVAVZt/T4BonhxggMLiUWhwEtHoEt1U2Vow/TUbtjI /C0Ct5tJs6hD8PQsWXQGvui0TQ0q0IR8OA== X-Received: by 2002:a17:90a:6585:: with SMTP id k5mr29187452pjj.40.1619566145495; Tue, 27 Apr 2021 16:29:05 -0700 (PDT) Received: from www.outflux.net (smtp.outflux.net. [198.145.64.163]) by smtp.gmail.com with ESMTPSA id b25sm3426904pfd.7.2021.04.27.16.29.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 16:29:04 -0700 (PDT) Date: Tue, 27 Apr 2021 16:29:03 -0700 From: Kees Cook To: "H. Peter Anvin" Cc: Andy Lutomirski , Linus Torvalds , Ingo Molnar , Thomas Gleixner , Andrew Lutomirski , Borislav Petkov , linux-kernel@vger.kernel.org, oleg@redhat.com, Will Drewry Subject: Re: pt_regs->ax == -ENOSYS Message-ID: <202104271619.0DBE456@keescook> References: <06a5e088-b0e6-c65e-73e6-edc740aa4256@zytor.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <06a5e088-b0e6-c65e-73e6-edc740aa4256@zytor.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 27, 2021 at 03:58:03PM -0700, H. Peter Anvin wrote: > On 4/27/21 2:28 PM, Andy Lutomirski wrote: > > > > > On Apr 27, 2021, at 2:15 PM, H. Peter Anvin wrote: > > > > > > Trying to stomp out some possible cargo cult programming? > > > > > > In the process of going through the various entry code paths, I have to admit to being a bit confused why pt_regs->ax is set to -ENOSYS very early in the system call path. > > > > > > > It has to get set to _something_, and copying orig_ax seems perhaps silly. There could also be code that relies on ptrace poking -1 into the nr resulting in -ENOSYS. > > > > Yeah. I obviously ran into this working on the common entry-exit code for > FRED; the frame has annoyingly different formats because of this, and I > wanted to avoid slowing down the system call path. > > > > What is perhaps even more confusing is: > > > > > > __visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr) > > > { > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > instrumentation_begin(); > > > if (likely(nr < NR_syscalls)) { > > > nr = array_index_nospec(nr, NR_syscalls); > > > regs->ax = sys_call_table[nr](regs); > > > #ifdef CONFIG_X86_X32_ABI > > > } else if (likely((nr & __X32_SYSCALL_BIT) && > > > (nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) { > > > nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT, > > > X32_NR_syscalls); > > > regs->ax = x32_sys_call_table[nr](regs); > > > #endif > > > } > > > instrumentation_end(); > > > syscall_exit_to_user_mode(regs); > > > } > > > #endif > > > > > > Now, unless I'm completely out to sea, it seems to me that if syscall_enter_from_user_mode() changes the system call number to an invalid number and pt_regs->ax to !-ENOSYS then the system call will return a different value(!) depending on if it is out of range for the table (whatever was poked into pt_regs->ax) or if it corresponds to a hole in the table. This seems to me at least to be The Wrong Thing. > > > > I think you’re right. > > > > > > > > Calling regs->ax = sys_ni_syscall() in an else clause would arguably be the right thing here, except possibly in the case where nr (or (int)nr, see below) == -1 or < 0. > > > > I think the check should be -1 for 64 bit but (u32)nr == (u32)-1 for the 32-bit path. Does that seem reasonable? FWIW, there is some confusion with how syscall_trac_enter() signals the "skip syscall" condition (-1L), vs actually calling "syscall -1". Right now they're not really distinguished, and the early ENOSYS is there so that "pretend it happened" can be implemented (by either ptrace or seccomp). As in, "set return value to $whatever, and don't run a syscall". > I'm thinking overall that depending on 64-bit %rax is once again a mistake; > I realize that the assembly code that did that kept breaking because people > messed with it, but we still have: > > /* > * Only the low 32 bits of orig_ax are meaningful, so we return int. > * This importantly ignores the high bits on 64-bit, so comparisons > * sign-extend the low 32 bits. > */ > static inline int syscall_get_nr(struct task_struct *task, struct pt_regs > *regs) > { > return regs->orig_ax; > } > > "Different interpretation of the same data" is a notorious security trap. > Zero-extending orig_ax would cause different behavior on 32 and 64 bits and > differ from the above, so I'm thinking that just once and for all defining > the system call number as a signed int for all the x86 ABIs would be the > sanest. > > It still doesn't really answer the question if "movq $-1,%rax; syscall" or > "movl $-1,%eax; syscall" could somehow cause bad things to happen, though, > which makes me a little bit nervous still. I don't think this matters? What's the condition you're worried about here? The syscall table lookup is going to be safe. -- Kees Cook