In-Reply-To: <5714679B.3040806@zytor.com>
References: <1460940317.9121.56.camel@decadent.org.uk> <20160418004731.GB3348@decadent.org.uk> <5714679B.3040806@zytor.com>
From: Andy Lutomirski
Date: Sun, 17 Apr 2016 22:18:34 -0700
Subject: Re: [PATCH] x86/entry/x32: Check top 32 bits of syscall number on the fast path
To: "H. Peter Anvin"
Cc: Ben Hutchings, Andy Lutomirski, X86 ML, LKML

On Sun, Apr 17, 2016 at 9:50 PM, H. Peter Anvin wrote:
> On 04/17/16 17:47, Ben Hutchings wrote:
>> We've always masked off the top 32 bits when x32 is enabled, but
>> hopefully no-one relies on that. Now that the slow path is in C, we
>> check all the bits there, regardless of whether x32 is enabled. Let's
>> make the fast path consistent with it.
>
> We have always masked off the top 32 bits *period*.
>
> We have had some bugs where we haven't, because someone has tried to
> "optimize" the code and they have been quite serious. The system call
> number is an int, which means the upper 32 bits are undefined on call
> entry: we HAVE to mask them.

I'm reasonably confident that normal kernels (non-x32) have not masked
those bits since before I started hacking on the entry code.

So the type of the syscall nr is a bit confused. If there was an
installed base of programs that left garbage in the high bits, we
would have noticed *years* ago. On the other hand, the 32-bit ptrace
ABI and the seccomp ABI both think it's 32 bits.

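To make the two behaviours concrete, here is a minimal userspace sketch in C. It is not the kernel's entry code: the function names and SKETCH_NR_SYSCALLS are hypothetical and chosen only for illustration. It contrasts treating the register value as a 32-bit int (masking) with range-checking all 64 bits (failing the call when high bits are set).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table size, for illustration only; not the real NR_syscalls. */
#define SKETCH_NR_SYSCALLS 512u

/* Masking policy: treat the number as 32 bits and discard bits 63..32. */
static bool accepts_with_masking(uint64_t rax)
{
    uint32_t nr = (uint32_t)rax;    /* garbage in the upper half is ignored */
    return nr < SKETCH_NR_SYSCALLS;
}

/* Full-width policy: compare all 64 bits, so any set high bit fails. */
static bool accepts_with_full_check(uint64_t rax)
{
    return rax < SKETCH_NR_SYSCALLS;
}
```

With a value such as 0xdeadbeef00000001, the masking variant sees syscall 1 and accepts it, while the full-width variant rejects it. A program that left garbage in the high bits would therefore only work under the masking policy.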
If we were designing the x86_64 ABI and everything around it from
scratch, I'd suggest that either the high bits must be zero or that
the number actually be 64 bits (which are more or less the same
thing). That would let us use the high bits for something interesting
in the future.

In practice, we can probably still declare that the thing is a 64-bit
number, given that most kernels in the wild currently fail syscalls
that have the high bits set.

--Andy