From: Andy Lutomirski
Date: Mon, 10 Dec 2018 21:35:02 -0800
Subject: Re: Can we drop upstream Linux x32 support?
To: "H. J. Lu"
Cc: Andrew Lutomirski, X86 ML, LKML, Linux API, "H. Peter Anvin", Peter Zijlstra, Borislav Petkov, Florian Weimer, Mike Frysinger, Rich Felker, x32@buildd.debian.org, Arnd Bergmann, Will Deacon, Catalin Marinas, Linus Torvalds
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 10, 2018 at 7:15 PM H.J. Lu wrote:
>
> On Mon, Dec 10, 2018 at 5:23 PM Andy Lutomirski wrote:
> >
> > Hi all-
> >
> > I'm seriously considering sending a patch to remove x32 support from
> > upstream Linux. Here are some problems with it:
> >
> > 1. It's not entirely clear that it has users. As far as I know, it's
> > supported on Gentoo and Debian, and the Debian popcon graph for x32
> > has been falling off dramatically. I don't think that any enterprise
> > distro has ever supported x32.
>
> I have been posting x32 GCC results for years:
>
> https://gcc.gnu.org/ml/gcc-testresults/2018-12/msg01358.html

Right. My question wasn't whether x32 had developers -- it was
whether it had users. If the only users are a small handful of people
who keep the toolchain working and some people who benchmark it, then
I think the case for keeping it in upstream Linux is a bit weak.

>
> > 2. The way that system calls work is very strange. Most syscalls on
> > x32 enter through their *native* (i.e. not COMPAT_SYSCALL_DEFINE)
> > entry point, and this is intentional. For example, adjtimex() uses
> > the native entry, not the compat entry, because x32's struct timex
> > matches the x86_64 layout. But a handful of syscalls have separate
>
> This becomes less an issue with 64-bit time_t.
>
> > entry points -- these are the syscalls starting at 512.
> > These enter through the COMPAT_SYSCALL_DEFINE entry points.
> >
> > The x32 syscalls that are *not* in the 512 range violate all semblance
> > of kernel syscall convention. In the syscall handlers,
> > in_compat_syscall() returns true, but the COMPAT_SYSCALL_DEFINE entry
> > is not invoked. This is nutty and risks breaking things when people
> > refactor their syscall implementations. And no one tests these
> > things. Similarly, if someone calls any of the syscalls below 512 but
> > sets bit 31 in RAX, then the native entry will be called with
> > in_compat_set().
> >
> > Conversely, if you call a syscall in the 512 range with bit 31
> > *clear*, then the compat entry is set with in_compat_syscall()
> > *clear*. This is also nutty.
>
> This is to share syscalls between LP64 and ILP32 (x32) in x86-64 kernel.
>

I tried to understand what's going on. As far as I can tell, most of
the magic is the fact that __kernel_long_t and __kernel_ulong_t are
64-bit as seen by x32 user code. This means that a decent number of
uapi structures are the same on x32 and x86_64. Syscalls that only
use structures like this should route to the x86_64 entry points. But
the implementation is still highly dubious -- in_compat_syscall() will
be *true* in such system calls, which means that, if someone changes:

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  /* x32 goes here, but it's entirely non-obvious unless you read the
     x86 syscall table */
  native impl;
}

COMPAT_SYSCALL_DEFINE1(some_func, struct compat_some_struct __user *, ptr)
{
  compat impl;
}

to the Obviously Equivalent (tm):

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  struct some_struct kernel_val;
  if (in_compat_syscall()) {
    get_compat_some_struct(&kernel_val, ptr);
  } else {
    copy_from_user(&kernel_val, ptr, sizeof(struct some_struct));
  }
  do the work;
}

then x32 breaks.
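
The breakage can be sketched in user space. This is a toy model, not kernel code. Every name in it (enum abi, model_in_compat_syscall, refactored_some_func, native_layout_survives) and both struct layouts are hypothetical stand-ins. The only assumption it encodes is the one described above: an x32 caller passes the native struct layout, yet the compat check returns true for it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the three user ABIs an x86-64 kernel can serve. */
enum abi { ABI_X86_64, ABI_I386, ABI_X32 };

/* Native layout: the field is 64 bits wide. */
struct some_struct { int64_t value; };

/* Compat (i386) layout: the same field is only 32 bits wide. */
struct compat_some_struct { int32_t value; };

/* Stand-in for in_compat_syscall(): true for i386 and, as described
 * in the text, also true for x32. */
int model_in_compat_syscall(enum abi abi)
{
    return abi != ABI_X86_64;
}

/* Stand-in for the refactored handler: it picks the layout to copy
 * from the compat check alone. */
int64_t refactored_some_func(enum abi abi, const void *user_ptr)
{
    struct some_struct kernel_val;

    if (model_in_compat_syscall(abi)) {
        struct compat_some_struct c;
        memcpy(&c, user_ptr, sizeof(c));   /* reads only 4 of the 8 bytes */
        kernel_val.value = c.value;
    } else {
        memcpy(&kernel_val, user_ptr, sizeof(kernel_val));
    }
    return kernel_val.value;
}

/* Returns 1 if a caller on `abi` that passes the native layout gets its
 * value back intact, 0 if the handler misreads it. */
int native_layout_survives(enum abi abi)
{
    struct some_struct arg = { .value = INT64_C(0x100000002) };

    return refactored_some_func(abi, &arg) == arg.value;
}
```

In this model native_layout_survives(ABI_X86_64) returns 1 and native_layout_survives(ABI_X32) returns 0: the x32 caller handed over the 64-bit layout, but the handler took the compat branch and read only half of it.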

And I don't even know how x32 is supposed to support some hypothetical
syscall like this:

long sys_nasty(struct adjtimex *a, struct iovec *b);

where one argument has x32 and x86_64 matching but the other has x32
and x86_32 matching.

This whole thing seems extremely fragile.
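
A rough user-space sketch of why a syscall shaped like sys_nasty() is awkward. The structs below are simplified, hypothetical stand-ins written with fixed-width types, not the real uapi definitions. They assume only what the text states: a struct built from __kernel_long_t fields looks the same on x32 and x86_64, while a struct built from a pointer and a size_t looks the same on x32 and 32-bit x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three user ABIs. */
enum abi { ABI_X86_64, ABI_I386, ABI_X32 };

/* iovec-like argument: a pointer plus a size_t, so its layout follows
 * the caller's pointer width. */
struct iovec_lp64  { uint64_t iov_base; uint64_t iov_len; };
struct iovec_ilp32 { uint32_t iov_base; uint32_t iov_len; };

/* timex-like argument, heavily simplified: fields typed as
 * __kernel_long_t, which is 64-bit for x86_64 and for x32. */
struct klongs_64 { int64_t a; int64_t b; };
struct klongs_32 { int32_t a; int32_t b; };

/* Size of the iovec-like argument as each ABI's user code sees it. */
size_t iovec_like_size(enum abi abi)
{
    return abi == ABI_X86_64 ? sizeof(struct iovec_lp64)
                             : sizeof(struct iovec_ilp32);
}

/* Size of the timex-like argument as each ABI's user code sees it. */
size_t klong_struct_size(enum abi abi)
{
    return abi == ABI_I386 ? sizeof(struct klongs_32)
                           : sizeof(struct klongs_64);
}

/* Returns 1 if a single native-vs-compat decision selects the right
 * layout for both arguments, 0 if the two arguments disagree. */
int one_flag_is_enough(enum abi abi)
{
    int first_is_native  = klong_struct_size(abi) == klong_struct_size(ABI_X86_64);
    int second_is_native = iovec_like_size(abi) == iovec_like_size(ABI_X86_64);

    return first_is_native == second_is_native;
}
```

Under these assumptions one_flag_is_enough() returns 1 for ABI_X86_64 and ABI_I386 but 0 for ABI_X32, where the first argument wants the native path and the second wants the compat path. Neither a native entry point nor a compat entry point alone handles both.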