From: Andy Lutomirski
Date: Tue, 11 Dec 2018 18:24:59 -0800
Subject: Re: Can we drop upstream Linux x32 support?
To: tg@mirbsd.de
Cc: Andrew Lutomirski, Linus Torvalds, X86 ML, LKML, Linux API, "H. Peter Anvin", Peter Zijlstra, Borislav Petkov, Florian Weimer, Mike Frysinger, "H. J. Lu", Rich Felker, x32@buildd.debian.org, Arnd Bergmann, Will Deacon, Catalin Marinas
X-Mailing-List: linux-kernel@vger.kernel.org

> On Dec 11, 2018, at 3:35 PM, Thorsten Glaser wrote:
>
> Andy Lutomirski dixit:
>
>> What happens if someone adds a struct like:
>>
>> struct nasty_on_x32 {
>>     __kernel_long_t a;
>>     void * __user b;
>> };
>>
>> On x86_64, that's two 8-byte fields. On x86_32, it's two four-byte
>> fields. On x32, it's an 8-byte field and a 4-byte field. Now what?
>
> Yes, that’s indeed ugly. I understand. But don’t we already have
> this problem with architectures which support multiple ABIs at the
> same time? An amd64 kernel with i386 userspace comes to mind, or
> the multiple MIPS ABIs.

That’s the thing, though: the whole generic kernel compat
infrastructure assumes there are at most two ABIs: native and, if
enabled and relevant, compat. x32 breaks this entirely.

>
>> I'm sure we could have some magic gcc plugin or other nifty tool that
>> gives us:
>>
>> copy_from_user(struct struct_name, kernel_ptr, user_ptr);
>
> Something like that might be useful. Generate call stubs, which
> then call the syscall implementation with the actual user-space
> struct contents as arguments. Hm, that might be too generic to
> be useful. Generate macros that can read from or write specific
> structures to userspace?
>
> I think something like this could solve other more general problems
> as well, so it might be “nice to have anyway”. Of course it’s work,
> and I’m not involved enough in Linux kernel programming to be able
> to usefully help with it (doing too much elsewhere already).
>
>> actually do this work. Instead we get ad hoc fixes for each syscall,
>> along the lines of preadv64v2(), which get done when somebody notices
>
> Yes, that’s absolutely ugly and ridiculous and all kinds of bad.
>
> On the other hand, from my current experience, someone (Arnd?) noticed
> all the currently existing baddies for x32 already and fixed them.
>
> New syscalls are indeed an issue, but perhaps something generating
> copyinout stubs could help. This might allow other architectures
> that could do with a new ABI but have until now feared the overhead
> as well. (IIRC, m68k could do with a new ABI that reserves a register
> for TLS, but Geert would know. At the same time, time_t and off_t could
> be bumped to 64 bit. Something like that. If changing sizes of types
> shared between kernel and user spaces is not something feared…)

Magic autogenerated stubs would be great. Difficult, too, given
unions, multiplexers, cmsg, etc. I suppose I will see how bad it
would be to split out the x32 syscall table and at least isolate the
mess to some extent.

IMO the real right solution would be to push the whole problem to
userspace: get an ILP32 system working with almost or entirely LP64
syscalls. POSIX support might have to be a bit flexible, but still.
How hard would it be to have __attribute__((ilp64)), with an optional
warning if any embedded structs are not ilp64? This plus a wrapper
to make sure that mmap puts everything below 4GB ought to do the
trick. Or something like what arm64 is proposing where the kernel ABI
has 32-bit long doesn’t seem too horrible.
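
To make the layout problem from the quoted struct concrete, here is a minimal userspace sketch, not kernel code. It spells out the three layouts of struct nasty_on_x32 with fixed-width types so they can be compared on any host. The names nasty_lp64, nasty_ia32, nasty_x32 and the two nasty_from_*() helpers are made up for illustration; only the struct and its field sizes come from the thread. The point is only that x32 needs its own conversion, distinct from both native and the usual compat one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative stand-ins only (hypothetical names): each struct spells
 * out, with fixed-width types, how the quoted struct nasty_on_x32 would
 * be laid out under one ABI.
 */
struct nasty_lp64 {     /* x86_64: 8-byte long, 8-byte pointer */
    int64_t  a;
    uint64_t b;
};

struct nasty_ia32 {     /* x86_32: 4-byte long, 4-byte pointer */
    int32_t  a;
    uint32_t b;
};

struct nasty_x32 {      /* x32: 8-byte __kernel_long_t, 4-byte pointer */
    int64_t  a;
    uint32_t b;
};

/* The pointer field differs in offset or width between the layouts. */
_Static_assert(offsetof(struct nasty_lp64, b) == 8, "lp64: b at offset 8");
_Static_assert(offsetof(struct nasty_ia32, b) == 4, "ia32: b at offset 4");
_Static_assert(offsetof(struct nasty_x32, b) == 8, "x32: b at offset 8");
_Static_assert(sizeof(((struct nasty_x32 *)0)->b) == 4, "x32: 4-byte pointer");

/* Usual compat conversion: widen both the long and the pointer. */
static inline void nasty_from_ia32(const struct nasty_ia32 *in,
                                   struct nasty_lp64 *out)
{
    assert(in && out);
    out->a = in->a;     /* int32_t -> int64_t sign-extends */
    out->b = in->b;     /* uint32_t -> uint64_t zero-extends */
}

/* Third conversion x32 would need: only the pointer is widened. */
static inline void nasty_from_x32(const struct nasty_x32 *in,
                                  struct nasty_lp64 *out)
{
    assert(in && out);
    out->a = in->a;     /* already 8 bytes, copied as-is */
    out->b = in->b;     /* uint32_t -> uint64_t zero-extends */
}
```

Neither helper can stand in for the other: feeding an x32 buffer to the ia32 path would read the low half of `a` as the long and the high half as the pointer. That is the sense in which a two-ABI (native plus compat) model does not cover x32.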