Date: Thu, 13 Dec 2018 10:57:44 -0500
From: Rich Felker
To: Catalin Marinas
Cc: Andy Lutomirski, tg@mirbsd.de, Linus Torvalds, X86 ML, LKML,
    Linux API, "H. Peter Anvin", Peter Zijlstra, Borislav Petkov,
    Florian Weimer, Mike Frysinger, "H. J. Lu", x32@buildd.debian.org,
    Arnd Bergmann, Will Deacon
Subject: Re: Can we drop upstream Linux x32 support?
Message-ID: <20181213155744.GU23599@brightrain.aerifal.cx>
In-Reply-To: <20181213124025.bczxzj6ez34joo6v@localhost>
References: <20181212165237.GT23599@brightrain.aerifal.cx> <20181213124025.bczxzj6ez34joo6v@localhost>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 13, 2018 at 12:40:25PM +0000, Catalin Marinas wrote:
> On Wed, Dec 12, 2018 at 10:03:30AM -0800, Andy Lutomirski wrote:
> > On Wed, Dec 12, 2018 at 8:52 AM Rich Felker wrote:
> > > On Wed, Dec 12, 2018 at 08:39:53AM -0800, Andy Lutomirski wrote:
> > > > I'm proposing another alternative. Given that x32 already proves that
> > > > the user bitness model doesn't have to match the kernel model (in x32,
> > > > user "long" is 32-bit but the kernel ABI "long" is 64-bit), I'm
> > > > proposing extending this to just make the kernel ABI be LP64. So
> > > > __kernel_size_t would be 64-bit and pointers in kernel data structures
> > > > would be 64-bit. In other words, most or all of the kernel ABI would
> > > > just match x86_64.
> > > >
> > > > As far as I can tell, the only thing that really needs unusual
> > > > toolchain features here is that C doesn't have an extra-wide pointer
> > > > type. The kernel headers would need a way to say "this pointer is
> > > > still logically a pointer, and user code may assume that it's 32 bits,
> > > > but it has 8-byte alignment."
> > >
> > > None of this works on the userspace/C side, nor should any attempt be
> > > made to make it work. Types fundamentally cannot have alignments
> > > larger than their size. If you want to make the alignment of some
> > > pointers 8, you have to make their size 8, and then you just have LP64
> > > again if you did it for all pointers.
> > >
> > > If on the other hand you tried to make just some pointers "wide
> > > pointers", you'd also be completely breaking the specified API
> > > contracts of standard interfaces. For example in struct iovec's
> > > iov_base, &foo->iov_base is no longer a valid pointer to an object of
> > > type void* that you can pass to interfaces expecting void**. Sloppy
> > > misunderstandings like what you're making now are exactly why x32 is
> > > already broken and buggy (&foo->tv_nsec already has wrong type for
> > > struct timespec foo).
> >
> > I don't think it's quite that broken. For the struct iovec example,
> > we currently have:
> >
> > struct iovec {
> >     void *iov_base; /* Starting address */
> >     size_t iov_len; /* Number of bytes to transfer */
> > };
> >
> > we could have, instead: (pardon any whitespace damage)
> >
> > struct iovec {
> >     void *iov_base; /* Starting address */
> >     uint32_t __pad0;
> >     size_t iov_len; /* Number of bytes to transfer */
> >     uint32_t __pad1;
> > } __attribute__((aligned(8)));
> >
> > or the same thing but where iov_len is uint64_t. A pointer to
> > iov_base still works exactly as expected. Something would need to be
> > done to ensure that the padding is all zeroed, which might be a real
> > problem.
>
> We looked at this approach briefly for arm64/ILP32 and zeroing the pads
> was the biggest problem. User programs would not explicitly zero the pad
> and I'm not sure the compiler would be any smarter. This means it's the
> kernel's responsibility to zero the pad (around get_user,
> copy_from_user), so it doesn't actually simplify the kernel side of the
> syscall interface.
>
> If the data flow goes the other way (kernel to user), this approach
> works fine.
>
> > No one wants to actually type all the macro gunk into the headers to
> > make this work, but this type of transformation is what I have in mind
> > when the compiler is asked to handle the headers. Or there could
> > potentially be a tool that automatically consumes the uapi headers and
> > spits out modified headers like this.
>
> If the compiler can handle the zeroing, that would be great, though not
> sure how (some __attribute__((zero)) which generates a type constructor
> for such structure; it kind of departs from what the C language offers).

The compiler fundamentally can't. At the very least it would require
effective type tracking, which requires shadow memory and is even more
controversial than -fstrict-aliasing (because in a sense it's a
stronger version thereof). But even effective type tracking would not
help, since you can have things like:

    struct iovec *iov = malloc(sizeof *iov);
    scanf("%p %zu", &iov->iov_base, &iov->iov_len);

where no store to the object via the struct type ever happens and the
only stores that do happen are invisible across translation unit
boundaries. (Ignore that scanf here is awful; it's just a canonical
example of a function that would store the members via pointers to
them.)

The kernel-side approach could work if the kernel had some markup for
fields that need to be zero- or sign-extended when copied from user in
a 32-bit process and applied it at copy time. That could also fix the
existing tv_nsec issue. I'm not sure how difficult/costly it would be
though.

Rich
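
A rough sketch of the point above, under stated assumptions: uint32_t
fields stand in for 32-bit user pointers and size_t so the layout is
the same on any host, and every name here (iovec_padded, iovec_lp64,
store_members, widen_iovec, pads_left_stale) is hypothetical, not
taken from any kernel or libc header. It shows that a callee which
only receives member pointers never touches the pad words, and what a
copy-time zero-extension on the kernel side might look like.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical padded user-side layout from the quoted proposal.
 * uint32_t stands in for a 32-bit pointer and a 32-bit size_t. */
struct iovec_padded {
    uint32_t iov_base;
    uint32_t pad0;
    uint32_t iov_len;
    uint32_t pad1;
};

/* The layout an LP64 kernel ABI would expect (little-endian assumed
 * if the two structs were ever overlaid; this sketch never does). */
struct iovec_lp64 {
    uint64_t iov_base;
    uint64_t iov_len;
};

/* Stands in for scanf() or any callee in another translation unit:
 * it sees only pointers to the members, never the struct type. */
static void store_members(uint32_t *base, uint32_t *len)
{
    *base = 0x1000;
    *len = 64;
}

/* Fills the struct with junk (as malloc'd memory might contain),
 * stores the members through pointers, and reports whether the pads
 * still hold stale bytes. */
static int pads_left_stale(struct iovec_padded *iov)
{
    memset(iov, 0xAA, sizeof *iov);
    store_members(&iov->iov_base, &iov->iov_len);
    return iov->pad0 != 0 || iov->pad1 != 0;
}

/* Hypothetical copy-time fixup: zero-extend each 32-bit field into
 * the 64-bit layout and ignore whatever the pads contain. */
static struct iovec_lp64 widen_iovec(const struct iovec_padded *u)
{
    struct iovec_lp64 k = {
        .iov_base = u->iov_base,
        .iov_len  = u->iov_len,
    };
    return k;
}
```

Calling pads_left_stale() on a fresh object and then widen_iovec() on
the result gives clean 64-bit values even though the pads were never
zeroed by the user side, which is the property the copy-time markup
would have to provide.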