References:
 <20210211202208.31555-1-Sonicadvance1@gmail.com>
 <58b03e17-3729-99ea-8691-0d735a53b9bc@arm.com>
In-Reply-To: <58b03e17-3729-99ea-8691-0d735a53b9bc@arm.com>
From: Arnd Bergmann
Date: Fri, 12 Feb 2021 14:59:24 +0100
Subject: Re: [RESEND RFC PATCH v2] arm64: Exposes support for 32-bit syscalls
To: Steven Price
Cc: Ryan Houdek, "Amanieu d'Antras", Catalin Marinas, Will Deacon,
 Mark Rutland, Oleg Nesterov, Al Viro, Dave Martin, Amit Daniel Kachhap,
 Mark Brown, Marc Zyngier, David Brazdil, Jean-Philippe Brucker,
 Andrew Morton, Anshuman Khandual, Gavin Shan, Mike Rapoport,
 Vincenzo Frascino, Kristina Martsenko, Kees Cook, Sami Tolvanen,
 Frederic Weisbecker, Kevin Hao, Jason Yan, Andrey Ignatov,
 Peter Collingbourne, Julien Grall, Tian Tao, Qais Yousef, Jens Axboe,
 Linux ARM, "linux-kernel@vger.kernel.org"
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 12, 2021 at 12:33 PM Steven Price wrote:
> On 11/02/2021 20:21, sonicadvance1@gmail.com wrote:
> > The problem:
> > We need to support 32-bit processes running under a userspace
> > compatibility layer. The compatibility layer is a AArch64 process.
> > This means exposing the 32bit compatibility syscalls to userspace.
>
> I'm not sure how you come to this conclusion. Running 32-bit processes
> under a compatibility layer is a fine goal, but it's not clear why the
> entire 32-bit compat syscall layer is needed for this.
>
> As a case in point QEMU's user mode emulation already achieves this in
> many cases without any changes to the kernel.

I think it's a quantitative difference, not a qualitative one: qemu does
a nice job at translating the interfaces for many combinations of host
and target architectures at a decent speed, and is improving at both the
compatibility and the performance over time.
What both Tango and FEX promise is to be much faster by focusing on one
target architecture each, and to have better compatibility than what
qemu can do.

> > Who does this matter to?
> > Any user that has a specific need to run legacy 32-bit software under a
> > compatibility layer.
> > Not all software is open source or easy to convert to 64bit, it's
> > something we need to live with.
> > Professional software and the gaming ecosystem is rife with this.
> >
> > What applications have tried to work around this problem?
> > FEX emulator (1) - Userspace x86 to AArch64 compatibility layer
> > Tango binary translator (2) - AArch32 to AArch64 compatibility layer
> > QEmu (3) - Not really but they do some userspace ioctl emulation
>
> Can you expand on "not really"? Clearly there are limitations, but in
> general I can happily "chroot" into a distro filesystem using an
> otherwise incompatible architecture using a qemu-xxx-static binary.

The ioctl emulation in qemu is limited in multiple ways:

- It needs to duplicate the kernel's compat emulation for every single
  command it wants to handle, and will always lag behind what gets
  merged into the kernel and what drivers a particular distro ships.
- Some ioctl commands cannot be emulated in user space because the
  compat code relies on tracking device state in the kernel.
- In some cases, emulation can be expensive, both in runtime overhead
  and in code complexity.

> > What problems did they hit?
> > FEX and Tango hit problems with emulating memory related syscalls.
> > - Emulating 32-bit mmap, mremap, shmat in userspace changes behaviour
> > All three hit issues with ioctl emulation
> > - ioctls are free to do what they want including allocating memory and
> > returning opaque structures with pointers.
>
> Now I think we're getting to what the actual problems are:
>
> * mmap and friends have no (easy) way of forcing a mapping into a 32
>   bit region.
> * ioctls are a mess
>
> The first seems like a reasonable goal - I've seen examples of MAP_32BIT
> being (ab)used to do this, but it actually restricts to 31 bits and it's
> not even available on arm64. Here I think you'd be better off focusing
> on coming up with a new (generic) way of restricting the addresses that
> the kernel will pick.

I think that would be useful for other projects as well.

> ioctls are going to be a problem whatever you do, and I don't think
> there is much option other than having a list of known ioctls and
> translating them in user space - see below.

In particular for the arm32-on-arm64 ioctl case, we have a known-working
implementation in the kernel, and I don't see why we wouldn't want to
use it. The x86-32-on-anything case for FEX is trickier because it does
require handling the ia32 alignment case, and the proposed patch does
not handle that correctly for all commands. I think this would be
fixable in the kernel, but it requires a little more work.

> > This is now exposing the compat syscalls to userspace, but for the sake
> > of userspace compatibility it is a necessary evil.
>
> You've yet to convince me that it's "necessary" - I agree on the "evil"
> part ;)

I think it's much easier to argue in favor of exposing the kernel's
ioctl() emulation and a get_unmapped_area() limit set to a
process-specific address than for doing the entire syscall emulation.
The emulation for any of the other syscalls should be manageable once
ioctl can be called directly, though there are a couple that could fall
into the same category (setsockopt, sendmsg/recvmsg, fcntl).

      Arnd
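
To illustrate the ia32 alignment case mentioned above, here is a minimal
sketch in Python. It assumes a hypothetical ioctl argument made of a
32-bit id followed by a 64-bit value; the layout and the helper name
ia32_to_arm are invented for illustration and do not come from the patch
or from any driver. On i386 a 64-bit struct member only needs 4-byte
alignment, while arm32 and arm64 align it to 8 bytes, so the same C
declaration produces two different layouts and a translator has to
repack the argument before passing it on.

```python
import struct

# Hypothetical ioctl argument: a 32-bit id followed by a 64-bit value.
# "<" selects little-endian with no implicit padding, so every pad byte
# is spelled out explicitly with "x".
IA32_FMT = "<IQ"     # i386 ABI: u64 is 4-byte aligned, no padding
ARM_FMT = "<I4xQ"    # arm32/arm64 ABI: u64 is 8-byte aligned, 4 pad bytes


def ia32_to_arm(buf: bytes) -> bytes:
    """Repack an ia32-layout argument into the arm layout."""
    ident, value = struct.unpack(IA32_FMT, buf)
    return struct.pack(ARM_FMT, ident, value)


ia32_arg = struct.pack(IA32_FMT, 7, 0x1122334455667788)
arm_arg = ia32_to_arm(ia32_arg)

# The two layouts differ in size, so the buffer cannot be passed through
# unchanged.
print(struct.calcsize(IA32_FMT), struct.calcsize(ARM_FMT))  # prints "12 16"
```

Every command whose argument contains such a member needs a conversion
of this kind, which is one reason a per-command translation table in
user space is hard to keep complete.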