Date: Wed, 12 Dec 2018 15:02:30 +0000
From: Catalin Marinas
To: Andrey Konovalov
Cc: Vincenzo Frascino, Mark Rutland, Kate Stewart,
    "open list:DOCUMENTATION", Will Deacon, Kostya Serebryany,
    "open list:KERNEL SELFTEST FRAMEWORK", Chintan Pandya, Shuah Khan,
    Ingo Molnar, linux-arch, Jacob Bramley, Dmitry Vyukov,
    Evgenii Stepanov, Kees Cook, Ruben Ayrapetyan, Ramana Radhakrishnan,
    Alexander Viro, Linux ARM, Linux Memory Management List,
    Greg Kroah-Hartman, LKML, Luc Van Oostenryck, Lee Smith,
    Andrew Morton, Robin Murphy, "Kirill A. Shutemov"
Subject: Re: [RFC][PATCH 0/3] arm64 relaxed ABI
Message-ID: <20181212150230.GH65138@arrakis.emea.arm.com>
References: <20181210143044.12714-1-vincenzo.frascino@arm.com>

Hi Andrey,

On Wed, Dec 12, 2018 at 03:23:25PM +0100, Andrey Konovalov wrote:
> On Mon, Dec 10, 2018 at 3:31 PM Vincenzo Frascino wrote:
> > On arm64 the TCR_EL1.TBI0 bit has been set since Linux 3.x hence
> > the userspace (EL0) is allowed to set a non-zero value in the top
> > byte but the resulting pointers are not allowed at the user-kernel
> > syscall ABI boundary.
> >
> > This patchset proposes a relaxation of the ABI and a mechanism to
> > advertise it to the userspace via an AT_FLAGS.
> >
> > The rationale behind the choice of AT_FLAGS is that the Unix System V
> > ABI defines AT_FLAGS as "flags", leaving some degree of freedom in
> > interpretation.
> > There are two previous attempts of using AT_FLAGS in the Linux Kernel
> > for different reasons: the first was more generic and was used to expose
> > the support for the GNU STACK NX feature [1] and the second was done for
> > the MIPS architecture and was used to expose the support of "MIPS ABI
> > Extension for IEEE Std 754 Non-Compliant Interlinking" [2].
> > Both the changes are currently _not_ merged in mainline.
> > The only architecture that reserves some of the bits in AT_FLAGS is
> > currently MIPS, which introduced the concept of platform specific ABI
> > (psABI) reserving the top-byte [3].
> >
> > When ARM64_AT_FLAGS_SYSCALL_TBI is set the kernel is advertising
> > to the userspace that a relaxed ABI is supported hence this type
> > of pointers are now allowed to be passed to the syscalls when they are
> > in memory ranges obtained by anonymous mmap() or brk().
> >
> > The userspace _must_ verify that the flag is set before passing tagged
> > pointers to the syscalls allowed by this relaxation.
> >
> > More in general, exposing the ARM64_AT_FLAGS_SYSCALL_TBI flag and mandating
> > to the software to check that the feature is present, before using the
> > associated functionality, it provides a degree of control on the decision
> > of disabling such a feature in future without consequently breaking the
> > userspace.
[...]
> Acked-by: Andrey Konovalov

Thanks for the ack. However, if we go ahead with this ABI proposal it
means that your patches need to be reworked to allow a non-zero top byte
in all syscalls, including mmap() and friends and ioctl(). There are ABI
concerns in either case but I'd rather have this discussion in the open.
It doesn't necessarily mean that I endorse this proposal; I would like
feedback, and not just from kernel developers but from user space ones
too.

The summary of our internal discussions (mostly between kernel
developers) is that we can't properly describe a user ABI that covers
future syscalls or syscall extensions while not all syscalls accept
tagged pointers. So we tweaked the requirements slightly to only allow
tagged pointers back into the kernel *if* the originating address is
from an anonymous mmap() or below sbrk(0). This should cover some of the
ioctls or getsockopt(TCP_ZEROCOPY_RECEIVE) where the user passes a
pointer to a buffer obtained via mmap() on the device operations.

(sorry for not being clear on what Vincenzo's proposal implies)

-- 
Catalin
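
For readers following the thread, here is a rough user-space sketch of the check the cover letter asks for: read AT_FLAGS, test the advertised bit, and only then leave a non-zero top byte on a pointer handed to a syscall. The bit value chosen for ARM64_AT_FLAGS_SYSCALL_TBI is an assumption (the thread names the flag but does not give its value), and the helper names are hypothetical. The sketch also does not check that the address comes from an anonymous mmap() or lies below sbrk(0), which the proposal would additionally require.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval(), AT_FLAGS (glibc on Linux) */

/* ASSUMPTION: placeholder bit; the real value is not given in this thread. */
#define ARM64_AT_FLAGS_SYSCALL_TBI (1UL << 0)

#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)   /* top byte ignored under TBI0 */

/* Read the AT_FLAGS auxiliary vector entry of the current process. */
static inline unsigned long read_at_flags(void)
{
    return getauxval(AT_FLAGS);
}

/* True if the (hypothetical) relaxed-ABI bit is set in at_flags. */
static inline bool syscall_tbi_advertised(unsigned long at_flags)
{
    return (at_flags & ARM64_AT_FLAGS_SYSCALL_TBI) != 0;
}

/* Place an 8-bit tag in the top byte of a 64-bit address. */
static inline uint64_t tag_addr(uint64_t addr, uint8_t tag)
{
    return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Clear the top byte, giving the address the current ABI accepts. */
static inline uint64_t untag_addr(uint64_t addr)
{
    return addr & ~TAG_MASK;
}

/* Keep the tag only when the kernel advertises the relaxed ABI. */
static inline uint64_t addr_for_syscall(uint64_t addr, unsigned long at_flags)
{
    return syscall_tbi_advertised(at_flags) ? addr : untag_addr(addr);
}

/* Small self-check of the tag arithmetic; no syscall is made here. */
static inline void tbi_self_check(void)
{
    uint64_t p = tag_addr(0x0000ffff12345678ULL, 0x2a);
    assert(untag_addr(p) == 0x0000ffff12345678ULL);
    assert(addr_for_syscall(p, 0) == untag_addr(p));
}
```

A caller would use something like addr_for_syscall(p, read_at_flags()) just before the syscall, so that on a kernel without the flag the tag is stripped and the existing ABI is respected.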