Subject: Re: [PATCH v3 resend 1/2] mm: Add an F_SEAL_FUTURE_WRITE seal to memfd
From: Andy Lutomirski
Date: Fri, 9 Nov 2018 14:37:58 -0800
Cc: Jann Horn, Joel Fernandes, kernel list, John Reck, John Stultz,
    Todd Kjos, Greg Kroah-Hartman, Christoph Hellwig, Al Viro,
    Andrew Morton, Bruce Fields, Jeff Layton, Khalid Aziz,
    Lei.Yang@windriver.com, linux-fsdevel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, Linux-MM, marcandre.lureau@redhat.com,
    Mike Kravetz, Minchan Kim, Shuah Khan, valdis.kletnieks@vt.edu,
    Hugh Dickins, Linux API
References: <20181108041537.39694-1-joel@joelfernandes.org>
To: Daniel Colascione
X-Mailing-List: linux-kernel@vger.kernel.org

> On Nov 9, 2018, at 2:20 PM, Daniel Colascione wrote:
>
>> On Fri, Nov 9, 2018 at 1:06 PM, Jann Horn wrote:
>>
>> +linux-api for API addition
>> +hughd as FYI since this is somewhat related to mm/shmem
>>
>> On Fri, Nov 9, 2018 at 9:46 PM Joel Fernandes (Google)
>> wrote:
>>> Android uses ashmem for sharing memory regions. We are looking forward
>>> to migrating all usecases of ashmem to memfd so that we can possibly
>>> remove the ashmem driver in the future from staging while also
>>> benefiting from using memfd and contributing to it. Note staging drivers
>>> are also not ABI and generally can be removed at anytime.
>>>
>>> One of the main usecases Android has is the ability to create a region
>>> and mmap it as writeable, then add protection against making any
>>> "future" writes while keeping the existing already mmap'ed
>>> writeable-region active. This allows us to implement a usecase where
>>> receivers of the shared memory buffer can get a read-only view, while
>>> the sender continues to write to the buffer.
>>> See CursorWindow documentation in Android for more details:
>>> https://developer.android.com/reference/android/database/CursorWindow
>>>
>>> This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
>>> To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
>>> which prevents any future mmap and write syscalls from succeeding while
>>> keeping the existing mmap active.
>>
>> Please CC linux-api@ on patches like this. If you had done that, I
>> might have criticized your v1 patch instead of your v3 patch...
>>
>>> The following program shows the seal
>>> working in action:
>> [...]
>>> Cc: jreck@google.com
>>> Cc: john.stultz@linaro.org
>>> Cc: tkjos@google.com
>>> Cc: gregkh@linuxfoundation.org
>>> Cc: hch@infradead.org
>>> Reviewed-by: John Stultz
>>> Signed-off-by: Joel Fernandes (Google)
>>> ---
>> [...]
>>> diff --git a/mm/memfd.c b/mm/memfd.c
>>> index 2bb5e257080e..5ba9804e9515 100644
>>> --- a/mm/memfd.c
>>> +++ b/mm/memfd.c
>> [...]
>>> @@ -219,6 +220,25 @@ static int memfd_add_seals(struct file *file, unsigned int seals)
>>>  		}
>>>  	}
>>>
>>> +	if ((seals & F_SEAL_FUTURE_WRITE) &&
>>> +	    !(*file_seals & F_SEAL_FUTURE_WRITE)) {
>>> +		/*
>>> +		 * The FUTURE_WRITE seal also prevents growing and shrinking
>>> +		 * so we need them to be already set, or requested now.
>>> +		 */
>>> +		int test_seals = (seals | *file_seals) &
>>> +				 (F_SEAL_GROW | F_SEAL_SHRINK);
>>> +
>>> +		if (test_seals != (F_SEAL_GROW | F_SEAL_SHRINK)) {
>>> +			error = -EINVAL;
>>> +			goto unlock;
>>> +		}
>>> +
>>> +		spin_lock(&file->f_lock);
>>> +		file->f_mode &= ~(FMODE_WRITE | FMODE_PWRITE);
>>> +		spin_unlock(&file->f_lock);
>>> +	}
>>
>> So you're fiddling around with the file, but not the inode? How are
>> you preventing code like the following from re-opening the file as
>> writable?
>
> Good catch. That's fixable too though, isn't it, just by fiddling with
> the inode, right?

True.

>
> Another, more general fix might be to prevent /proc/pid/fd/N opens
> from "upgrading" access modes. But that'd be a bigger ABI break.

I think we should fix that, too. I consider it a bug fix, not an ABI break,
personally.

>
>> That aside: I wonder whether a better API would be something that
>> allows you to create a new readonly file descriptor, instead of
>> fiddling with the writability of an existing fd.
>
> That doesn't work, unfortunately. The ashmem API we're replacing with
> memfd requires file descriptor continuity. I also looked into opening
> a new FD and dup2(2)ing atop the old one, but this approach doesn't
> work in the case that the old FD has already leaked to some other
> context (e.g., another dup, SCM_RIGHTS). See
> https://developer.android.com/ndk/reference/group/memory. We can't
> break ASharedMemory_setProt.

Hmm. If we fix the general reopen bug, a way to drop write access from an
existing struct file would do what Android needs, right?
I don’t know if there are general VFS issues with that.
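
For concreteness, the "reopen" path discussed above can be sketched in
userspace roughly as follows. This is only an illustration, not the example
elided from the quoted mail and not part of the patch: it uses an ordinary
temporary file instead of a memfd, and the helper name
reopen_upgrades_access() is made up here. It opens a file read-only, then
opens /proc/self/fd/N with O_RDWR. Because the second open creates a new
struct file and is checked against the inode's permissions, clearing
FMODE_WRITE on the first struct file alone would not be expected to stop it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a read-only fd could be reopened read-write through
 * /proc/self/fd/N and written through, 0 if the reopen or the write
 * failed (for example if /proc is not mounted), and -1 if the setup
 * itself failed.
 */
static int reopen_upgrades_access(void)
{
	char tmpl[] = "/tmp/reopen-demo-XXXXXX";
	char path[64];
	int wfd, rofd, upfd, n, result;

	/* Create a scratch file owned by the caller (mkstemp uses 0600). */
	wfd = mkstemp(tmpl);
	if (wfd < 0)
		return -1;
	close(wfd);

	/* This descriptor stands in for the handle that lost write access. */
	rofd = open(tmpl, O_RDONLY);
	unlink(tmpl);
	if (rofd < 0)
		return -1;

	/* Sanity check: a write through the read-only fd must fail. */
	if (write(rofd, "x", 1) >= 0) {
		close(rofd);
		return -1;
	}

	/* Reopen the same inode through procfs, asking for write access. */
	n = snprintf(path, sizeof(path), "/proc/self/fd/%d", rofd);
	assert(n > 0 && (size_t)n < sizeof(path));
	upfd = open(path, O_RDWR);
	if (upfd < 0) {
		close(rofd);
		return 0;
	}

	/* Only count it as an upgrade if a write actually goes through. */
	result = (write(upfd, "x", 1) == 1) ? 1 : 0;
	close(upfd);
	close(rofd);
	return result;
}
```

Calling reopen_upgrades_access() from a small main() and printing the result
is enough to try it. A return of 1 means the access mode was "upgraded" in
the sense used in this thread; a kernel that refused such reopens would make
it return 0.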