Date: Mon, 1 Oct 2018 05:47:57 -0700
From: Christoph Hellwig
To: Dave Chinner
Cc: Jerome Glisse, John Hubbard, john.hubbard@gmail.com, Matthew Wilcox,
 Michal Hocko, Christopher Lameter, Jason Gunthorpe, Dan Williams,
 Jan Kara, Al Viro, linux-mm@kvack.org, LKML, linux-rdma,
 linux-fsdevel@vger.kernel.org, Christian Benvenuti,
 Dennis Dalessandro, Doug Ledford, Mike Marciniszyn
Subject: Re: [PATCH 0/4] get_user_pages*() and RDMA: first steps
Message-ID: <20181001124757.GA26218@infradead.org>
References: <20180928053949.5381-1-jhubbard@nvidia.com>
 <20180928152958.GA3321@redhat.com>
 <4c884529-e2ff-3808-9763-eb0e71f5a616@nvidia.com>
 <20180928214934.GA3265@redhat.com>
 <20180929084608.GA3188@redhat.com>
 <20181001061127.GQ31060@dastard>
In-Reply-To: <20181001061127.GQ31060@dastard>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 01, 2018 at 04:11:27PM +1000, Dave Chinner wrote:
> This reminds me so much of Linux mmap() in the mid-2000s - mmap()
> worked for ext3 without being aware of page faults,

And "worked" still is a bit of a stretch, as soon as you'd get ENOSPC
it would still blow up badly.  Which probably makes it an even better
analogy to the current case.

> RDMA does not call ->page_mkwrite on clean file backed pages before it
> writes to them and calls set_page_dirty(), and hence RDMA to file
> backed pages is completely unreliable. I'm not sure this can be
> solved without having page fault capable RDMA hardware....

We can always software prefault at gup time.  And also remember that
while RDMA might be the case at least some people care about here it
really isn't different from any of the other gup + I/O cases, including
doing direct I/O to a mmap area.
The only difference in the various cases is how long the area should be
pinned down - some users like RDMA want a long term mapping, while
others like direct I/O just need a short transient one.

> We could address these use-after-free situations via forcing RDMA to
> use file layout leases and revoke the lease when we need to modify
> the backing store on leased files. However, this doesn't solve the
> need for filesystems to receive write fault notifications via
> ->page_mkwrite.

Exactly.  We need three things here:

 - notification to the filesystem that a page is (possibly) being
   written to
 - a way to block fs operations while the pages are pinned
 - a way to distinguish between short and long term mappings, and only
   allow long term mappings if they can be broken using something like
   leases

I'm also pretty sure we already explained this a long time ago when the
issue came up last year, so I'm not sure why this is even still
contentious.
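
As a rough illustration of the three requirements listed above, here is
a small user-space toy model in Python.  It is only a sketch and not
kernel code: every name in it (PinKind, PinError, Filesystem, pin_page,
unpin_page, truncate, the has_lease flag) is made up for this example,
and a real implementation would sleep where this one returns False.

```python
from dataclasses import dataclass, field
from enum import Enum


class PinKind(Enum):
    SHORT = "short"  # transient pin, e.g. direct I/O
    LONG = "long"    # long term pin, e.g. RDMA


class PinError(Exception):
    """Raised when a pin request cannot be granted."""


@dataclass
class Filesystem:
    """Toy stand-in for a filesystem; all names here are hypothetical."""
    write_notified: set = field(default_factory=set)  # pages we were told about
    pins: dict = field(default_factory=dict)          # page -> list of PinKind

    def notify_write(self, page):
        # Requirement 1: the filesystem is told the page may be written to.
        self.write_notified.add(page)

    def pin_page(self, page, kind, has_lease=False):
        # Requirement 3: long term pins are only granted together with
        # something revocable, modelled here as a plain boolean lease flag.
        if kind is PinKind.LONG and not has_lease:
            raise PinError("long term pin needs a revocable lease")
        # Software prefault at pin time: notify before handing out the pin.
        self.notify_write(page)
        self.pins.setdefault(page, []).append(kind)

    def unpin_page(self, page, kind):
        # Drop one pin of the given kind; forget the page once unpinned.
        self.pins[page].remove(kind)
        if not self.pins[page]:
            del self.pins[page]

    def truncate(self, page):
        # Requirement 2: a fs operation must not proceed on a pinned page.
        # Long term pins are dropped by breaking their lease; short pins
        # are waited for (modelled as returning False, i.e. "would block").
        remaining = [k for k in self.pins.get(page, []) if k is PinKind.SHORT]
        if remaining:
            self.pins[page] = remaining
            return False
        self.pins.pop(page, None)
        self.write_notified.discard(page)
        return True
```

In this model a truncate() on a page with only a short pin reports that
it would block until unpin_page() is called, a long pin without a lease
is refused outright, and a long pin with a lease is simply revoked when
truncate() runs.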