Date: Fri, 8 Feb 2019 15:43:02 +1100
From: Dave Chinner
To: Christopher Lameter
Cc: Doug Ledford, Dan Williams, Jason Gunthorpe, Matthew Wilcox, Jan Kara,
    Ira Weiny, lsf-pc@lists.linux-foundation.org, linux-rdma, Linux MM,
    Linux Kernel Mailing List, John Hubbard, Jerome Glisse, Michal Hocko
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA
Message-ID: <20190208044302.GA20493@dastard>
In-Reply-To: <01000168c8e2de6b-9ab820ed-38ad-469c-b210-60fcff8ea81c-000000@email.amazonses.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 07, 2019 at 04:55:37PM +0000, Christopher Lameter wrote:
> One approach that may be a clean way to solve this:
> 3. Filesystems that allow bypass of the page cache (like XFS / DAX) will
>    provide the virtual mapping when the PIN is done and DO NO OPERATIONS
>    on the longterm pinned range until the long term pin is removed.

So, ummm, how do we do block allocation then, which is done on
demand during writes?

IOWs, this requires the application to set up the file in the
correct state for the filesystem to lock it down so somebody else
can write to it. That means the file can't be sparse, it can't be
preallocated (i.e. can't contain unwritten extents), it must have
zeroes written to its full size before being shared because
otherwise it exposes stale data to the remote client (secure sites
are going to love that!), it can't be extended, etc.

IOWs, once the file is prepped and leased out for RDMA, it becomes
immutable for the purposes of local access.

Which, essentially, we can already do. Prep the file, map it
read/write, mark it immutable, then pin it via the longterm gup
interface which can do the necessary checks.

Simple to implement, the reasons for errors trying to modify the
file are already documented and queryable, and it's hard for
applications to get wrong.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
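
For illustration only, the userspace half of the sequence described above (prep the file, map it read/write, mark it immutable) might look roughly like the sketch below. It is not part of the original mail. The helper names prep_file, map_shared and mark_immutable are hypothetical, the ioctl numbers assume 64-bit Linux, and setting the immutable flag normally needs CAP_LINUX_IMMUTABLE. The longterm pin itself is taken inside the kernel when the RDMA stack registers the mapping, so it is not shown.

```python
import fcntl
import mmap
import os
import sys

# Assumption: ioctl request numbers as encoded on 64-bit Linux.
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
FS_IMMUTABLE_FL = 0x00000010

ZERO_CHUNK = 1 << 20  # write zeroes in chunks of at most 1 MiB


def prep_file(path, size):
    """Create the file and write zeroes over its full size.

    Writing real zeroes (rather than truncating or preallocating) is meant
    to leave no holes and no unwritten extents behind. Returns an open fd.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        chunk = bytes(min(ZERO_CHUNK, size))
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so loop.
            remaining -= os.write(fd, chunk[:remaining])
        os.fsync(fd)
    except BaseException:
        os.close(fd)
        raise
    return fd


def map_shared(fd, size):
    """Map the prepared file read/write and shared."""
    return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)


def mark_immutable(fd):
    """Try to set the immutable inode flag; return True on success.

    Fails (returns False) without CAP_LINUX_IMMUTABLE or on filesystems
    that do not support the flag ioctls.
    """
    buf = bytearray(8)  # the kernel fills in a 32-bit flags word
    try:
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)
        flags = int.from_bytes(buf[:4], sys.byteorder)
        buf[:4] = (flags | FS_IMMUTABLE_FL).to_bytes(4, sys.byteorder)
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, buf)
    except OSError:
        return False
    return True
```

A caller would run prep_file(), then map_shared(), then mark_immutable(), and only then hand the mapping to the RDMA library for registration. Whether the kernel's longterm pin path would actually check for the immutable flag is the proposal under discussion here, not existing behaviour this sketch relies on.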