Date: Wed, 6 Feb 2019 10:50:00 +0100
From: Jan Kara
To: Ira Weiny
Cc: lsf-pc@lists.linux-foundation.org, linux-rdma@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, John Hubbard,
    Jan Kara, Jerome Glisse, Dan Williams, Matthew Wilcox, Jason Gunthorpe,
    Dave Chinner, Doug Ledford, Michal Hocko
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA
Message-ID: <20190206095000.GA12006@quack2.suse.cz>
References: <20190205175059.GB21617@iweiny-DESK2.sc.intel.com>
In-Reply-To: <20190205175059.GB21617@iweiny-DESK2.sc.intel.com>

On Tue
05-02-19 09:50:59, Ira Weiny wrote:
> The problem: Once we have pages marked as GUP-pinned how should various
> subsystems work with those markings.
>
> The current work for John Hubbard's proposed solutions (part 1 and 2) is
> progressing.[1] But the final part (3) of his solution is also going to take
> some work.
>
> In John's presentation he lists 3 alternatives for gup-pinned pages:
>
> 1) Hold off try_to_unmap
> 2) Allow writeback while pinned (via bounce buffers)
>    [Note this will not work for DAX]

Well, but DAX does not need it because by definition there's nothing to
write back :)

> 3) Use a "revocable reservation" (or lease) on those pages
> 4) Pin the blocks as busy in the FS allocator
>
> The problem with leases on pages used by RDMA is that the references to
> these pages are not local to the machine. Once the user has been given
> access to the page they, through the use of remote tokens, give a
> reference to that page to remote nodes. This is the core essence of
> RDMA, and like it or not, something which is increasingly used by major
> Linux users.
>
> Therefore we need to discuss the extent by which leases are appropriate and
> what happens should a lease be revoked which a user does not respond to.

I don't know the RDMA hardware so this is just the opinion of a filesystem /
mm guy, but my idea of how this should work would be:

MM/FS asks for the lease to be revoked. The revoke handler agrees with the
other side on cancelling RDMA or whatever and drops the page pins. Now I
understand there can be HW / communication failures etc., in which case the
driver could either block waiting or make sure future IO will fail and drop
the pins. But under normal conditions there should be a way to revoke the
access. And if the HW/driver cannot support this, then don't let it
anywhere near a DAX filesystem.

								Honza
-- 
Jan Kara
SUSE Labs, CR
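
For illustration only, the revoke flow described in the reply could be sketched as a small userspace C model. This is not kernel code: `struct lease`, `revoke_lease()`, `submit_io()` and the `peer_*` callbacks are hypothetical names made up for the sketch. It models only the "make sure future IO will fail and drop the pins" fallback; the other option, blocking until the peer answers, is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are real kernel APIs. */
enum peer_reply { PEER_ACKED, PEER_UNREACHABLE };

struct lease {
	int  pinned_pages;                      /* pages pinned for RDMA */
	bool io_disabled;                       /* set once future IO must fail */
	enum peer_reply (*cancel_remote)(void); /* ask the other side to stop */
};

/* Drop every page pin held under the lease. */
void drop_pins(struct lease *l)
{
	assert(l != NULL);
	l->pinned_pages = 0;
}

/*
 * MM/FS asks for the lease to be revoked.
 * Returns true if the remote side agreed to cancel, false if the
 * driver had to fence off future IO because the peer did not answer.
 */
bool revoke_lease(struct lease *l)
{
	assert(l != NULL);
	if (l->cancel_remote() == PEER_ACKED) {
		drop_pins(l);           /* normal case: agreed, pins go away */
		return true;
	}
	l->io_disabled = true;          /* failure: future IO fails first... */
	drop_pins(l);                   /* ...then the pins are dropped */
	return false;
}

/* IO is refused once the lease is fenced or holds no pins. */
int submit_io(const struct lease *l)
{
	assert(l != NULL);
	return (l->io_disabled || l->pinned_pages == 0) ? -1 : 0;
}

/* Two stand-in peers: one that acknowledges, one that never answers. */
enum peer_reply peer_ok(void)   { return PEER_ACKED; }
enum peer_reply peer_dead(void) { return PEER_UNREACHABLE; }
```

In this model the pins are always gone when `revoke_lease()` returns, so MM/FS never waits indefinitely. The only difference between the two paths is whether the remote side was told first.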