From: Dan Williams
Date: Wed, 6 Feb 2019 16:22:16 -0800
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA
To: Jason Gunthorpe
Cc: Doug Ledford, Dave Chinner, Christopher Lameter, Matthew Wilcox, Jan Kara, Ira Weiny, lsf-pc@lists.linux-foundation.org, linux-rdma, Linux MM, Linux Kernel Mailing List, John Hubbard, Jerome Glisse, Michal Hocko, linux-nvdimm
In-Reply-To: <20190206234132.GB15234@ziepe.ca>

On Wed, Feb 6, 2019 at 3:41 PM Jason Gunthorpe wrote:
[..]
> > You're describing the current situation, i.e. Linux already implements
> > this, it's called Device-DAX and some users of RDMA find it
> > insufficient. The choices are to continue to tell them "no", or say
> > "yes, but you need to submit to lease coordination".
>
> Device-DAX is not what I'm imagining when I say XFS--.
>
> I mean more like XFS with all features that require rellocation of
> blocks disabled.
>
> Forbidding hold punch, reflink, cow, etc, doesn't devolve back to
> device-dax.

True, not all the way, but the distinction loses significance as you
lose fs features.

Filesystems mark DAX functionality experimental [1] precisely because
it forbids otherwise typical operations that work in the nominal page
cache case. An approach that says "let's cement the list of things a
filesystem or a core-memory-management facility can't do because RDMA
finds it awkward" is bad precedent. It's bad precedent because it
abdicates core kernel functionality to userspace and weakens the API
contract in surprising ways.

EBUSY is a horrible status code, especially when an administrator
faces an emergency in which a filesystem needs to free up storage
capacity and get established memory registrations out of the way. The
motivation for the current status quo of failing memory registration
for DAX mappings is to help ensure the system does not get into a
situation where forward progress cannot be guaranteed.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2019-February/019884.html