In-Reply-To: <20180618193158.GE6805@ziepe.ca>
References: <20180617012510.20139-3-jhubbard@nvidia.com>
 <20180617200432.krw36wrcwidb25cj@ziepe.ca>
 <311eba48-60f1-b6cc-d001-5cc3ed4d76a9@nvidia.com>
 <20180618081258.GB16991@lst.de>
 <3898ef6b-2fa0-e852-a9ac-d904b47320d5@nvidia.com>
 <20180618193158.GE6805@ziepe.ca>
From: Dan Williams
Date: Mon, 18 Jun 2018 13:04:36 -0700
Subject: Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()
To: Jason Gunthorpe
Cc: John Hubbard, Christoph Hellwig, John Hubbard, Matthew Wilcox, Michal Hocko, Christopher Lameter, Jan Kara, Linux MM, LKML, linux-rdma
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 18, 2018 at 12:31 PM, Jason Gunthorpe wrote:
> On Mon, Jun 18, 2018 at 12:21:46PM -0700, Dan Williams wrote:
>> On Mon, Jun 18, 2018 at 11:14 AM, John Hubbard wrote:
>> > On 06/18/2018 10:56 AM, Dan Williams wrote:
>> >> On Mon, Jun 18, 2018 at 10:50 AM, John Hubbard wrote:
>> >>> On 06/18/2018 01:12 AM, Christoph Hellwig wrote:
>> >>>> On Sun, Jun 17, 2018 at 01:28:18PM -0700, John Hubbard wrote:
>> >>>>> Yes.
>> >>>>> However, my thinking was: get_user_pages() can become a way to indicate that
>> >>>>> these pages are going to be treated specially. In particular, the caller
>> >>>>> does not really want or need to support certain file operations, while the
>> >>>>> page is flagged this way.
>> >>>>>
>> >>>>> If necessary, we could add a new API call.
>> >>>>
>> >>>> That API call is called get_user_pages_longterm.
>> >>>
>> >>> OK...I had the impression that this was just semi-temporary API for dax, but
>> >>> given that it's an exported symbol, I guess it really is here to stay.
>> >>
>> >> The plan is to go back and provide api changes that bypass
>> >> get_user_pages_longterm() for RDMA. However, for VFIO and others, it's
>> >> not clear what we could do. In the VFIO case the guest would need to
>> >> be prepared to handle the revocation.
>> >
>> > OK, let's see if I understand that plan correctly:
>> >
>> > 1. Change RDMA users (this could be done entirely in the various device drivers'
>> > code, unless I'm overlooking something) to use mmu notifiers, and to do their
>> > DMA to/from non-pinned pages.
>>
>> The problem with this approach is surprising the RDMA drivers with
>> notifications of teardowns. It's the RDMA userspace applications that
>> need the notification, and it likely needs to be explicit opt-in, at
>> least for the non-ODP drivers.
>
> Well, more than that, we have no real plan on how to accomplish this,
> or any idea if it can even really work.. Most userspace give up
> control of the memory lifetime to the remote side of the connection
> and have no way to recover it other than a full teardown.
>
> Given that John is trying to fix a kernel oops, I don't think we
> should tie progress on it to the RDMA notification idea.
>
> .. and given that John is trying to fix a kernel oops, maybe the
> weird/bad/ugly behavior of ftruncate is a better bug to have than for
> unprivileged users to be able to oops the kernel???

Trading one bug for another is not a fix.
We did not fix the DAX-dma-vs-truncate bug by breaking truncate() guarantees.