Date: Mon, 25 Mar 2019 03:27:05 -0700
From: Ira Weiny
To: Dan Williams
Cc: Andrew Morton, John Hubbard, Michal Hocko, "Kirill A. Shutemov",
 Peter Zijlstra, Jason Gunthorpe, Benjamin Herrenschmidt, Paul Mackerras,
 "David S. Miller", Martin Schwidefsky, Heiko Carstens, Rich Felker,
 Yoshinori Sato, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Ralf Baechle, James Hogan, "Aneesh Kumar K .
 V", Michal Hocko, linux-mm, Linux Kernel Mailing List,
 linux-mips@vger.kernel.org, linuxppc-dev, linux-s390, Linux-sh,
 sparclinux@vger.kernel.org, linux-rdma@vger.kernel.org,
 "netdev@vger.kernel.org"
Subject: Re: [RESEND 1/7] mm/gup: Replace get_user_pages_longterm() with FOLL_LONGTERM
Message-ID: <20190325102705.GG16366@iweiny-DESK2.sc.intel.com>
References: <20190317183438.2057-1-ira.weiny@intel.com> <20190317183438.2057-2-ira.weiny@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 22, 2019 at 02:24:40PM -0700, Dan Williams wrote:
> On Sun, Mar 17, 2019 at 7:36 PM wrote:

[snip]

> > + * __gup_longterm_locked() is a wrapper for __get_uer_pages_locked which
>
> s/uer/user/
>
> > + * allows us to process the FOLL_LONGTERM flag if present.
> > + *
> > + * FOLL_LONGTERM Checks for either DAX VMAs or PPC CMA regions and either fails
> > + * the pin or attempts to migrate the page as appropriate.
> > + *
> > + * In the filesystem-dax case mappings are subject to the lifetime enforced by
> > + * the filesystem and we need guarantees that longterm users like RDMA and V4L2
> > + * only establish mappings that have a kernel enforced revocation mechanism.
> > + *
> > + * In the CMA case pages can't be pinned in a CMA region as this would
> > + * unnecessarily fragment that region.  So CMA attempts to migrate the page
> > + * before pinning.
> >   *
> >   * "longterm" == userspace controlled elevated page count lifetime.
> >   * Contrast this to iov_iter_get_pages() usages which are transient.
>
> Ah, here's the longterm documentation, but if I was a developer
> considering whether to use FOLL_LONGTERM or not I would expect to find
> the documentation at the flag definition site.
>
> I think it has become more clear since get_user_pages_longterm() was
> initially merged that we need to warn people not to use it, or at
> least seriously reconsider whether they want an interface to support
> indefinite pins.

I will move the comment to the flag definition but...

In reviewing this comment it occurs to me that special-casing CMA regions
via FOLL_LONGTERM has made the flag less experimental/temporary; it now
simply signals to the GUP code how the pages are intended to be used.

As I'm not super familiar with the CMA use case I can't say for certain,
but it does not seem to be a temporary solution.  So I'm going to refrain
from a FIXME WRT removing the flag.

New suggested text below.

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6831077d126c..5db9d8e894aa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2596,7 +2596,28 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
 #define FOLL_COW	0x4000	/* internal GUP flag */
 #define FOLL_ANON	0x8000	/* don't do file mappings */
-#define FOLL_LONGTERM	0x10000	/* mapping is intended for a long term pin */
+#define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
+
+/*
+ * NOTE on FOLL_LONGTERM:
+ *
+ * FOLL_LONGTERM indicates that the page will be held for an indefinite time
+ * period _often_ under userspace control.  This is contrasted with
+ * iov_iter_get_pages() usages, which are transient.
+ *
+ * FIXME: For pages which are part of a filesystem, mappings are subject to the
+ * lifetime enforced by the filesystem and we need guarantees that longterm
+ * users like RDMA and V4L2 only establish mappings which coordinate usage with
+ * the filesystem.  Ideas for this coordination include revoking the longterm
+ * pin, delaying writeback, bounce buffer page writeback, etc.  As FS DAX was
+ * added after the problem with filesystems was found, FS DAX VMAs are
+ * specifically failed.  Filesystem pages are still subject to bugs and use of
+ * FOLL_LONGTERM should be avoided on those pages.
+ *
+ * In the CMA case: longterm pins in a CMA region would unnecessarily fragment
+ * that region.  And so CMA attempts to migrate the page before pinning when
+ * FOLL_LONGTERM is specified.
+ */
 
 static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
 {
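
To make the intent of the NOTE concrete, here is a rough userspace model of
the decision it describes.  This is only a sketch and not kernel code: the
enum names and model_longterm_pin() are hypothetical and exist purely for
illustration; only the FOLL_LONGTERM value is taken from the patch.  It
assumes the simplified reading that a transient pin gets no special casing,
a longterm pin on an FS DAX VMA fails, and a longterm pin on a CMA page is
preceded by a migration attempt.

```c
#include <assert.h>

#define FOLL_LONGTERM 0x10000 /* value copied from the patch above */

/* Hypothetical classification of the page being pinned (not a kernel type). */
enum page_kind { PAGE_ANON, PAGE_FS_DAX, PAGE_CMA };

/* Hypothetical outcomes of a pin request in this model. */
enum pin_result { PIN_OK, PIN_FAIL, PIN_AFTER_MIGRATE };

/* Model of the policy in the NOTE; it does not mirror the real GUP code paths. */
enum pin_result model_longterm_pin(unsigned int gup_flags, enum page_kind kind)
{
	if (!(gup_flags & FOLL_LONGTERM))
		return PIN_OK;            /* transient pin: no special casing */
	if (kind == PAGE_FS_DAX)
		return PIN_FAIL;          /* FS DAX VMAs are specifically failed */
	if (kind == PAGE_CMA)
		return PIN_AFTER_MIGRATE; /* migrate out of CMA, then pin */
	return PIN_OK;
}

/* Small self-check exercising each branch of the model. */
void model_selfcheck(void)
{
	assert(model_longterm_pin(0, PAGE_CMA) == PIN_OK);
	assert(model_longterm_pin(FOLL_LONGTERM, PAGE_ANON) == PIN_OK);
	assert(model_longterm_pin(FOLL_LONGTERM, PAGE_FS_DAX) == PIN_FAIL);
	assert(model_longterm_pin(FOLL_LONGTERM, PAGE_CMA) == PIN_AFTER_MIGRATE);
}
```

The model deliberately leaves out ordinary filesystem-backed pages: per the
FIXME they are not failed today, but FOLL_LONGTERM should still be avoided on
them.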