Date: Fri, 18 Jan 2019 11:16:08 +1100
From: Dave Chinner
To: Jerome Glisse
Cc: John Hubbard, Jan Kara, Matthew Wilcox, Dan Williams, John Hubbard,
 Andrew Morton, Linux MM, tom@talpey.com, Al Viro, benve@cisco.com,
 Christoph Hellwig, Christopher Lameter, "Dalessandro, Dennis",
 Doug Ledford, Jason Gunthorpe, Michal Hocko,
 mike.marciniszyn@intel.com, rcampbell@nvidia.com,
 Linux Kernel Mailing List, linux-fsdevel
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Message-ID: <20190118001608.GX4205@dastard>
References: <294bdcfa-5bf9-9c09-9d43-875e8375e264@nvidia.com>
 <20190112024625.GB5059@redhat.com>
 <20190114145447.GJ13316@quack2.suse.cz>
 <20190114172124.GA3702@redhat.com>
 <20190115080759.GC29524@quack2.suse.cz>
 <20190116113819.GD26069@quack2.suse.cz>
 <20190116130813.GA3617@redhat.com>
 <5c6dc6ed-4c8d-bce7-df02-ee8b7785b265@nvidia.com>
 <20190117152108.GB3550@redhat.com>
In-Reply-To: <20190117152108.GB3550@redhat.com>

On Thu, Jan 17, 2019 at 10:21:08AM -0500, Jerome Glisse wrote:
> On Wed, Jan 16, 2019 at 09:42:25PM -0800, John Hubbard wrote:
> > On 1/16/19 5:08 AM, Jerome Glisse wrote:
> > > On Wed, Jan 16, 2019 at 12:38:19PM +0100, Jan Kara wrote:
> > >> That actually touches on another question I wanted to get opinions on. GUP
> > >> can be for read and GUP can be for write (that is one of GUP flags).
> > >> Filesystems with page cache generally have issues only with GUP for write
> > >> as it can currently corrupt data, unexpectedly dirty page etc.. DAX & memory
> > >> hotplug have issues with both (DAX cannot truncate page pinned in any way,
> > >> memory hotplug will just loop in kernel until the page gets unpinned). So
> > >> we probably want to track both types of GUP pins and page-cache based
> > >> filesystems will take the hit even if they don't have to for read-pins?
> > >
> > > Yes the distinction between read and write would be nice. With the map
> > > count solution you can only increment the mapcount for GUP(write=true).
> > > With pin bias the issue is that a big number of read pin can trigger
> > > false positive ie you would do:
> > >     GUP(vaddr, write)
> > >         ...
> > >         if (write)
> > >             atomic_add(page->refcount, PAGE_PIN_BIAS)
> > >         else
> > >             atomic_inc(page->refcount)
> > >
> > >     PUP(page, write)
> > >         if (write)
> > >             atomic_add(page->refcount, -PAGE_PIN_BIAS)
> > >         else
> > >             atomic_dec(page->refcount)
> > >
> > > I am guessing false positive because of too many read GUP is ok as
> > > it should be unlikely and when it happens then we take the hit.
> > >
> >
> > I'm also intrigued by the point that read-only GUP is harmless, and we
> > could just focus on the writeable case.
>
> For filesystem anybody that just look at the page is fine, as it would
> not change its content thus the page would stay stable.

Other processes can access and dirty the page cache page while there
is a GUP reference. It's unclear to me whether that changes what
GUP needs to do here, but we can't assume a page referenced for
read-only GUP will be clean and unchanging for the duration of the
GUP reference. It may even be dirty at the time of the read-only
GUP pin...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
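
For readers following the pin-bias pseudocode quoted above, here is a
minimal user-space sketch of that counting scheme. It is a model built
on C11 atomics, not kernel code: struct page_model, gup_model(),
pup_model(), maybe_write_pinned() and the value chosen for
PAGE_PIN_BIAS are all hypothetical names and numbers picked for
illustration. It only shows why enough read pins can look like a write
pin; it says nothing about whether the page stays clean, which is the
separate concern raised in the reply.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bias value; the thread does not specify one. */
#define PAGE_PIN_BIAS 1024

static_assert(PAGE_PIN_BIAS > 1, "bias must exceed a single reference");

/* Stand-in for the refcount field of struct page (model only). */
struct page_model {
	atomic_int refcount;
};

/* Model of GUP: a write pin adds the bias, a read pin adds one. */
void gup_model(struct page_model *page, bool write)
{
	atomic_fetch_add(&page->refcount, write ? PAGE_PIN_BIAS : 1);
}

/* Model of PUP: undo whatever the matching gup_model() call added. */
void pup_model(struct page_model *page, bool write)
{
	atomic_fetch_sub(&page->refcount, write ? PAGE_PIN_BIAS : 1);
}

/*
 * Heuristic check: a count at or above the bias is treated as "may be
 * pinned for write". This is the source of the false positive, since
 * PAGE_PIN_BIAS read pins are indistinguishable from one write pin.
 */
bool maybe_write_pinned(struct page_model *page)
{
	return atomic_load(&page->refcount) >= PAGE_PIN_BIAS;
}
```

Starting from a refcount of 1, a single read pin leaves
maybe_write_pinned() false, a write pin makes it true, and
PAGE_PIN_BIAS read pins with no write pin at all also make it true,
which is the false positive described in the quoted text.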