Date: Fri, 9 Aug 2019 19:52:10 +0200
From: Michal Hocko
To: Jan Kara
Cc: John Hubbard, Vlastimil Babka, Andrew Morton, Christoph Hellwig, Ira Weiny, Jason Gunthorpe, Jerome Glisse, LKML, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Dan Williams, Daniel Black, Matthew Wilcox, Mike Kravetz
Subject: Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()
Message-ID: <20190809175210.GR18351@dhcp22.suse.cz>

On Fri 09-08-19 15:58:13, Jan Kara wrote:
> On Fri 09-08-19 10:23:07, Michal Hocko wrote:
> > On Fri 09-08-19 10:12:48, Vlastimil Babka wrote:
> > > On 8/9/19 12:59 AM, John Hubbard wrote:
> > > >>> That's true. However, I'm not sure munlocking is where the
> > > >>> put_user_page() machinery is intended to be used anyway? These are
> > > >>> short-term pins for struct page manipulation, not e.g. dirtying of page
> > > >>> contents. Reading commit fc1d8e7cca2d I don't think this case falls
> > > >>> within the reasoning there. Perhaps not all GUP users should be
> > > >>> converted to the planned separate GUP tracking, and instead we should
> > > >>> have a GUP/follow_page_mask() variant that keeps using get_page/put_page?
> > > >>>
> > > >>
> > > >> Interesting. So far, the approach has been to get all the gup callers to
> > > >> release via put_user_page(), but if we add in Jan's and Ira's vaddr_pin_pages()
> > > >> wrapper, then maybe we could leave some sites unconverted.
> > > >>
> > > >> However, in order to do so, we would have to change things so that we have
> > > >> one set of APIs (gup) that do *not* increment a pin count, and another set
> > > >> (vaddr_pin_pages) that do.
> > > >>
> > > >> Is that where we want to go...?
> > > >>
> > >
> > > We already have a FOLL_LONGTERM flag, isn't that somehow related? And if
> > > it's not exactly the same thing, perhaps a new gup flag to distinguish
> > > which kind of pinning to use?
> >
> > Agreed. This is a shining example of how forcing all existing gup users into
> > the new scheme is suboptimal at best. Not to mention the overall
> > fragility mentioned elsewhere. I dislike the conversion even more now.
> >
> > Sorry if this was already discussed, but why is the new pinning
> > not bound to FOLL_LONGTERM (ideally hidden by an interface so that users
> > do not have to care about the flag) only?
>
> The new tracking cannot be bound to FOLL_LONGTERM. Anything that gets a page
> reference and then touches page data (e.g. direct IO) needs the new kind of
> tracking so that the filesystem knows someone is messing with the page data.
> So what John is trying to address is a different (although related) problem
> to someone pinning a page for a long time.

OK, I see. Thanks for the clarification.

> In principle, I'm not strongly opposed to a new FOLL flag to determine
> whether a pin or an ordinary page reference will be acquired, at least as an
> internal implementation detail inside mm/gup.c. But I would really like to
> discourage new GUP users from taking just a page reference, as the most clueless
> users (drivers) usually need a pin in the sense John implements. So in
> terms of API I'd strongly prefer to deprecate GUP as an API, provide
> vaddr_pin_pages() for drivers to get their buffer pages pinned and then for
> those few users who really know what they are doing (and who are not
> interested in page contents) we can have APIs like follow_page() to get a
> page reference from a virtual address.

Yes, going with a dedicated API sounds much better to me. Whether a
dedicated FOLL flag is used internally is not that important. I am also
for making the underlying gup really internal to the core kernel.

Thanks!
-- 
Michal Hocko
SUSE Labs
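
The split discussed in this thread (an ordinary page reference for callers that only care about the page itself, versus a tracked pin for callers that touch page contents) can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: `struct toy_page` and every `toy_*` function are hypothetical names invented for illustration, they are not kernel APIs, and the sketch does not reflect how mm/gup.c, put_user_page() or the proposed vaddr_pin_pages() are actually implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
    int refcount;  /* all references, pinned or not */
    int pincount;  /* subset of references taken by users touching page data */
};

/* Ordinary reference: for callers not interested in page contents. */
static inline void toy_get_page(struct toy_page *p)
{
    p->refcount++;
}

static inline void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/* Pin: a reference that is additionally recorded as a pin. */
static inline void toy_pin_page(struct toy_page *p)
{
    p->refcount++;
    p->pincount++;
}

/* A pin must be dropped through the matching unpin call, never toy_put_page(). */
static inline void toy_unpin_page(struct toy_page *p)
{
    assert(p->pincount > 0 && p->refcount > 0);
    p->pincount--;
    p->refcount--;
}

/* What a filesystem could ask before acting on the page data. */
static inline bool toy_page_maybe_pinned(const struct toy_page *p)
{
    return p->pincount > 0;
}
```

In this model a caller that takes only toy_get_page() is invisible to toy_page_maybe_pinned(), which is the reason given above for steering drivers toward a dedicated pinning API rather than a bare reference.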