From: "Weiny, Ira"
To: Michal Hocko, Jan Kara
CC: John Hubbard, Vlastimil Babka, Andrew Morton, Christoph Hellwig,
    Jason Gunthorpe, Jerome Glisse, LKML, "linux-mm@kvack.org",
    "linux-fsdevel@vger.kernel.org", "Williams, Dan J", Daniel Black,
    "Matthew Wilcox", Mike Kravetz
Subject: RE: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()
Date: Fri, 9 Aug 2019 18:14:56 +0000
Message-ID: <2807E5FD2F6FDA4886F6618EAC48510E79E7F3E7@CRSMSX101.amr.corp.intel.com>
In-Reply-To: <20190809175210.GR18351@dhcp22.suse.cz>
References: <20190805222019.28592-2-jhubbard@nvidia.com>
    <20190807110147.GT11812@dhcp22.suse.cz>
    <01b5ed91-a8f7-6b36-a068-31870c05aad6@nvidia.com>
    <20190808062155.GF11812@dhcp22.suse.cz>
    <875dca95-b037-d0c7-38bc-4b4c4deea2c7@suse.cz>
    <306128f9-8cc6-761b-9b05-578edf6cce56@nvidia.com>
    <420a5039-a79c-3872-38ea-807cedca3b8a@suse.cz>
    <20190809082307.GL18351@dhcp22.suse.cz>
    <20190809135813.GF17568@quack2.suse.cz>
    <20190809175210.GR18351@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Fri 09-08-19 15:58:13, Jan Kara wrote:
> > On Fri 09-08-19 10:23:07, Michal Hocko wrote:
> > > On Fri 09-08-19 10:12:48, Vlastimil Babka wrote:
> > > > On 8/9/19 12:59 AM, John Hubbard wrote:
> > > > >>> That's true. However, I'm not sure munlocking is where the
> > > > >>> put_user_page() machinery is intended to be used anyway? These
> > > > >>> are short-term pins for struct page manipulation, not e.g.
> > > > >>> dirtying of page contents. Reading commit fc1d8e7cca2d I don't
> > > > >>> think this case falls within the reasoning there. Perhaps not
> > > > >>> all GUP users should be converted to the planned separate GUP
> > > > >>> tracking, and instead we should have a GUP/follow_page_mask()
> > > > >>> variant that keeps using get_page/put_page?
> > > > >>>
> > > > >>
> > > > >> Interesting. So far, the approach has been to get all the gup
> > > > >> callers to release via put_user_page(), but if we add in Jan's
> > > > >> and Ira's vaddr_pin_pages() wrapper, then maybe we could leave
> > > > >> some sites unconverted.
> > > > >>
> > > > >> However, in order to do so, we would have to change things so
> > > > >> that we have one set of APIs (gup) that do *not* increment a
> > > > >> pin count, and another set (vaddr_pin_pages) that do.
> > > > >>
> > > > >> Is that where we want to go...?
> > > > >>
> > > >
> > > > We already have a FOLL_LONGTERM flag, isn't that somehow related?
> > > > And if it's not exactly the same thing, perhaps a new gup flag to
> > > > distinguish which kind of pinning to use?
> > >
> > > Agreed. This is a shining example of how forcing all existing gup
> > > users into the new scheme is suboptimal at best. Not to mention the
> > > overall fragility mentioned elsewhere. I dislike the conversion even
> > > more now.
> > >
> > > Sorry if this was already discussed, but why is the new pinning
> > > not bound to FOLL_LONGTERM (ideally hidden by an interface so
> > > that users do not have to care about the flag) only?
> >
> > The new tracking cannot be bound to FOLL_LONGTERM. Anything that gets
> > a page reference and then touches page data (e.g. direct IO) needs the
> > new kind of tracking so that the filesystem knows someone is messing
> > with the page data.
> > So what John is trying to address is a different (although related)
> > problem to someone pinning a page for a long time.
>
> OK, I see. Thanks for the clarification.

Not to beat a dead horse, but FOLL_LONGTERM also has implications now for CMA pages, which may or may not (I'm not an expert on those pages) need special tracking.

>
> > In principle, I'm not strongly opposed to a new FOLL flag to determine
> > whether a pin or an ordinary page reference will be acquired, at least
> > as an internal implementation detail inside mm/gup.c. But I would
> > really like to discourage new GUP users taking just a page reference,
> > as the most clueless users (drivers) usually need a pin in the sense
> > John implements. So in terms of API I'd strongly prefer to deprecate
> > GUP as an API, provide vaddr_pin_pages() for drivers to get their
> > buffer pages pinned, and then for those few users who really know what
> > they are doing (and who are not interested in page contents) we can
> > have APIs like follow_page() to get a page reference from a virtual
> > address.
>
> Yes, going with a dedicated API sounds much better to me. Whether a
> dedicated FOLL flag is used internally is not that important. I am also
> for making the underlying gup really internal to the core kernel.

+1

I think GUP is too confusing. I've been working with the details for many months now and it continues to confuse me. :-(

My patches should be posted soon (based on mmotm) and I'll have my flame suit on so we can debate the interface.

Ira