Date: Fri, 9 Aug 2019 15:58:13 +0200
From: Jan Kara
To: Michal Hocko
Cc: John Hubbard, Vlastimil Babka, Andrew Morton, Christoph Hellwig,
    Ira Weiny, Jan Kara, Jason Gunthorpe, Jerome Glisse, LKML,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Dan Williams,
    Daniel Black, Matthew Wilcox, Mike Kravetz
Subject: Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()
Message-ID: <20190809135813.GF17568@quack2.suse.cz>
In-Reply-To: <20190809082307.GL18351@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 09-08-19 10:23:07, Michal Hocko wrote:
> On Fri 09-08-19 10:12:48, Vlastimil Babka wrote:
> > On 8/9/19 12:59 AM, John Hubbard wrote:
> > >>> That's true. However, I'm not sure munlocking is where the
> > >>> put_user_page() machinery is intended to be used anyway? These are
> > >>> short-term pins for struct page manipulation, not e.g. dirtying of page
> > >>> contents. Reading commit fc1d8e7cca2d I don't think this case falls
> > >>> within the reasoning there. Perhaps not all GUP users should be
> > >>> converted to the planned separate GUP tracking, and instead we should
> > >>> have a GUP/follow_page_mask() variant that keeps using get_page/put_page?
> > >>>
> > >>
> > >> Interesting. So far, the approach has been to get all the gup callers to
> > >> release via put_user_page(), but if we add in Jan's and Ira's vaddr_pin_pages()
> > >> wrapper, then maybe we could leave some sites unconverted.
> > >>
> > >> However, in order to do so, we would have to change things so that we have
> > >> one set of APIs (gup) that do *not* increment a pin count, and another set
> > >> (vaddr_pin_pages) that do.
> > >>
> > >> Is that where we want to go...?
> > >>
> >
> > We already have a FOLL_LONGTERM flag, isn't that somehow related? And if
> > it's not exactly the same thing, perhaps a new gup flag to distinguish
> > which kind of pinning to use?
>
> Agreed. This is a shiny example how forcing all existing gup users into
> the new scheme is suboptimal at best. Not to mention the overall
> fragility mentioned elsewhere. I dislike the conversion even more now.
>
> Sorry if this was already discussed, but why is the new pinning not
> bound to FOLL_LONGTERM (ideally hidden by an interface so that users
> do not have to care about the flag) only?

The new tracking cannot be bound to FOLL_LONGTERM. Anything that gets a page
reference and then touches page data (e.g. direct IO) needs the new kind of
tracking so that the filesystem knows someone is messing with the page data.
So what John is trying to address is a different (although related) problem
to someone pinning a page for a long time.

In principle, I'm not strongly opposed to a new FOLL flag to determine
whether a pin or an ordinary page reference will be acquired, at least as an
internal implementation detail inside mm/gup.c. But I would really like to
discourage new GUP users taking just a page reference, as the most clueless
users (drivers) usually need a pin in the sense John implements. So in terms
of API I'd strongly prefer to deprecate GUP as an API, provide
vaddr_pin_pages() for drivers to get their buffer pages pinned, and then for
those few users who really know what they are doing (and who are not
interested in page contents) we can have APIs like follow_page() to get a
page reference from a virtual address.

								Honza
-- 
Jan Kara
SUSE Labs, CR
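
To make the distinction between an ordinary page reference and a pin
concrete, here is a toy user-space sketch in C. It is not kernel code:
struct toy_page and every toy_* helper are hypothetical names invented for
illustration, and keeping two separate counters is only the simplest way to
model the idea, not a statement about how mm/gup.c does or would implement
it. The point it tries to show is that a reference only keeps the page
alive, while a pin additionally tells the filesystem that somebody may be
touching the page data.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
	int refcount;	/* ordinary references: keep the page alive */
	int pincount;	/* pins: the holder may also touch page data */
};

/* Hypothetical "reference only" path, in the spirit of follow_page(). */
static void toy_get_ref(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_ref(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Hypothetical "pin" path, in the spirit of vaddr_pin_pages(): a pin
 * holds a reference and is additionally visible as a pin.
 */
static void toy_pin(struct toy_page *p)
{
	p->refcount++;
	p->pincount++;
}

static void toy_unpin(struct toy_page *p)
{
	assert(p->pincount > 0 && p->refcount > 0);
	p->pincount--;
	p->refcount--;
}

/* The question a filesystem would like to ask, e.g. before writeback. */
static bool toy_data_may_change(const struct toy_page *p)
{
	return p->pincount > 0;
}
```

In this model a caller that only manipulates the page itself would use
toy_get_ref()/toy_put_ref() and stay invisible to toy_data_may_change(),
while a driver doing IO into its buffer would use toy_pin()/toy_unpin(),
whether or not it holds the page for a long time.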