Date: Wed, 10 Oct 2018 10:59:36 +0200
From: Jan Kara
To: John Hubbard
Cc: Andrew Morton, john.hubbard@gmail.com, Matthew Wilcox, Michal Hocko,
	Christopher Lameter, Jason Gunthorpe, Dan Williams, Jan Kara,
	linux-mm@kvack.org, LKML, linux-rdma, linux-fsdevel@vger.kernel.org,
	Al Viro, Jerome Glisse, Christoph Hellwig, Ralph Campbell
Subject: Re: [PATCH v4 2/3] mm: introduce put_user_page*(), placeholder versions
Message-ID: <20181010085936.GC11507@quack2.suse.cz>
In-Reply-To: <5198a797-fa34-c859-ff9d-568834a85a83@nvidia.com>
References: <20181008211623.30796-1-jhubbard@nvidia.com>
	<20181008211623.30796-3-jhubbard@nvidia.com>
	<20181008171442.d3b3a1ea07d56c26d813a11e@linux-foundation.org>
	<5198a797-fa34-c859-ff9d-568834a85a83@nvidia.com>

On Tue 09-10-18 17:42:09, John Hubbard wrote:
> On 10/8/18 5:14 PM, Andrew Morton wrote:
> > Also, maintainability. What happens if someone now uses put_page() by
> > mistake? Kernel fails in some mysterious fashion? How can we prevent
> > this from occurring as code evolves? Is there a cheap way of detecting
> > this bug at runtime?
>
> It might be possible to do a few run-time checks, such as "does the page
> that came back to put_user_page() have the correct flags?", but it's
> harder (without having a dedicated page flag) to detect the other
> direction: "did someone pass a get_user_pages page to put_page()?"
>
> As Jan said in his reply, converting get_user_pages (and put_user_page)
> to work with a new data type that wraps struct page would solve it, but
> that's an awfully large change. Still... given how much of a mess this
> can turn into if it's wrong, I wonder if it's worth it -- maybe?

I'm certainly not opposed to looking into it. But after looking into this
for a while, it is not clear to me how to convert e.g. fs/direct-io.c or
fs/iomap.c. They pass the reference from gup() via
bio->bi_io_vec[]->bv_page and then release it after IO completion.
Propagating the new type to ->bv_page is no good, as the lower layers do
not really care how the page is pinned in memory. But we do need to
somehow pass the information to the IO completion functions in a robust
manner.

Hmm, what about the following:

1) Make gup() return a new type - struct user_page *? In practice that
would be just a struct page pointer with bit 0 set, so that people are
forced to use the proper helpers and cannot simply cast between the types
(and the setting of bit 0 and masking it back would be hidden behind
CONFIG_DEBUG_USER_PAGE_REFERENCES for performance reasons). Also, the
transition would have to be gradual, so we'd have to name the function
differently and use it from converted code.

2) Provide a helper bio_add_user_page() that takes a user_page, converts
it to a struct page, adds it to the bio, and flags the bio as having
pages with user references. That code would also make sure the bio is
consistent in having only user-referenced pages in that case. IO
completion functions (like bio_check_pages_dirty(), bio_release_pages(),
etc.) will then check the flag and use the appropriate release function.

3) I have noticed fs/direct-io.c may submit the zero page for IO when it
needs to clear stuff, so we'll probably need a helper function that
acquires a 'user pin' reference given a page pointer, so that that code
can be kept reasonably simple and pass user_page references all around.

This way we could maintain reasonable confidence that the refcounts don't
get mixed up. Thoughts?

								Honza
--
Jan Kara
SUSE Labs, CR
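For illustration, here is a minimal sketch of what the bit-0 tagging in
point 1) might look like. struct user_page, the two conversion helpers,
and CONFIG_DEBUG_USER_PAGE_REFERENCES are names taken from the proposal
above; none of them exists in the kernel.

#include <linux/mm.h>
#include <linux/mmdebug.h>

struct user_page;	/* opaque, never dereferenced directly */

#ifdef CONFIG_DEBUG_USER_PAGE_REFERENCES
/* Debug builds: set bit 0 so a raw cast or a stray put_page() trips up. */
static inline struct user_page *page_to_user_page(struct page *page)
{
	return (struct user_page *)((unsigned long)page | 1UL);
}

static inline struct page *user_page_to_page(struct user_page *upage)
{
	VM_BUG_ON(!((unsigned long)upage & 1UL));	/* must be tagged */
	return (struct page *)((unsigned long)upage & ~1UL);
}
#else
/* Production builds: both conversions compile down to nothing. */
static inline struct user_page *page_to_user_page(struct page *page)
{
	return (struct user_page *)page;
}

static inline struct page *user_page_to_page(struct user_page *upage)
{
	return (struct page *)upage;
}
#endif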
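Point 2) could then be a thin wrapper over the existing block-layer API,
building on the helpers sketched above. bio_add_page(), bio_flagged(),
bio_set_flag(), and bio_for_each_segment_all() are real APIs as of this
thread; the BIO_USER_PAGES flag and both functions below are hypothetical,
and put_user_page() is the placeholder this series introduces.

#include <linux/bio.h>

#define BIO_USER_PAGES	13	/* hypothetical new BIO_* flag bit */

static int bio_add_user_page(struct bio *bio, struct user_page *upage,
			     unsigned int len, unsigned int offset)
{
	/* Keep the bio consistent: no mixing pinned and ordinary pages. */
	if (bio->bi_vcnt && !bio_flagged(bio, BIO_USER_PAGES))
		return 0;
	bio_set_flag(bio, BIO_USER_PAGES);
	return bio_add_page(bio, user_page_to_page(upage), len, offset);
}

/* IO completion picks the matching release function based on the flag. */
static void bio_release_pages_checked(struct bio *bio)
{
	struct bio_vec *bvec;
	int i;

	bio_for_each_segment_all(bvec, bio, i) {
		if (bio_flagged(bio, BIO_USER_PAGES))
			put_user_page(bvec->bv_page);
		else
			put_page(bvec->bv_page);
	}
}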
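And for point 3), a helper to take a 'user pin' on a page the kernel
already holds (such as the zero page fs/direct-io.c submits); again a
hypothetical sketch, with the name invented here.

/*
 * Wrap an in-kernel page (e.g. ZERO_PAGE(0)) in a user_page reference
 * so direct-io can pass user_page references all the way through.
 */
static inline struct user_page *get_user_page_ref(struct page *page)
{
	get_page(page);
	return page_to_user_page(page);
}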