Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
From: John Hubbard
To: Jerome Glisse
CC: Andrew Morton, Al Viro, Alex Williamson, Benjamin Herrenschmidt,
    Björn Töpel, Christoph Hellwig, Dan Williams, Daniel Vetter,
    Dave Chinner, David Airlie, "David S. Miller", Ira Weiny, Jan Kara,
    Jason Gunthorpe, Jens Axboe, Jonathan Corbet, Magnus Karlsson,
    Mauro Carvalho Chehab, Michael Ellerman, Michal Hocko, Mike Kravetz,
    Paul Mackerras, Shuah Khan, Vlastimil Babka, LKML
Date: Mon, 4 Nov 2019 12:09:05 -0800
Message-ID: <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com>
In-Reply-To: <20191104195248.GA7731@redhat.com>
References: <20191103211813.213227-1-jhubbard@nvidia.com>
    <20191103211813.213227-6-jhubbard@nvidia.com>
    <20191104173325.GD5134@redhat.com>
    <20191104191811.GI5134@redhat.com>
    <20191104195248.GA7731@redhat.com>

Jason, a question for you at the bottom.

On 11/4/19 11:52 AM, Jerome Glisse wrote:
...
>> CASE 3: ODP
>> -----------
>> RDMA hardware with page faulting support.
>> Here, a well-written driver doesn't
>
> CASE3: Hardware with page fault support
> ---------------------------------------
>
> Here, a well-written ....
>

Ah, OK. So just drop the first sentence, yes.

...
>>>>>> + */
>>>>>> + gup_flags |= FOLL_REMOTE | FOLL_PIN;
>>>>>
>>>>> Wouldn't it be better to not add pin_longterm_pages_remote() until
>>>>> it can be properly implemented ?
>>>>>
>>>>
>>>> Well, the problem is that I need each call site that requires FOLL_PIN
>>>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, because
>>>> there is a hard, bright rule, which is: if and only if a caller sets
>>>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() must
>>>> be called.
>>>>
>>>> So this leaves me with only two reasonable choices:
>>>>
>>>> a) Convert the call site as above: pin_longterm_pages_remote(), which sets
>>>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation exactly
>>>> as it has been so far. When the FOLL_LONGTERM situation is fixed, the call
>>>> site *might* not need any changes to adopt the working gup.c code.
>>>>
>>>> b) Convert the call site to pin_user_pages_remote(), which also sets
>>>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as before.
>>>> There would also be a comment at the call site, to the effect of, "this
>>>> is the wrong call to make: it really requires FOLL_LONGTERM behavior".
>>>>
>>>> When the FOLL_LONGTERM situation is fixed, the call site will need to be
>>>> changed to pin_longterm_pages_remote().
>>>>
>>>> So you can probably see why I picked (a).
>>>
>>> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should
>>> never have the need for pin_longterm_pages_remote(). My fear is that
>>> longterm has implication and it would be better to not drop this implication
>>> by adding a wrapper that does not do what the name says.
>>>
>>> So do not introduce pin_longterm_pages_remote() until its first user
>>> happens.
>>> This is option c)
>>>
>>
>> Almost forgot, though: there is already another user: Infiniband:
>>
>> drivers/infiniband/core/umem_odp.c:646: npages = pin_longterm_pages_remote(owning_process, owning_mm,
>
> odp do not need that, i thought the HMM convertion was already upstream
> but seems not, in any case odp do not need the longterm case it only
> so best is to revert that user to gup_fast or something until it get
> converted to HMM.
>

Note for Jason: the (a) or (b) items are talking about the vfio case, which is
one of the two call sites that now use pin_longterm_pages_remote(), and the
other one is infiniband:

drivers/infiniband/core/umem_odp.c:646: npages = pin_longterm_pages_remote(owning_process, owning_mm,
drivers/vfio/vfio_iommu_type1.c:353: ret = pin_longterm_pages_remote(NULL, mm, vaddr, 1,

Jerome, Jason: I really don't want to revert the put_page() to put_user_page()
conversions that are already throughout the IB driver--pointless churn, right?
I'd rather either delete them in Jason's tree, or go with what I have here
while waiting for the deletion.

Maybe we should just settle on (a) or (b), so that the IB driver ends up with
the wrapper functions? In fact, if it's getting deleted, then I'd prefer leaving
it at (a), since that's simple...

Jason should weigh in on how he wants this to go, with respect to branching
and merging, since it sounds like that will conflict with the hmm branch
(ha, I'm overdue in reviewing his mmu notifier series, that's what I get for
being late).

thanks,

John Hubbard
NVIDIA
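
For reference, the FOLL_PIN rule from the quoted discussion (tracking happens if and only if FOLL_PIN is set, and such references must then be released with put_user_page()) can be sketched as a user-space toy model in C. Nothing below is kernel code: the sketch_ prefix, the flag values, the struct layout, and the separate pin counter are all assumptions made for illustration, and the real gup.c implementation may track pins quite differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; NOT the kernel's definitions. */
#define SKETCH_FOLL_REMOTE 0x1u
#define SKETCH_FOLL_PIN    0x2u

/* Toy stand-in for struct page; the separate pin counter is an assumption. */
struct sketch_page {
    int refcount;   /* all references, pinned or not */
    int pincount;   /* references taken with SKETCH_FOLL_PIN */
};

/* Core acquire path: pin tracking happens if and only if FOLL_PIN is set. */
static void sketch_gup(struct sketch_page *page, unsigned int gup_flags)
{
    page->refcount++;
    if (gup_flags & SKETCH_FOLL_PIN)
        page->pincount++;
}

/* Wrapper in the spirit of option (a): the wrapper, not the caller, sets FOLL_PIN. */
static void sketch_pin_longterm_pages_remote(struct sketch_page *page,
                                             unsigned int gup_flags)
{
    gup_flags |= SKETCH_FOLL_REMOTE | SKETCH_FOLL_PIN;
    sketch_gup(page, gup_flags);
}

/* Release for references taken without FOLL_PIN. */
static void sketch_put_page(struct sketch_page *page)
{
    /* Must not consume a reference that belongs to a pin. */
    assert(page->refcount > page->pincount);
    page->refcount--;
}

/* Release for references taken with FOLL_PIN. */
static void sketch_put_user_page(struct sketch_page *page)
{
    assert(page->pincount > 0);
    page->pincount--;
    page->refcount--;
}

/* True when every acquire has been matched by the right kind of release. */
static bool sketch_page_is_idle(const struct sketch_page *page)
{
    return page->refcount == 0 && page->pincount == 0;
}
```

In this model a call site converted to the wrapper must pair it with sketch_put_user_page(), while a plain sketch_gup() without the pin flag pairs with sketch_put_page(); mixing the two trips an assertion, which is the "bright rule" in miniature.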