Subject: Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()
To: Dan Williams, Jason Gunthorpe
CC: Matthew Wilcox, Michal Hocko, Christopher Lameter, Jan Kara, Linux MM, LKML, linux-rdma, Christoph Hellwig
References: <20180617012510.20139-1-jhubbard@nvidia.com> <20180617012510.20139-3-jhubbard@nvidia.com> <20180617200432.krw36wrcwidb25cj@ziepe.ca>
From: John Hubbard
Message-ID: <311eba48-60f1-b6cc-d001-5cc3ed4d76a9@nvidia.com>
Date: Sun, 17 Jun 2018 13:28:18 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/17/2018 01:10 PM, Dan Williams wrote:
> On Sun, Jun 17, 2018 at 1:04 PM, Jason Gunthorpe wrote:
>> On Sun, Jun 17, 2018 at 12:53:04PM -0700, Dan Williams wrote:
>>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>>> index 6db729dc4c50..37576f0a4645 100644
>>>> +++ b/mm/rmap.c
>>>> @@ -1360,6 +1360,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>>>>  				flags & TTU_SPLIT_FREEZE, page);
>>>>  	}
>>>>
>>>> +	if (PageDmaPinned(page))
>>>> +		return false;
>>>>  	/*
>>>>  	 * We have to assume the worse case ie pmd for invalidation. Note that
>>>>  	 * the page can not be free in this function as call of try_to_unmap()
>>>
>>> We have a similar problem with DAX and the conclusion we came to is
>>> that it is not acceptable for userspace to arbitrarily block kernel
>>> actions. The conclusion there was: 'wait' if the DMA is transient, and
>>> 'revoke' if the DMA is long lived, or otherwise 'block' long-lived DMA
>>> if a revocation mechanism is not available.
>>
>> This might be the right answer for certain things, but it shouldn't be
>> the immediate reaction to everything. There are many user APIs that
>> block kernel actions and hold kernel resources.
>>
>> IMHO, there should be an identifiable objection, eg is blocking going
>> to create a DOS, dead-lock, insecurity, etc?
>
> I believe kernel behavior regression is a primary concern as now
> fallocate() and truncate() can randomly fail where they didn't before.
>

Yes.
However, my thinking was: get_user_pages() can become a way to indicate that
these pages are going to be treated specially. In particular, the caller
does not really want or need to support certain file operations while the
page is flagged this way.

If necessary, we could add a new API call. But either way, I think we could
reasonably document that "if you pin these pages (either via get_user_pages(),
or some new, similar-looking API call), you can DMA to/from them, and safely
mark them as dirty when you're done, and the right things will happen.
And in the interim, you can expect that the following file system API calls
will not behave predictably: fallocate, truncate, ..."

Maybe in the near future, we can remove that last qualification, if we find
a more comprehensive design for this (as opposed to the cheap fix I'm
proposing here).
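
To make the contract described above concrete, here is a minimal userspace
model of it. This is a sketch, not kernel code: struct toy_page and every
toy_* function are hypothetical stand-ins invented for illustration, and the
-EBUSY return from the truncate-like path is an assumption (the thread only
says that fallocate() and truncate() "can randomly fail"). The only part
taken from the quoted patch is the early "return false" when the page is
flagged as pinned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel structure. */
struct toy_page {
	bool dma_pinned;	/* models the proposed PG_dma_pinned flag */
	bool dirty;
	bool mapped;
};

/* Models get_user_pages*() flagging every page it hands back. */
static void toy_pin_pages(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i].dma_pinned = true;
}

/* Models the caller finishing DMA: mark each page dirty, then drop the pin. */
static void toy_unpin_pages_dirty(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].dirty = true;
		pages[i].dma_pinned = false;
	}
}

/* Models the early return that the quoted hunk adds to try_to_unmap_one(). */
static bool toy_try_to_unmap_one(struct toy_page *page)
{
	if (page->dma_pinned)
		return false;
	page->mapped = false;
	return true;
}

/*
 * Models a truncate-like operation that needs every page unmapped.
 * Returns 0 on success, or -EBUSY (an assumed error code) as soon as
 * it meets a page that refuses to be unmapped.
 */
static int toy_truncate(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!toy_try_to_unmap_one(&pages[i]))
			return -EBUSY;
	return 0;
}

/* Walks through pin -> failed truncate -> unpin -> successful truncate. */
static void toy_demo(void)
{
	struct toy_page pages[2] = { { .mapped = true }, { .mapped = true } };

	toy_pin_pages(pages, 2);
	assert(toy_truncate(pages, 2) == -EBUSY);	/* blocked while pinned */

	toy_unpin_pages_dirty(pages, 2);
	assert(pages[0].dirty && pages[1].dirty);
	assert(toy_truncate(pages, 2) == 0);		/* works once unpinned */
}
```

The model shows both sides of the discussion in one place: the pinning
caller gets a window in which DMA and dirtying are safe, and during that
same window an unrelated truncate-like caller sees a failure it would not
have seen before.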