Message-ID: <2a3f70b011b56de2289e2f304b3d2d617c5658fb.camel@linux.intel.com>
Subject: Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning
From: Alexander Duyck
To: Dan Williams
Cc: Paolo Bonzini, Zhang Yi, Barret Rhoden, KVM list, linux-nvdimm,
 Linux Kernel Mailing List, Linux MM, Dave Jiang, "Zhang, Yu C",
 Pankaj Gupta, David Hildenbrand, Jan Kara, Christoph Hellwig,
 rkrcmar@redhat.com, Jérôme Glisse
Date: Mon, 03 Dec 2018 12:53:42 -0800
In-Reply-To:
References: <154386493754.27193.1300965403157243427.stgit@ahduyck-desk1.amr.corp.intel.com>
 <154386513120.27193.7977541941078967487.stgit@ahduyck-desk1.amr.corp.intel.com>
 <97943d2ed62e6887f4ba51b985ef4fb5478bc586.camel@linux.intel.com>

On Mon, 2018-12-03 at 12:31 -0800, Dan Williams wrote:
> On Mon, Dec 3, 2018 at 12:21 PM Alexander Duyck wrote:
> >
> > On Mon, 2018-12-03 at 11:47 -0800, Dan Williams wrote:
> > > On Mon, Dec 3, 2018 at 11:25 AM Alexander Duyck wrote:
> > > >
> > > > Add a means of exposing if a pagemap supports refcount pinning. I am doing
> > > > this to expose if a given pagemap has backing struct pages that will allow
> > > > for the reference count of the page to be incremented to lock the page
> > > > into place.
> > > >
> > > > The KVM code already has several spots where it was trying to use a
> > > > pfn_valid check combined with a PageReserved check to determine if it could
> > > > take a reference on the page. I am adding this check so in the case of the
> > > > page having the reserved flag checked we can check the pagemap for the page
> > > > to determine if we might fall into the special DAX case.
> > > >
> > > > Signed-off-by: Alexander Duyck
> > > > ---
> > > >  drivers/nvdimm/pfn_devs.c |  2 ++
> > > >  include/linux/memremap.h  |  5 ++++-
> > > >  include/linux/mm.h        | 11 +++++++++++
> > > >  3 files changed, 17 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> > > > index 6f22272e8d80..7a4a85bcf7f4 100644
> > > > --- a/drivers/nvdimm/pfn_devs.c
> > > > +++ b/drivers/nvdimm/pfn_devs.c
> > > > @@ -640,6 +640,8 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
> > > >         } else
> > > >                 return -ENXIO;
> > > >
> > > > +       pgmap->support_refcount_pinning = true;
> > > > +
> > >
> > > There should be no dev_pagemap instance where this isn't
> > > true, so I'm missing why this is needed?
> >
> > I thought in the case of HMM there were instances where you couldn't
> > pin the page, aren't there? Specifically I am thinking of the definition
> > of MEMORY_DEVICE_PUBLIC:
> >   Device memory that is cache coherent from device and CPU point of
> >   view. This is use on platform that have an advance system bus (like
> >   CAPI or CCIX). A driver can hotplug the device memory using
> >   ZONE_DEVICE and with that memory type. Any page of a process can be
> >   migrated to such memory. However no one should be allow to pin such
> >   memory so that it can always be evicted.
> >
> > It sounds like MEMORY_DEVICE_PUBLIC and MMIO would want to fall into
> > the same category here in order to allow a hot-plug event to remove the
> > device and take the memory with it, or is my understanding on this not
> > correct?
>
> I don't understand how HMM expects to enforce no pinning, but in any
> event it should always be the expectation an elevated reference count
> on a page prevents that page from disappearing. Anything else is
> broken.

I don't think that is true for device MMIO though.
In the case of MMIO the memory region is backed by a device. If that
device is removed by a hot-plug event or fails in some way, the backing
goes away and reads return an all 1's response. Holding a reference to
the page doesn't guarantee that the backing device cannot go away.

I believe that is the origin of the PageReserved check in KVM, which it
uses to decide whether to call the get_page/put_page functions. I
believe this is also why MEMORY_DEVICE_PUBLIC specifically calls out
that you should not allow pinning such memory.

- Alex
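
For illustration only, the decision discussed in this thread (pfn_valid, then PageReserved, then a per-pagemap opt-in) can be modelled in plain user-space C. This is a rough sketch, not the patch and not kernel code: the model_* types and helpers are hypothetical stand-ins, and only the field name support_refcount_pinning is taken from the quoted diff. The boolean fields merely stand in for what pfn_valid() and PageReserved() would report.

```c
/* User-space model only; every model_* name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_pagemap {
	/* Field name taken from the quoted patch; set by the pagemap owner. */
	bool support_refcount_pinning;
};

struct model_page {
	bool valid;			/* stands in for pfn_valid() */
	bool reserved;			/* stands in for PageReserved() */
	struct model_pagemap *pgmap;	/* NULL when there is no pagemap */
	int refcount;
};

/* Would an elevated refcount be expected to hold this page in place? */
static bool model_can_pin(const struct model_page *page)
{
	assert(page != NULL);

	if (!page->valid)
		return false;	/* no struct page behind the pfn */
	if (!page->reserved)
		return true;	/* ordinary page, reference is fine */
	/* Reserved page: pin only if its pagemap opts in (the DAX case). */
	return page->pgmap != NULL && page->pgmap->support_refcount_pinning;
}

/* Take a reference only when pinning is supported; report the outcome. */
static bool model_try_get_page(struct model_page *page)
{
	if (!model_can_pin(page))
		return false;
	page->refcount++;
	return true;
}
```

In this model a reserved page whose pagemap leaves the flag clear is never referenced, which is the behaviour being argued for with MMIO and MEMORY_DEVICE_PUBLIC above. Whether every dev_pagemap should set the flag is exactly the open question in this thread.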