Subject: Re: [HMM v13 04/18] mm/ZONE_DEVICE/free-page: callback when page is freed
To: Jerome Glisse
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Hubbard, Dan Williams, Ross Zwisler
From: Anshuman Khandual
Date: Tue, 22 Nov 2016 10:32:48 +0530
Message-Id: <5833D178.9080300@linux.vnet.ibm.com>
In-Reply-To: <20161121123451.GD2392@redhat.com>
References: <1479493107-982-1-git-send-email-jglisse@redhat.com> <1479493107-982-5-git-send-email-jglisse@redhat.com> <5832AF9A.8020808@linux.vnet.ibm.com> <20161121123451.GD2392@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/21/2016 06:04 PM, Jerome Glisse wrote:
> On Mon, Nov 21, 2016 at 01:56:02PM +0530, Anshuman Khandual wrote:
>> On 11/18/2016 11:48 PM, Jérôme Glisse wrote:
>>> When a ZONE_DEVICE page refcount reach 1 it means it is free and nobody
>>> is holding a reference on it (only device to which the memory belong do).
>>> Add a callback and call it when that happen so device driver can implement
>>> their own free page management.
>>>
>>> Signed-off-by: Jérôme Glisse
>>> Cc: Dan Williams
>>> Cc: Ross Zwisler
>>> ---
>>>  include/linux/memremap.h | 4 ++++
>>>  kernel/memremap.c        | 8 ++++++++
>>>  2 files changed, 12 insertions(+)
>>>
>>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>>> index fe61dca..469c88d 100644
>>> --- a/include/linux/memremap.h
>>> +++ b/include/linux/memremap.h
>>> @@ -37,17 +37,21 @@ static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
>>>
>>>  /**
>>>   * struct dev_pagemap - metadata for ZONE_DEVICE mappings
>>> + * @free_devpage: free page callback when page refcount reach 1
>>>   * @altmap: pre-allocated/reserved memory for vmemmap allocations
>>>   * @res: physical address range covered by @ref
>>>   * @ref: reference count that pins the devm_memremap_pages() mapping
>>>   * @dev: host device of the mapping for debug
>>> + * @data: privata data pointer for free_devpage
>>>   * @flags: memory flags (look for MEMORY_FLAGS_NONE in memory_hotplug.h)
>>>   */
>>>  struct dev_pagemap {
>>> +        void (*free_devpage)(struct page *page, void *data);
>>>          struct vmem_altmap *altmap;
>>>          const struct resource *res;
>>>          struct percpu_ref *ref;
>>>          struct device *dev;
>>> +        void *data;
>>>          int flags;
>>>  };
>>>
>>> diff --git a/kernel/memremap.c b/kernel/memremap.c
>>> index 438a73aa2..3d28048 100644
>>> --- a/kernel/memremap.c
>>> +++ b/kernel/memremap.c
>>> @@ -190,6 +190,12 @@ EXPORT_SYMBOL(get_zone_device_page);
>>>
>>>  void put_zone_device_page(struct page *page)
>>>  {
>>> +        /*
>>> +         * If refcount is 1 then page is freed and refcount is stable as nobody
>>> +         * holds a reference on the page.
>>> +         */
>>> +        if (page->pgmap->free_devpage && page_count(page) == 1)
>>> +                page->pgmap->free_devpage(page, page->pgmap->data);
>>>          put_dev_pagemap(page->pgmap);
>>>  }
>>>  EXPORT_SYMBOL(put_zone_device_page);
>>> @@ -326,6 +332,8 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
>>>          pgmap->ref = ref;
>>>          pgmap->res = &page_map->res;
>>>          pgmap->flags = flags | MEMORY_DEVICE;
>>> +        pgmap->free_devpage = NULL;
>>> +        pgmap->data = NULL;
>>
>> When is the driver expected to load up pgmap->free_devpage ? I thought
>> this function is one of the right places. Though as all the pages in
>> the same hotplug operation point to the same dev_pagemap structure this
>> loading can be done at later point of time as well.
>>
>
> I wanted to avoid adding more argument to devm_memremap_pages() as it already
> has a long list. Hence why i let the caller set those afterward.

IMHO we should still pass it through this function's arguments, so that by
the time the function returns the device memory is properly set up through
ZONE_DEVICE with all bells and whistles enabled.
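
To make the two options concrete, here is a rough userspace sketch, not
kernel code. Everything prefixed with mock_ and the count_free() callback
are hypothetical names invented for illustration; only the
"callback set && refcount == 1" check is taken from the quoted patch.
Folding the refcount decrement into mock_put_page() is a simplification
of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct mock_pagemap;

struct mock_page {
    int refcount;
    struct mock_pagemap *pgmap;
};

struct mock_pagemap {
    void (*free_devpage)(struct mock_page *page, void *data);
    void *data;
};

/*
 * Option A (what the patch does): the setup call leaves the callback
 * unset and the caller is expected to fill it in afterwards.
 */
static void mock_remap_pages(struct mock_pagemap *pgmap)
{
    pgmap->free_devpage = NULL;
    pgmap->data = NULL;
}

/*
 * Option B (what this reply suggests): the callback and its private data
 * are passed as arguments, so the mapping is complete on return.
 */
static void mock_remap_pages_cb(struct mock_pagemap *pgmap,
                                void (*free_devpage)(struct mock_page *, void *),
                                void *data)
{
    pgmap->free_devpage = free_devpage;
    pgmap->data = data;
}

/*
 * Drop one reference, then apply the same test as the patch: invoke the
 * callback only if one is set and the refcount is back to 1.
 */
static void mock_put_page(struct mock_page *page)
{
    assert(page->refcount > 0);
    page->refcount--;
    if (page->pgmap->free_devpage && page->refcount == 1)
        page->pgmap->free_devpage(page, page->pgmap->data);
}

/* Example driver callback: counts how many pages were handed back. */
static void count_free(struct mock_page *page, void *data)
{
    (void)page;
    ++*(int *)data;
}
```

With option A, a page whose refcount drops to 1 before the caller has
assigned free_devpage is silently skipped. With option B there is no such
window, at the cost of two more parameters on the setup call.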