Subject: [RFC PATCH v6 1/4] mm/page_alloc: Introduce an interface to mark reserved memory as ZONE_MOVABLE
From: Mahesh J Salgaonkar
To: linuxppc-dev, Linux Kernel
Cc: Srikar Dronamraju, "Aneesh Kumar K.V", Anshuman Khandual, Andrew Morton, Joonsoo Kim, Michal Hocko, Hari Bathini, Ananth Narayan, kernelfans@gmail.com
Date: Mon, 16 Jul 2018 11:33:16 +0530
In-Reply-To: <153172096333.29252.4376707071382727345.stgit@jupiter.in.ibm.com>
References: <153172096333.29252.4376707071382727345.stgit@jupiter.in.ibm.com>
Message-Id: <153172098506.29252.14851475255748792913.stgit@jupiter.in.ibm.com>
User-Agent: StGit/unknown-version
X-Mailing-List: linux-kernel@vger.kernel.org

From: Mahesh Salgaonkar

Add an interface that allows custom reserved memory to be marked as
ZONE_MOVABLE. This helps subsystems convert their reserved memory regions
into ZONE_MOVABLE so that the memory remains available to user applications.

The approach is based on Joonsoo Kim's commit bad8c6c0
(https://github.com/torvalds/linux/commit/bad8c6c0), which uses ZONE_MOVABLE
to manage the CMA area, and the majority of the code here is taken from that
commit. That commit has since been reverted due to issues reported on i386;
I believe the patch is being reworked and will be re-posted soon.

Like CMA, another user of ZONE_MOVABLE can be fadump on powerpc, which
reserves a significant chunk of memory that is used only after the system
has crashed. Until then the reserved memory sits unused. By marking that
memory as ZONE_MOVABLE, it can at least be utilized by user applications.

This patch proposes an RFC implementation of an interface to mark a
specified reserved area as ZONE_MOVABLE. Comments are welcome.
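For illustration only, a caller such as fadump could hand its already
reserved region to the new interface roughly as below. The wrapper name is a
hypothetical placeholder and is not part of this patch:

	/*
	 * Illustrative sketch (not part of this patch): convert a region
	 * that was reserved via memblock early during boot into
	 * ZONE_MOVABLE. Returns 0 on success or a negative errno.
	 */
	static int __init example_mark_reserved_movable(phys_addr_t base,
							phys_addr_t size)
	{
		int ret;

		/*
		 * The region must already be reserved in memblock and be
		 * aligned to max(MAX_ORDER - 1, pageblock_order) pages,
		 * otherwise the call returns -EINVAL.
		 */
		ret = zone_movable_init_reserved_mem(base, size);
		if (ret)
			pr_warn("failed to mark reserved area movable: %d\n",
				ret);

		return ret;
	}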
Signed-off-by: Mahesh Salgaonkar
---
 include/linux/mmzone.h |    2 +
 mm/page_alloc.c        |  146 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 148 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 32699b2dc52a..2519dd690572 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1288,6 +1288,8 @@ struct mminit_pfnnid_cache {
 #endif
 
 void memory_present(int nid, unsigned long start, unsigned long end);
+extern int __init zone_movable_init_reserved_mem(phys_addr_t base,
+						phys_addr_t size);
 
 /*
  * If it is possible to have holes within a MAX_ORDER_NR_PAGES, then we
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..0817ed8843cb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7687,6 +7687,152 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 	return true;
 }
 
+static __init void mark_zone_movable(struct page *page)
+{
+	unsigned i = pageblock_nr_pages;
+	struct page *p = page;
+	struct zone *zone;
+	unsigned long pfn = page_to_pfn(page);
+	int nid = page_to_nid(page);
+
+	zone = page_zone(page);
+	zone->present_pages -= pageblock_nr_pages;
+
+	do {
+		__ClearPageReserved(p);
+		set_page_count(p, 0);
+
+		/* Steal pages from other zones */
+		set_page_links(p, ZONE_MOVABLE, nid, pfn);
+	} while (++p, ++pfn, --i);
+
+	zone = page_zone(page);
+	zone->present_pages += pageblock_nr_pages;
+
+	set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+
+	if (pageblock_order >= MAX_ORDER) {
+		i = pageblock_nr_pages;
+		p = page;
+		do {
+			set_page_refcounted(p);
+			__free_pages(p, MAX_ORDER - 1);
+			p += MAX_ORDER_NR_PAGES;
+		} while (i -= MAX_ORDER_NR_PAGES);
+	} else {
+		set_page_refcounted(page);
+		__free_pages(page, pageblock_order);
+	}
+
+	adjust_managed_page_count(page, pageblock_nr_pages);
+}
+
+static int __init zone_movable_activate_area(unsigned long start_pfn,
+					unsigned long end_pfn)
+{
+	unsigned long base_pfn = start_pfn, pfn = start_pfn;
+	struct zone *zone;
+	unsigned i = (end_pfn - start_pfn) >> pageblock_order;
+
+	zone = page_zone(pfn_to_page(base_pfn));
+	while (pfn < end_pfn) {
+		if (!pfn_valid(pfn))
+			goto err;
+
+		if (page_zone(pfn_to_page(pfn)) != zone)
+			goto err;
+		pfn++;
+	}
+
+	do {
+		mark_zone_movable(pfn_to_page(base_pfn));
+		base_pfn += pageblock_nr_pages;
+	} while (--i);
+
+	return 0;
+err:
+	pr_err("Zone movable could not be activated\n");
+	return -EINVAL;
+}
+
+/**
+ * zone_movable_init_reserved_mem() - create custom zone movable area from
+ *					reserved memory
+ * @base: Base address of the reserved area
+ * @size: Size of the reserved area (in bytes),
+ *
+ * This function creates custom zone movable area from already reserved memory.
+ */
+int __init zone_movable_init_reserved_mem(phys_addr_t base, phys_addr_t size)
+{
+	struct zone *zone;
+	pg_data_t *pgdat;
+	unsigned long start_pfn = PHYS_PFN(base);
+	unsigned long end_pfn = PHYS_PFN(base + size);
+	phys_addr_t alignment;
+	int ret;
+
+	if (!size || !memblock_is_region_reserved(base, size))
+		return -EINVAL;
+
+	/* ensure minimal alignment required by mm core */
+	alignment = PAGE_SIZE <<
+			max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
+
+	if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
+		return -EINVAL;
+
+	for_each_online_pgdat(pgdat) {
+		zone = &pgdat->node_zones[ZONE_MOVABLE];
+
+		/*
+		 * Continue if zone is already populated.
+		 * Should we at least bump up the zone->spanned_pages
+		 * for existing populated zone ?
+		 */
+		if (populated_zone(zone))
+			continue;
+
+		/*
+		 * Is it possible to allow memory region across nodes to
+		 * be marked as ZONE_MOVABLE ?
+		 */
+		if (pfn_to_nid(start_pfn) != pgdat->node_id)
+			continue;
+
+		/* Not sure if this is a right place to init empty zone. */
+		if (zone_is_empty(zone)) {
+			init_currently_empty_zone(zone, start_pfn,
+						end_pfn - start_pfn);
+			zone->spanned_pages = end_pfn - start_pfn;
+		}
+	}
+
+	ret = zone_movable_activate_area(start_pfn, end_pfn);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * Reserved pages for ZONE_MOVABLE are now activated and
+	 * this would change ZONE_MOVABLE's managed page counter and
+	 * the other zones' present counter. We need to re-calculate
+	 * various zone information that depends on this initialization.
+	 */
+	build_all_zonelists(NULL);
+	for_each_populated_zone(zone) {
+		if (zone_idx(zone) == ZONE_MOVABLE) {
+			zone_pcp_reset(zone);
+			setup_zone_pageset(zone);
+		} else
+			zone_pcp_update(zone);
+
+		set_zone_contiguous(zone);
+	}
+
+	return 0;
+}
+
 #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA)
 static unsigned long pfn_max_align_down(unsigned long pfn)