Subject: Re: [PATCH RCFv2 1/7] mm: introduce and use PageOffline()
To: David Hildenbrand, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Greg Kroah-Hartman, Ingo Molnar,
 Andrew Morton, Philippe Ombredanne, Thomas Gleixner, Dan Williams,
 Michal Hocko, Jan Kara, "Kirill A. Shutemov", Jérôme Glisse,
 Matthew Wilcox, Souptick Joarder, Hugh Dickins, Huang Ying, Miles Chen,
 Vlastimil Babka, Reza Arbab, Mel Gorman, Tetsuo Handa
References: <20180430094236.29056-1-david@redhat.com>
 <20180430094236.29056-2-david@redhat.com>
From: Pavel Tatashin
Message-ID: <4d112f60-3c24-585e-152e-b42d68c899a2@oracle.com>
Date: Mon, 30 Apr 2018 10:35:57 -0400
In-Reply-To: <20180430094236.29056-2-david@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List:
linux-kernel@vger.kernel.org

Hi Dave,

A few comments below:

> +	for (i = 0; i < PAGES_PER_SECTION; i++) {

Performance-wise, it is unfortunate that we have to add this loop for
every hot-plug. But I do like the finer hot-plug granularity that you
achieve, and I do not have a better suggestion for how to avoid this
loop. What I also like is that you call init_single_page() only one
time.

> +		unsigned long pfn = phys_start_pfn + i;
> +		struct page *page;
> +		if (!pfn_valid(pfn))
> +			continue;
> +		page = pfn_to_page(pfn);
> +
> +		/* dummy zone, the actual one will be set when onlining pages */
> +		init_single_page(page, pfn, ZONE_NORMAL, nid);

Is there a reason to use ZONE_NORMAL as a dummy zone? Maybe define some
non-existent zone-id for that? I.e. __MAX_NR_ZONES? That might trigger
some debugging checks of course..

In init_single_page(), if WANT_PAGE_VIRTUAL is defined, the zone is used
to set the virtual address, which is broken if we do not belong to
ZONE_NORMAL:

	if (!is_highmem_idx(zone))
		set_page_address(page, __va(pfn << PAGE_SHIFT));

Otherwise, if you want to keep ZONE_NORMAL here, you could add a new
function:

#ifdef WANT_PAGE_VIRTUAL
static void set_page_virtual(struct page *page, unsigned long pfn,
			     enum zone_type zone)
{
	/* The shift won't overflow because ZONE_NORMAL is below 4G. */
	if (!is_highmem_idx(zone))
		set_page_address(page, __va(pfn << PAGE_SHIFT));
}
#else
static inline void set_page_virtual(struct page *page, unsigned long pfn,
				    enum zone_type zone) {}
#endif

And call it from init_single_page(), and from __meminit
memmap_init_zone() in the "context == MEMMAP_HOTPLUG" if case.

>
> -static void __meminit __init_single_page(struct page *page, unsigned long pfn,
> +extern void __meminit init_single_page(struct page *page, unsigned long pfn,

I've seen it in other places, but what is the point of having an
"extern" function in a .c file?

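For reference on the "extern" question, here is a minimal standalone
sketch in plain C. It is not kernel code and the function names are
hypothetical. It only illustrates that functions have external linkage
by default, so "extern" on a definition changes nothing, whereas
"static" restricts the function to its own file:

```c
/* Hypothetical helper names, for illustration only. */

/* Plain definition: external linkage is the default for functions. */
int add_one(int x)
{
	return x + 1;
}

/* Same linkage as add_one(); the "extern" keyword adds nothing here. */
extern int add_two(int x)
{
	return x + 2;
}

/* "static" is what changes linkage: visible in this file only. */
static int add_three(int x)
{
	return x + 3;
}
```

So dropping "static" from the definition is what makes the function
callable from other files; a declaration in a shared header is then the
usual way to expose it.
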
> #ifdef CONFIG_MEMORY_HOTREMOVE
> -/* Mark all memory sections within the pfn range as online */
> +static bool all_pages_in_section_offline(unsigned long section_nr)
> +{
> +	unsigned long pfn = section_nr_to_pfn(section_nr);
> +	struct page *page;
> +	int i;
> +
> +	for (i = 0; i < PAGES_PER_SECTION; i++, pfn++) {
> +		if (!pfn_valid(pfn))
> +			continue;
> +
> +		page = pfn_to_page(pfn);
> +		if (!PageOffline(page))
> +			return false;
> +	}
> +	return true;
> +}

Perhaps we could use a counter to keep track of the number of
subsections that are currently offlined? If a section covers 128M of
memory and offline/online has 4M granularity, there are up to 32
subsections in a section, and thus we need 5 bits to index them (6 bits
if the counter has to hold every value from 0 through 32). I'm not sure
whether there is space in mem_section for this counter. But that would
eliminate the loop above.

Thank you,
Pavel
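
The counter idea above could look roughly like the sketch below. It is
standalone C, not a patch: struct section_state, the helper names and
SUBSECTIONS_PER_SECTION are all hypothetical, and the 32 comes from the
128M / 4M example in the mail. It assumes the caller never offlines or
onlines the same subsection twice in a row, since the counter does not
record which subsections are offline.

```c
#include <stdbool.h>

/* Hypothetical: 128M section / 4M online/offline granularity. */
#define SUBSECTIONS_PER_SECTION 32u

/* Hypothetical stand-in for per-section state, not struct mem_section. */
struct section_state {
	/* Ranges over 0..32 inclusive, i.e. 33 distinct values. */
	unsigned int nr_offline_subsections;
};

/* Account for one subsection going offline. */
static void subsection_set_offline(struct section_state *s)
{
	if (s->nr_offline_subsections < SUBSECTIONS_PER_SECTION)
		s->nr_offline_subsections++;
}

/* Account for one subsection coming back online. */
static void subsection_set_online(struct section_state *s)
{
	if (s->nr_offline_subsections > 0)
		s->nr_offline_subsections--;
}

/* Constant-time check in place of the per-page scan. */
static bool all_subsections_offline(const struct section_state *s)
{
	return s->nr_offline_subsections == SUBSECTIONS_PER_SECTION;
}
```

The trade-off is a small amount of extra per-section state in exchange
for not walking PAGES_PER_SECTION pages on every check.
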