Subject: Re: [PATCH RCFv2 1/7] mm: introduce and use PageOffline()
From: David Hildenbrand
Organization: Red Hat GmbH
To: Pavel Tatashin, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Greg Kroah-Hartman, Ingo Molnar,
    Andrew Morton, Philippe Ombredanne, Thomas Gleixner, Dan Williams,
    Michal Hocko, Jan Kara, "Kirill A. Shutemov", Jérôme Glisse,
    Matthew Wilcox, Souptick Joarder, Hugh Dickins, Huang Ying,
    Miles Chen, Vlastimil Babka, Reza Arbab, Mel Gorman, Tetsuo Handa
Date: Mon, 30 Apr 2018 17:17:31 +0200
Message-ID: <28068791-bee4-095e-7338-cda4d229c3de@redhat.com>
In-Reply-To: <4d112f60-3c24-585e-152e-b42d68c899a2@oracle.com>
References: <20180430094236.29056-1-david@redhat.com>
    <20180430094236.29056-2-david@redhat.com>
    <4d112f60-3c24-585e-152e-b42d68c899a2@oracle.com>

On 30.04.2018 16:35, Pavel Tatashin wrote:
> Hi Dave,
>
> A few comments below:
>
>> +	for (i = 0; i < PAGES_PER_SECTION; i++) {
>
> Performance wise, this is unfortunate that we have to add this loop
> for every hot-plug. But, I do like the finer hot-plug granularity that
> you achieve, and do not have a better suggestion how to avoid this
> loop. What I also like, is that you call init_single_page() only one
> time.

Thanks! Yes, unfortunately we cannot live with the single loop when
onlining pages for this feature.

>
>> +		unsigned long pfn = phys_start_pfn + i;
>> +		struct page *page;
>> +		if (!pfn_valid(pfn))
>> +			continue;
>> +		page = pfn_to_page(pfn);
>> +
>> +		/* dummy zone, the actual one will be set when onlining pages */
>> +		init_single_page(page, pfn, ZONE_NORMAL, nid);
>
> Is there a reason to use ZONE_NORMAL as a dummy zone? Maybe define
> some non-existent zone-id for that? I.e. __MAX_NR_ZONES?

That might trigger some debugging checks, of course. Then it could
happen that we consume more bits in pageflags than we actually need.
But it could be an opt-in debugging option later on, right?

>
> In init_single_page(), if WANT_PAGE_VIRTUAL is defined, the zone is
> used to set the virtual address, which is broken if we do not belong
> to ZONE_NORMAL.
>

Grr, missed that. Thanks for your very good eyes!

>	if (!is_highmem_idx(zone))
>		set_page_address(page, __va(pfn << PAGE_SHIFT));
>
> Otherwise, if you want to keep ZONE_NORMAL here, you could add a new
> function:
>
> #ifdef WANT_PAGE_VIRTUAL
> static void set_page_virtual(struct page *page, unsigned long pfn,
>			      enum zone_type zone)
> {
>	/* The shift won't overflow because ZONE_NORMAL is below 4G. */
>	if (!is_highmem_idx(zone))
>		set_page_address(page, __va(pfn << PAGE_SHIFT));
> }
> #else
> static inline void set_page_virtual(struct page *page, unsigned long pfn,
>				     enum zone_type zone)
> {}
> #endif
>
> And call it from init_single_page(), and from __meminit
> memmap_init_zone() in the "context == MEMMAP_HOTPLUG" if case.

I was thinking about moving it to set_page_zone() and conditionally
setting it to 0 or set_page_address(page, __va(pfn << PAGE_SHIFT)).

What do you prefer?

>
>>
>> -static void __meminit __init_single_page(struct page *page, unsigned long pfn,
>> +extern void __meminit init_single_page(struct page *page, unsigned long pfn,
>
> I've seen it in other places, but what is the point of having an
> "extern" function in a .c file?

I've seen it all over the place, that's why I am using it :) (as I
basically had the same question). Can somebody answer that?

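To make the set_page_zone() alternative mentioned above a bit more
concrete, here is a minimal userspace sketch. Everything in it is a
mock-up for illustration: struct page, mock_va(), MOCK_PAGE_OFFSET and
the name set_page_zone_and_virtual() are assumptions, not kernel
definitions, and passing pfn explicitly is also an assumption (the real
set_page_zone() does not take a pfn). It only shows the idea of
clearing the virtual address for highmem zones and setting it
otherwise, whenever the zone is (re)assigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mock-ups; none of these are the kernel's real definitions. */
#define PAGE_SHIFT        12
#define MOCK_PAGE_OFFSET  0xC0000000UL  /* stand-in for a direct-map base */

enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct page {
	enum zone_type zone;
	void *virtual_addr;  /* plays the role of page->virtual */
};

static int is_highmem_idx(enum zone_type zone)
{
	return zone == ZONE_HIGHMEM;
}

/* Hypothetical replacement for __va(): direct-map base plus offset. */
static void *mock_va(unsigned long phys)
{
	return (void *)(uintptr_t)(MOCK_PAGE_OFFSET + phys);
}

/*
 * Sketch of "do it when the zone is set": the virtual address is kept
 * consistent with the zone, so a dummy zone chosen at hot-add time is
 * corrected once the real zone is assigned during onlining.
 */
static void set_page_zone_and_virtual(struct page *page, unsigned long pfn,
				      enum zone_type zone)
{
	assert(page != NULL);

	page->zone = zone;
	if (is_highmem_idx(zone))
		page->virtual_addr = NULL;  /* no permanent mapping */
	else
		page->virtual_addr = mock_va(pfn << PAGE_SHIFT);
}
```

With this shape, calling the helper once with the dummy zone and later
again with the final zone leaves the virtual address matching whatever
zone was set last, which is the property the separate
set_page_virtual() helper would otherwise have to provide.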
>
>
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>> -/* Mark all memory sections within the pfn range as online */
>> +static bool all_pages_in_section_offline(unsigned long section_nr)
>> +{
>> +	unsigned long pfn = section_nr_to_pfn(section_nr);
>> +	struct page *page;
>> +	int i;
>> +
>> +	for (i = 0; i < PAGES_PER_SECTION; i++, pfn++) {
>> +		if (!pfn_valid(pfn))
>> +			continue;
>> +
>> +		page = pfn_to_page(pfn);
>> +		if (!PageOffline(page))
>> +			return false;
>> +	}
>> +	return true;
>> +}
>
> Perhaps we could use some counter to keep track of the number of
> subsections that are currently offlined? If a section covers 128M of
> memory, and offline/online is at 4M granularity, there are up to 32
> subsections in a section, and thus we need 5 bits to count them. I'm
> not sure if there is space in mem_section for this counter. But that
> would eliminate the loop above.

Yes, that would also be an optimization. At least I optimized it for
now so ordinary offline/online is not harmed. As we need PageOffline()
also for kdump (and maybe later also for safety checks when
onlining/offlining pages), we would right now store duplicate
information, so I would like to defer that.

Thanks a lot, Pavel!

>
> Thank you,
> Pavel
>

--

Thanks,

David / dhildenb
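
For reference, the per-section counter idea discussed above could be
sketched roughly as follows. This is a userspace illustration only:
struct mock_section and the three helpers are hypothetical names, not
part of struct mem_section or any kernel API, and the 128M/4M sizes are
taken from the example in the thread. Note that a counter able to hold
every value from 0 to 32 needs 6 bits; 5 bits cover 0 to 31 only.

```c
#include <assert.h>
#include <stdbool.h>

#define SECTION_SIZE_MB          128u  /* example value from the thread */
#define SUBSECTION_SIZE_MB       4u    /* example online/offline granularity */
#define SUBSECTIONS_PER_SECTION  (SECTION_SIZE_MB / SUBSECTION_SIZE_MB)  /* 32 */

/* Hypothetical per-section bookkeeping; not the kernel's struct mem_section. */
struct mock_section {
	unsigned int nr_offline : 6;  /* 0..32 inclusive needs 6 bits */
};

/* Account for one subsection going offline. */
static void subsection_set_offline(struct mock_section *s)
{
	assert(s->nr_offline < SUBSECTIONS_PER_SECTION);
	s->nr_offline++;
}

/* Account for one subsection coming back online. */
static void subsection_set_online(struct mock_section *s)
{
	assert(s->nr_offline > 0);
	s->nr_offline--;
}

/* O(1) replacement for walking every page in the section. */
static bool all_subsections_offline(const struct mock_section *s)
{
	return s->nr_offline == SUBSECTIONS_PER_SECTION;
}
```

The trade-off is the one already named in the thread: the check becomes
constant time, but the counter duplicates what the per-page
PageOffline() state already encodes and has to be kept in sync with it.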