Date: Thu, 11 Oct 2018 10:15:10 +0200
From: Michal Hocko
To: Vlastimil Babka
Cc: Arun KS, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, boris.ostrovsky@oracle.com, jgross@suse.com, akpm@linux-foundation.org, dan.j.williams@intel.com, iamjoonsoo.kim@lge.com, gregkh@linuxfoundation.org, osalvador@suse.de, malat@debian.org, kirill.shutemov@linux.intel.com, jrdr.linux@gmail.com, yasu.isimatu@gmail.com, mgorman@techsingularity.net, aaron.lu@intel.com, devel@linuxdriverproject.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, xen-devel@lists.xenproject.org, vatsa@codeaurora.org, vinmenon@codeaurora.org, getarunks@gmail.com
Subject: Re: [PATCH v5 1/2] memory_hotplug: Free pages as higher order
Message-ID: <20181011081510.GR5873@dhcp22.suse.cz>
References: <1538727006-5727-1-git-send-email-arunks@codeaurora.org>
 <72215e75-6c7e-0aef-c06e-e3aba47cf806@suse.cz>
 <97d8db4c-f117-8216-5f48-d5991692c867@suse.cz>
In-Reply-To: <97d8db4c-f117-8216-5f48-d5991692c867@suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 11-10-18 10:07:02, Vlastimil Babka wrote:
> On 10/10/18 6:56 PM, Arun KS wrote:
> > On 2018-10-10 21:00, Vlastimil Babka wrote:
> >> On 10/5/18 10:10 AM, Arun KS wrote:
> >>> When free pages are done with higher order, time spend on
> >>> coalescing pages by buddy allocator can be reduced. With
> >>> section size of 256MB, hot add latency of a single section
> >>> shows improvement from 50-60 ms to less than 1 ms, hence
> >>> improving the hot add latency by 60%. Modify external
> >>> providers of online callback to align with the change.
> >>>
> >>> Signed-off-by: Arun KS
> >>
> >> [...]
> >>
> >>> @@ -655,26 +655,44 @@ void __online_page_free(struct page *page)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(__online_page_free);
> >>>
> >>> -static void generic_online_page(struct page *page)
> >>> +static int generic_online_page(struct page *page, unsigned int order)
> >>>  {
> >>> -	__online_page_set_limits(page);
> >>
> >> This is now not called anymore, although the xen/hv variants still do
> >> it. The function seems empty these days, maybe remove it as a followup
> >> cleanup?
> >>
> >>> -	__online_page_increment_counters(page);
> >>> -	__online_page_free(page);
> >>> +	__free_pages_core(page, order);
> >>> +	totalram_pages += (1UL << order);
> >>> +#ifdef CONFIG_HIGHMEM
> >>> +	if (PageHighMem(page))
> >>> +		totalhigh_pages += (1UL << order);
> >>> +#endif
> >>
> >> __online_page_increment_counters() would have used
> >> adjust_managed_page_count() which would do the changes under
> >> managed_page_count_lock. Are we safe without the lock?
> >> If yes, there should perhaps be a comment explaining why.
> >
> > Looks unsafe without managed_page_count_lock. I think better have a
> > similar implementation of free_boot_core() in memory_hotplug.c like we
> > had in version 1 of patch. And use adjust_managed_page_count() instead
> > of page_zone(page)->managed_pages += nr_pages;
> >
> > https://lore.kernel.org/patchwork/patch/989445/
>
> Looks like deferred_free_range() has the same problem calling
> __free_pages_core() to adjust zone->managed_pages.

deferred initialization has one thread per node AFAIR so we cannot race
on managed_pages updates. Well, unless some of the mentioned can run
that early which I dunno.

> __free_pages_bootmem() is OK because at that point the system is still
> single-threaded?
> Could be solved by moving that out of __free_pages_core().
>
> But do we care about readers potentially seeing a store tear? If yes
> then maybe these counters should be converted to atomics...

I wanted to suggest that already but I have no idea whether the lock
instructions would cause more overhead.
-- 
Michal Hocko
SUSE Labs
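
For readers following the lock-versus-atomics question raised at the end of this message, the two options can be sketched in userspace. This is only an illustration, not kernel code and not anything proposed in the thread: it assumes C11 <stdatomic.h> and POSIX threads as stand-ins for the kernel primitives, and counter_lock, locked_pages, atomic_pages, add_locked(), add_atomic() and run_demo() are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for a managed-page counter; not kernel code. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long locked_pages;	/* updated only under counter_lock */
static atomic_ulong atomic_pages;	/* updated with atomic read-modify-write */

/* Option 1: serialize the read-modify-write with a lock. */
static void add_locked(unsigned int order)
{
	pthread_mutex_lock(&counter_lock);
	locked_pages += 1UL << order;
	pthread_mutex_unlock(&counter_lock);
}

/* Option 2: make the update itself atomic, so no lock is needed. */
static void add_atomic(unsigned int order)
{
	atomic_fetch_add_explicit(&atomic_pages, 1UL << order,
				  memory_order_relaxed);
}

#define NTHREADS 4
#define NITERS   100000

/* Each worker adds one order-0 page to both counters NITERS times. */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		add_locked(0);
		add_atomic(0);
	}
	return NULL;
}

/*
 * Run NTHREADS concurrent updaters. Returns 0 when both counters end
 * at the expected total, -1 on a thread error or a mismatch.
 */
static int run_demo(void)
{
	pthread_t threads[NTHREADS];
	unsigned long expected = (unsigned long)NTHREADS * NITERS;

	locked_pages = 0;
	atomic_store(&atomic_pages, 0);

	for (int i = 0; i < NTHREADS; i++)
		if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
			return -1;
	for (int i = 0; i < NTHREADS; i++)
		if (pthread_join(threads[i], NULL) != 0)
			return -1;

	if (locked_pages != expected)
		return -1;
	if (atomic_load(&atomic_pages) != expected)
		return -1;
	return 0;
}
```

Both variants keep the total consistent under concurrent updates. The sketch does not measure which one is cheaper, which is the open question in the message above.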